
Downloading TV-schedules from a Swedb server
Written by Mattias Holmlund
2006-07-18
This document describes how to write a program to download data from a swedb-server.

Background

The site tv.swedb.se was started in 2004 to provide TV-schedules for Swedish channels. It was originally only used for providing data via the grabber tv_grab_se_swedb (part of the XMLTV Project), but since then a number of other programs have been written that use the same data files.

The data-format is based on the xmltv-format. It is generic and is not specific to Sweden in any way. We are hoping that data will be provided for other countries in the same format in the future, so that the same applications can be used in several different countries.

This document describes how to write a program that downloads data from the tv.swedb.se servers. Since the tv.swedb.se project is run on a voluntary basis with no income generated from the service, it is important to us that all our users behave properly and don't put an unnecessarily high load on our servers. Please follow the rules below if you want to use our data.

Data Layout

Data is stored in a number of separate gzipped xml-files that can be downloaded from an http-server. To download data, start by retrieving the root-url. For tv.swedb.se, the root-url is http://tv.swedb.se/xmltv/channels.xml.gz. The root-url for Sweden will likely remain the same for the foreseeable future, but other data-sources might become available in the future, so you should make the root-url user-configurable.

This file describes which channels are available and where data can be found for each channel. A typical entry looks like this:

 
<channel id="svt1.svt.se">
  <display-name lang="sv">SVT1</display-name>
  <base-url>http://xmltv.tvsajten.com/xmltv/</base-url>
  <icon src="http://xmltv.tvsajten.com/chanlogos/svt1.svt.se.png"/>
</channel>
 

The contents of the channel-entry are the same as specified by the xmltv-dtd, with the addition of the base-url element. The base-url specifies where data for this particular channel can be found. Note that one base-url is specified for each channel. Right now, all channels use the same base-url, but this might change in the future.

If a channel-entry specifies more than one base-url for the channel, the grabber shall use the first base-url.
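
As a rough illustration, the channel list could be read along the following lines. This is only a sketch: the function name parse_channels is hypothetical, and it assumes the decompressed file has a <tv> root element containing the <channel> entries, as in the xmltv-dtd.

```python
import gzip
import xml.etree.ElementTree as ET


def parse_channels(gz_bytes):
    """Map each channel id to (display name, first base-url).

    gz_bytes is the raw, still gzipped content of channels.xml.gz.
    Channels without any base-url are skipped.
    """
    root = ET.fromstring(gzip.decompress(gz_bytes))
    channels = {}
    for channel in root.iter("channel"):
        # find() returns only the first matching child, which is the
        # base-url a grabber is required to use.
        base = channel.find("base-url")
        if base is None or not base.text:
            continue
        name = channel.findtext("display-name", default="")
        channels[channel.get("id")] = (name, base.text.strip())
    return channels
```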

The actual programs for each channel are stored in one file per channel and day in the location specified by the base-url for the channel. The name of each file is <id>_<yyyy-mm-dd>.xml.gz. As an example, the data for SVT1 on July 2nd, 2006, can be found at http://xmltv.tvsajten.com/xmltv/svt1.svt.se_2006-07-02.xml.gz
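
The naming rule can be expressed as a small helper. This is a minimal sketch; the name program_file_url is made up for illustration, and the handling of a base-url without a trailing slash is an assumption (the base-url shown above already ends in one).

```python
from datetime import date


def program_file_url(base_url, channel_id, day):
    """Build the url of the file holding one channel's programs for one day."""
    # Assumption: tolerate a base-url with or without a trailing slash.
    prefix = base_url.rstrip("/") + "/"
    # date.isoformat() yields yyyy-mm-dd, matching the file name pattern.
    return f"{prefix}{channel_id}_{day.isoformat()}.xml.gz"


url = program_file_url(
    "http://xmltv.tvsajten.com/xmltv/", "svt1.svt.se", date(2006, 7, 2)
)
```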

Each of these files follows the xmltv dtd, with the exception that they don't contain any <channel> elements.

A valid xmltv file can be constructed from the above data by removing all base-url fields from channels.xml.gz and outputting the relevant channel-entries concatenated with the contents of all program-files with the first and last lines omitted.
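
The sketch below shows one possible way to do this merge. Note that it swaps in a different technique from the one described above: instead of dropping the first and last lines of each program-file, it parses the files and copies their <programme> elements. It assumes all inputs are already decompressed and use the <tv> root element of the xmltv-dtd; the function name merge_xmltv is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def merge_xmltv(channels_xml, program_xmls, wanted_ids):
    """Combine channel entries and program files into one xmltv document."""
    out = ET.Element("tv")
    for channel in ET.fromstring(channels_xml).iter("channel"):
        if channel.get("id") not in wanted_ids:
            continue
        channel = copy.deepcopy(channel)
        # base-url is not part of the xmltv-dtd, so strip it.
        for base in channel.findall("base-url"):
            channel.remove(base)
        out.append(channel)
    for xml_text in program_xmls:
        # All channel entries come first, then all programmes.
        out.extend(ET.fromstring(xml_text).findall("programme"))
    return ET.tostring(out, encoding="unicode")
```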

HTTP Caching

All http-requests against swedb-servers must implement http-caching properly. The cache must be stored persistently. Each http-response from a swedb-server contains a Last-Modified field and/or an ETag field. These fields shall be used in subsequent requests for the same url as If-Modified-Since and If-None-Match respectively.

For a tutorial on http-caching, see http://fishbowl.pastiche.org/2002/10/21/http_conditional_get_for_rss_hackers.

The reason for these caching requirements is that programme data changes infrequently, so http-caching drastically reduces the bandwidth requirements for our servers.
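
A conditional request with a persistent cache might look roughly like the following. This is a sketch using only the Python standard library, not a reference implementation: the function names, the cache directory layout, and the JSON metadata format are all assumptions made for illustration.

```python
import hashlib
import json
import os
import urllib.error
import urllib.request


def conditional_headers(meta):
    """Turn stored Last-Modified/ETag values into request headers."""
    headers = {}
    if meta.get("last_modified"):
        headers["If-Modified-Since"] = meta["last_modified"]
    if meta.get("etag"):
        headers["If-None-Match"] = meta["etag"]
    return headers


def fetch_cached(url, cache_dir, user_agent):
    """Fetch url, reusing the copy on disk when the server answers 304."""
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    meta_path = os.path.join(cache_dir, key + ".json")
    body_path = os.path.join(cache_dir, key + ".body")
    meta = {}
    # Only send validators if the cached body is actually present.
    if os.path.exists(meta_path) and os.path.exists(body_path):
        with open(meta_path, encoding="utf-8") as f:
            meta = json.load(f)
    headers = {"User-Agent": user_agent, **conditional_headers(meta)}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            body = response.read()
            meta = {
                "last_modified": response.headers.get("Last-Modified"),
                "etag": response.headers.get("ETag"),
            }
    except urllib.error.HTTPError as err:
        if err.code != 304:
            raise
        # 304 Not Modified: the cached copy is still current.
        with open(body_path, "rb") as f:
            return f.read()
    with open(body_path, "wb") as f:
        f.write(body)
    with open(meta_path, "w", encoding="utf-8") as f:
        json.dump(meta, f)
    return body
```

Because the validators and the body are written to disk, the cache survives between runs of the program, which is what the persistence requirement asks for.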

Proper User-Agent

All http-requests must include a User-Agent value that is unique to this particular version of the grabbing application. The User-Agent shall consist of an alphanumeric string that is unique for the program, followed by "/" and an alphanumeric version number. Optionally, more information may be added as an arbitrary string, separated from the version number by a space.

Examples:

  • xmltv/0.5.44
  • AirTimes/0.9 (Symbian OS; MIDP-1.0 MIDP-2.0; CLDC-1.0; en)
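
A small helper can make it harder to send a malformed value. The sketch below is illustrative only: build_user_agent is a hypothetical name, and allowing dots in the version number is an assumption inferred from the examples above.

```python
import re


def build_user_agent(program, version, extra=""):
    """Return 'program/version' with an optional free-form suffix."""
    if not re.fullmatch(r"[A-Za-z0-9]+", program):
        raise ValueError("program name must be alphanumeric")
    # Assumption: dots are accepted, since the examples use them.
    if not re.fullmatch(r"[A-Za-z0-9.]+", version):
        raise ValueError("version must be alphanumeric (dots allowed)")
    agent = f"{program}/{version}"
    return f"{agent} {extra}" if extra else agent
```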

The User-Agent gives us two advantages:

  • It allows us to gather statistics on which grabbers are in use. We can then share these statistics with the grabber authors.
  • It allows us to block non-conforming grabbers.

We will always work with grabber authors before we decide to block a grabber. We would primarily block a grabber because it contains a bug that leads to unnecessarily high bandwidth usage, e.g. because it fails to implement http-caching properly or requests data too often.

Update Interval

A grabber should normally download data at most once a day. If you feel that your particular grabber needs to download data more often than that, please contact us.
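
One simple way to enforce this is to remember when the last download happened and refuse to fetch again too soon. The following is a minimal sketch; the name update_due and the idea of passing the timestamps in explicitly are assumptions, and persisting the last-fetch time between runs is left to the application.

```python
from datetime import datetime, timedelta


def update_due(last_fetch, now, min_interval=timedelta(days=1)):
    """Return True if no fetch has happened yet or a full interval has passed."""
    return last_fetch is None or now - last_fetch >= min_interval


last = datetime(2006, 7, 18, 4, 30)
too_soon = update_due(last, datetime(2006, 7, 18, 20, 0))
next_day = update_due(last, datetime(2006, 7, 19, 4, 30))
```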

Update time

If your application fetches data automatically, it must not have a hard-coded time at which it fetches data. The time must be user-configurable and it should be randomized by default. If a lot of users try to download data from our servers at the exact same time, our servers suffer a lot.
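
A randomized default can be as simple as the sketch below. The function names are hypothetical. It assumes the application picks the random time once (for example at installation) and stores it in its configuration, so that each installation keeps its own stable time of day.

```python
import random
from datetime import time


def random_update_time(rng=random):
    """Pick a time of day uniformly at random, with minute resolution."""
    minute_of_day = rng.randrange(24 * 60)
    return time(hour=minute_of_day // 60, minute=minute_of_day % 60)


def resolve_update_time(configured=None, rng=random):
    """Prefer the user's configured time; otherwise fall back to a random one."""
    return configured if configured is not None else random_update_time(rng)
```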

Parallel requests

An application may run up to two http-requests against the swedb-servers simultaneously, but not more than that.
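
If downloads are done in parallel at all, a fixed-size worker pool keeps the limit easy to verify. The sketch below assumes the application supplies its own fetch function that takes a url; fetch_all and MAX_PARALLEL_REQUESTS are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor

# At most two requests may be in flight against the swedb-servers.
MAX_PARALLEL_REQUESTS = 2


def fetch_all(urls, fetch):
    """Call fetch(url) for every url, never running more than two at once.

    Results are returned in the same order as urls.
    """
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_REQUESTS) as pool:
        return list(pool.map(fetch, urls))
```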

Last updated (2006-07-27)
 