keywords:
RSS, PHP, SimpleXML, BBC weather
The data from the weather channel was provided in a format which was specific
to this source. If data is being combined from several sources, it may be better
to have one common format. One such format (or family of formats) is called RSS,
which can stand for Really Simple Syndication. RSS is an XML format for
news items. There are a number of different standards in use (RSS 0.92, RSS
2.0, Atom), so writing a general reader for RSS is difficult. However, for a
single feed with a known format, this is rather straightforward to do with PHP
and the SimpleXML functions.
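As a rough sketch of what reading a single, known RSS 2.0 feed with SimpleXML can look like: the sample feed below is an abbreviated, made-up string (not real BBC data) so that the script runs without network access, and the function name summariseItems is only illustrative.

```php
<?php
// Abbreviated sample feed (illustrative only, not real BBC data).
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample weather feed</title>
    <item>
      <title>Forecast for Bristol</title>
      <description>Sunny intervals</description>
    </item>
  </channel>
</rss>
XML;

// Return "title: description" for every item in an RSS 2.0 document.
function summariseItems($xmlText) {
    $rss = simplexml_load_string($xmlText);
    if ($rss === false) {
        return array();          // the text was not well-formed XML
    }
    $lines = array();
    foreach ($rss->channel->item as $item) {
        // Cast to string: SimpleXML hands back element objects, not text
        $lines[] = (string) $item->title . ": " . (string) $item->description;
    }
    return $lines;
}

foreach (summariseItems($sample) as $line) {
    print $line . "\n";
}
```

For a live feed, simplexml_load_file() with the feed URL would take the place of simplexml_load_string().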
The BBC provides a wide range of RSS
feeds. Amongst them are feeds of weather
forecasts. They are listed on this page which also explains the background to
RSS.
http://news.bbc.co.uk/1/hi/help/rss/3223484.stm
Starting
at http://www.bbc.co.uk/weather/ locate the page for
Note
that there is the sign for an RSS feed on the page – an
( it should
be http://feeds.bbc.co.uk/weather/feeds/rss/5day/id/1263.xml
)
where
the BBC has coded
To see the raw XML, view source. A
listing of a similar file is shown in Appendix 1. This feed, as you can
see from the rss tag, is in RSS 2.0 format.
The browser displays the file in a readable
format. This is because of the link to an xsl stylesheet in the xml-stylesheet
directive (the second line). You can see
the stylesheet itself if you browse the link to it.
We will be doing more on XSLT later in the term.
Q: How could you change the stylesheet
to one of your own?
A: Not sure if this is possible.
Save the xml file, open it in a text editor
like PFE32, remove the xsl directive and display this modified file in a
browser. Note that without the stylesheet, the
browser displays the file as a foldable hierarchy.
Note which items are XML attributes and which
elements. What is ° ?
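The difference between elements and attributes matters when you come to read the file with SimpleXML, because the two are accessed with different syntax. A small sketch, using the guid element from the listing in Appendix 1 wrapped in an item so that it parses on its own:

```php
<?php
// One element from the listing in Appendix 1, wrapped so it parses alone.
$xml = '<item><guid isPermaLink="false">tag:feeds.bbc.co.uk,2006-12-23:/weather/5day/id/1263</guid></item>';
$item = simplexml_load_string($xml);

// Element content: object-property syntax, cast to string to get the text.
$guidText = (string) $item->guid;

// Attribute value: array-index syntax on the element that carries it.
$isPermaLink = (string) $item->guid['isPermaLink'];

print "guid text:   $guidText\n";
print "isPermaLink: $isPermaLink\n";
```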
In this example there is only one item in the
feed but normally there are many.
Write a PHP script to display the forecast in a
format suitable for display on the Foyer monitors. A simple script can
just show the text; a more complex task is to show the different data items
separately, perhaps graphically.
Hint - use the
example of the extraction of data from The Weather Channel site. You
will have to use the function to read a file via the proxy here if you run this
on a CEMS server.
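If the course's proxy function is not to hand, one possible stand-in is a PHP stream context. This is only a sketch: the function names and the proxy address (proxy.example.com:8080) are placeholders, not the real server settings.

```php
<?php
// Build a stream context that sends HTTP requests through a proxy.
// The proxy address is a placeholder; substitute the one for your server.
function proxyContext($proxy) {
    return stream_context_create(array(
        'http' => array(
            'proxy'           => $proxy,  // e.g. "tcp://proxy.example.com:8080"
            'request_fulluri' => true,    // many proxies expect the full URL in the request line
        ),
    ));
}

// Fetch a URL through the proxy; returns the body, or false on failure.
function fetchViaProxy($url, $proxy) {
    return @file_get_contents($url, false, proxyContext($proxy));
}
```

The text returned by fetchViaProxy() can then be passed to simplexml_load_string() in the same way as a file read directly.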
Q. How can the individual data items in the BBC feed be extracted?
A. You can use Regular Expressions to match the text around the value of
interest: see Patterns, in Wikipedia, and the PHP ereg() function (removed in
PHP 7; preg_match() is its replacement). This technique, particularly
when it is applied to a whole HTML page, is called 'screen scraping'.
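A small sketch of the regular expression approach, using preg_match(). The description text here is hypothetical, since the real feed's wording may differ, and the function name extractCelsius is made up for the example.

```php
<?php
// Hypothetical description text; the real feed's wording may differ.
// "\xC2\xB0" is the UTF-8 encoding of the degree sign.
$description = "Max Temp: 9\xC2\xB0C (48\xC2\xB0F), Min Temp: 3\xC2\xB0C (37\xC2\xB0F)";

// Return the Celsius value that follows a label such as "Max Temp", or null.
function extractCelsius($label, $text) {
    // preg_quote() protects any regex metacharacters in the label;
    // (-?\d+) captures an optionally negative whole number;
    // \x{B0} is the degree sign and /u switches on UTF-8 mode.
    $pattern = '/' . preg_quote($label, '/') . ':\s*(-?\d+)\s*\x{B0}C/u';
    if (preg_match($pattern, $text, $matches)) {
        return (int) $matches[1];
    }
    return null;
}

print "Max: " . extractCelsius("Max Temp", $description) . "\n";
print "Min: " . extractCelsius("Min Temp", $description) . "\n";
```

A pattern like this is tied to the exact wording of the text, so it will need changing if the feed changes its layout.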
Yahoo
also provide a weather feed – for example:
http://xml.weather.yahoo.com/forecastrss?p=UKXX0025
which is described in
their developer page:
http://developer.yahoo.com/weather
In this RSS feed, the weather data itself is included
using another namespace.
Note that these prefixes are just local names; the URIs
are the globally unique identifiers.
For the geo namespace - http://www.w3.org/2003/01/geo/wgs84_pos#
For the yweather namespace - http://xml.weather.yahoo.com/ns/rss/1.0
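Because the URI is the real identifier, SimpleXML can be asked for the elements of a namespace by URI, whatever prefix the feed happens to use. A minimal sketch, using the geo elements from the BBC listing in Appendix 1 (the same approach should apply to the yweather namespace, though that is not tested here):

```php
<?php
// Item modelled on the listing in Appendix 1, reduced to the geo elements.
$xml = <<<XML
<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <channel>
    <item>
      <geo:lat>51.45</geo:lat>
      <geo:long>-2.60</geo:long>
    </item>
  </channel>
</rss>
XML;

$rss  = simplexml_load_string($xml);
$item = $rss->channel->item;

// $item->lat on its own is empty: plain property access does not see
// namespaced elements. Ask for the children in the geo namespace by URI.
$geo = $item->children("http://www.w3.org/2003/01/geo/wgs84_pos#");

$lat  = (float) $geo->lat;
$long = (float) $geo->long;
print "Latitude $lat, longitude $long\n";
```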
Compare the data in these two feeds and that
from the Weather
Channel .
Q. What
differences do you see between the XML representations of weather data in these
three sources? Which is most useful for later processing?
Q: What
differences and similarities do you see in the data itself? Can you explain this?
Generalise your script to allow different
locations to be displayed, and use this to display your own home town or other
place of interest in the
Q. How do I find the code for this town?
A.
Go to the BBC Weather site at http://www.bbc.co.uk/weather/
and search for the town or postcode. You may have to resolve any ambiguity on
the disambiguation page (often airports appear separately to the city).
On the forecast page itself, note the number in the URL - id=nnnn. [There has to be a better way to do this !]
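Once you have the forecast page URL, the id can be picked out in code rather than by eye. A sketch, assuming the page and feed URLs follow the same pattern as the Bristol ones quoted in this worksheet; the function names are illustrative.

```php
<?php
// Pull the numeric location id out of a BBC forecast page URL.
function locationId($pageUrl) {
    $query = parse_url($pageUrl, PHP_URL_QUERY);   // e.g. "id=1263"
    if (!is_string($query)) {
        return null;                               // no query string at all
    }
    parse_str($query, $params);                    // fills $params['id']
    if (isset($params['id']) && is_string($params['id']) && ctype_digit($params['id'])) {
        return (int) $params['id'];
    }
    return null;
}

// Build the feed URL for an id, following the pattern of the Bristol feed.
function feedUrl($id) {
    return "http://feeds.bbc.co.uk/weather/feeds/rss/5day/id/" . (int) $id . ".xml";
}

$id = locationId("http://www.bbc.co.uk/weather/5day.shtml?id=1263");
print feedUrl($id) . "\n";
```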
Q. How do I create the XML file?
A. Use any text editor. Design the file to use only text elements, not
attributes: it's simpler.
Q. How do I use XPath in PHP?
A. Using the xpath function in SimpleXML in PHP is a bit tricky, so here is how
to do the decoding:
Create an XML file called bbcCodes.xml, in which each Place element has a name
and a code child element,
and these PHP statements do the lookup:
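$name = isset($_REQUEST["name"]) ? $_REQUEST["name"] : "";
// Strip single quotes so the name cannot break out of the XPath string literal
$name = str_replace("'", "", $name);
$places = simplexml_load_file("bbcCodes.xml");
$codes = $places->xpath("//Place[name='$name']/code");
// xpath() returns an array; print the first match only if there is one
if ($codes) print $codes[0];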
Here it is running
-
This PHP script uses the xpath function, which returns
an array of SimpleXMLElements (since a match will
in general produce a sequence of elements), so you need to pick out the first one
(assuming there is only one match).
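To try the lookup without a separate file, here is a self-contained sketch. The XML is a guess at what bbcCodes.xml might contain: only the Place, name and code element names come from the XPath expression, while the Places root element and the Exampletown entry are invented for the example.

```php
<?php
// Hypothetical contents for bbcCodes.xml. Only the Place/name/code shape is
// implied by the XPath expression; the root name and "Exampletown" are made up.
$codesXml = <<<XML
<Places>
  <Place><name>Bristol</name><code>1263</code></Place>
  <Place><name>Exampletown</name><code>9999</code></Place>
</Places>
XML;

// Return the code for a place name, or null when there is no match.
function lookupCode($places, $name) {
    // XPath 1.0 has no escape for a quote inside a string literal, so refuse
    // names containing one rather than build a broken expression.
    if (strpos($name, "'") !== false) {
        return null;
    }
    $matches = $places->xpath("//Place[name='$name']/code");
    // xpath() gives an array of SimpleXMLElement objects, possibly empty
    return $matches ? (string) $matches[0] : null;
}

$places = simplexml_load_string($codesXml);
print lookupCode($places, "Bristol") . "\n";
```

Returning null for an unknown name lets the calling script show a sensible message instead of an undefined-offset notice.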
Display the location of the place. Note that the
location data is in a different namespace, with the prefix geo.
To access the location data, you need to understand how SimpleXML
treats namespaces. See Elliotte's article and Mark's
script, which shows how namespaces can be registered and used in XPath expressions.
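As a hedged illustration of registering a namespace for XPath (this is not Mark's script, just a minimal sketch of the same idea), using an item cut down from the listing in Appendix 1 with a made-up title:

```php
<?php
$xml = <<<XML
<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <channel>
    <item>
      <title>Sample item</title>
      <geo:lat>51.45</geo:lat>
      <geo:long>-2.60</geo:long>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($xml);

// Bind a prefix of our own choosing to the namespace URI. The prefix used
// here ("g") need not match the one in the feed; only the URI has to agree.
$rss->registerXPathNamespace("g", "http://www.w3.org/2003/01/geo/wgs84_pos#");

$lats  = $rss->xpath("//item/g:lat");
$longs = $rss->xpath("//item/g:long");

if ($lats && $longs) {
    print "Location: " . $lats[0] . ", " . $longs[0] . "\n";
}
```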
In a later exercise we will use this data to create an overlay for Google Earth and Google Maps.
RSS Help from the BBC: http://www.bbc.co.uk/feedfactory/help_what.shtml
Elliotte Rusty Harold on SimpleXML: http://www-128.ibm.com/developerworks/library/x-simplexml.html
This is the
RSS feed for the
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://feeds.bbc.co.uk/weather/feeds/rss/5day/weather.xsl"?>
<rss version="2.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <channel>
    <title>BBC - Weather Centre - Forecast for Bristol (BS1),
    <link>http://feeds.bbc.co.uk/weather/feeds/rss/5day/id/1263.xml</link>
    <description>by the BBC Weather Centre in association with the Met Office</description>
    <language>en</language>
    <copyright>Copyright: (C) British Broadcasting Corporation</copyright>
    <pubDate>Sat, 23 Dec 2006 21:32:01 +0000</pubDate>
    <lastBuildDate>Sat, 23 Dec 2006 21:32:01 +0000</lastBuildDate>
    <managingEditor>weather@bbc.co.uk</managingEditor>
    <webMaster>weather@bbc.co.uk</webMaster>
    <item>
      <title>The forecast for Bristol (BS1),
      <link>http://www.bbc.co.uk/weather/5day.shtml?id=1263</link>
      <description>The forecast for Bristol (BS1),
      <guid isPermaLink="false">tag:feeds.bbc.co.uk,2006-12-23:/weather/5day/id/1263</guid>
      <pubDate>Sat, 23 Dec 2006 21:32:01 +0000</pubDate>
      <geo:lat>51.45</geo:lat>
      <geo:long>-2.60</geo:long>
    </item>
  </channel>
</rss>