CSE 1710.03A Programming for Digital Media

Lab 10

Due date: Nov 23, 2009 at 20:00

Extracting Headlines and Generating a New WWW Page

Your task is to create a WWW page that lists the current news headlines, with some text substitutions to make things a bit more interesting. The program should be designed so that it can be called automatically every hour to refresh the contents of the generated HTML file.

As a source of headlines you will be using http://www.cse.yorku.ca/course/1710/labs/GoogleNews.html, which is a local copy of the WWW page http://news.google.com. The text substitutions should swap two random words in each headline.
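The word swap described above can be sketched as follows. This is only one possible approach, written for Python 3. The function name swap_two_words is made up for illustration, and splitting on whitespace is an assumption about what counts as a "word".

```python
import random

def swap_two_words(headline):
    """Return the headline with two randomly chosen words exchanged.

    Hypothetical helper: words are whatever split() yields, so runs of
    whitespace in the original headline collapse to single spaces.
    """
    words = headline.split()
    if len(words) < 2:
        return headline  # nothing to swap in a one-word headline
    # Pick two distinct positions, then exchange the words found there.
    i, j = random.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)

print(swap_two_words("Parliament votes on new budget today"))
```

Because the two positions are always distinct, a headline of at least two different words always comes back changed. The result varies from call to call.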

To do this, write a function createNews() that takes no arguments. Each time it is called, it should automatically get the news headlines and (over-)write the output file, which has to be located at: "Z:\news.html".

The format of the output should be a normal HTML page, with an appropriate header. Each headline should be written as a level-1 heading, according to the HTML specification, i.e. using the '<h1>' tag. Do not forget to finish the generated HTML page with appropriate HTML sequences.
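A minimal sketch of the output format, assuming Python 3 and a list of headline strings that has already been extracted. The names build_page and write_page and the page title are hypothetical, and the headlines are assumed to need no further HTML escaping.

```python
def build_page(headlines, title="News Headlines"):
    """Return a complete HTML page with one <h1> heading per headline."""
    lines = [
        "<html>",
        "<head>",
        "<title>" + title + "</title>",
        "</head>",
        "<body>",
    ]
    for headline in headlines:
        lines.append("<h1>" + headline + "</h1>")
    # Close the tags opened in the header so the page is well-formed.
    lines.append("</body>")
    lines.append("</html>")
    return "\n".join(lines) + "\n"

def write_page(path, headlines):
    """Write the page to path; mode "w" replaces any existing file."""
    with open(path, "w") as out:
        out.write(build_page(headlines))

print(build_page(["First headline", "Second headline"]))
```

Inside createNews(), a call such as write_page(r"Z:\news.html", headlines) would then overwrite the output file on every run.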

You must name your program lab10.py.

As a starting point, you can use the following code segment, which downloads the contents of a specified WWW page into a string.

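import urllib

# Open a connection to the page (Python 2 urllib).
contents = urllib.urlopen('http://www.cbc.ca')
# Read the whole page into a single string.
text = contents.read()
# Close the connection once the data has been read.
contents.close()
print text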
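
The code segment above is written for Python 2. For reference only, a rough Python 3 counterpart might look like the sketch below. The function name fetch_page is hypothetical, and decoding the page as UTF-8 is an assumption about its encoding.

```python
import urllib.request

def fetch_page(url):
    """Download the page at url and return its contents as a string."""
    # The with-statement closes the connection automatically.
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes in Python 3
    # Assumption: the page is UTF-8; undecodable bytes are replaced.
    return raw.decode("utf-8", errors="replace")
```

Calling text = fetch_page('http://www.cbc.ca') would then play the same role as the three middle lines of the original segment.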

Here are a couple of hints:

What to Turn in

As mentioned previously, you have to add comments with your own identification (name, student ID, date). Remember to save the file after you modify it!

How to submit the lab

For details on how to submit, please refer to lab2. This time, however, submit to lab10, i.e. issue the command:

submit 1710 lab10 lab10.py

Note: You must complete all of the above steps correctly to receive full credit for this lab.