Automatically watching for updates on web pages?

Once in a while, I come upon a web page that:

  1. Doesn’t offer an RSS/Atom feed.
  2. Doesn’t change very often.
  3. I would like to be warned when it’s updated.

I would like to be notified automatically when such pages change. websec does this:

Description: Web Secretary – Web page monitoring software
A visual Web page monitoring software. However, it goes
beyond the normal functionalities offered by such software. Not only
does it detect changes based on content analysis (instead of date/time
stamp or simple textual comparison), it will email the changed page to
you with the new content highlighted.

But:

  1. It sends emails. Generating an RSS feed would be much better.
  2. It sends HTML emails. OK, you can use AsciiMarker to view the changes with a text MUA, but still…
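
To make point 1 concrete, here is a minimal sketch of what "generate an RSS feed instead of sending email" could look like. It is not how websec works. The function names, the feed title and the example.org link are placeholders invented for illustration, and fetching the pages is left out entirely.

```python
import hashlib
import xml.etree.ElementTree as ET
from email.utils import formatdate


def fingerprint(page_text):
    """Hash the page with whitespace collapsed, so reflowed markup alone is not a change."""
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def rss_for_changes(changes, feed_title="Watched pages",
                    feed_link="http://example.org/watch"):
    """Build an RSS 2.0 document; `changes` is a list of (url, unix_timestamp) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the three required channel elements
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Pages whose content changed"
    for url, when in changes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "Changed: " + url
        ET.SubElement(item, "link").text = url
        # RSS expects RFC 822 style dates, which formatdate() produces
        ET.SubElement(item, "pubDate").text = formatdate(when, usegmt=True)
    return ET.tostring(rss, encoding="unicode")
```

A cron job could store the last fingerprint of each page, compare it with a fresh one, and add a (url, timestamp) pair to `changes` whenever the two differ. A plain hash only says that something changed, not what, so it is cruder than the content analysis websec advertises.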

Does anybody know of another piece of software I could use?

12 thoughts on “Automatically watching for updates on web pages?”

  1. You could try emailing the website’s author(s) and asking them to include an RSS feed… they may not know about it.

  2. A few months ago, when I was looking for work, I wrote a Python script that scraped job listings off a few company sites where I was interested in working. It would then summarize them in an e-mail, but it would be no work at all to make it also create an RSS feed. Here’s the quick and dirty code: http://rafb.net/paste/results/ljHElf22.html

    Note that it has to submit a form in one case in order to get the job listings. It would just spit out the results and, since it was running via cron, the output got e-mailed to me.

    The real brains here is the “mechanoid” library for Python.

  3. My bookmarks page (linked above) is some nasty Python that diffs the current version of each page with a saved copy, and orders all pages by date. Source is available on request. I find it much better than an RSS reader, since I can use it anywhere, and I can ignore sites easily without the unread count going up and making me feel guilty.

  4. Ah, kids these days… everybody did this before RSS became ubiquitous. One system that did it was newsclipper.

    …disclaimer: I wrote one of newsclipper’s modules, but to be honest I threw newsclipper away and wrote our aggregator for myself. It’s easy enough to do, and necessary when the page you’re scraping is weird. Today, I’d try Feed43 first.

  5. A little utility I stumbled across a while ago is Specto (specto.sf.net). It does the basics of watching webpages and will likely do a whole lot more pretty soon (especially if you help!).

    It has a nice and simple PyGTK GUI, and docks in the notification area.

  6. Windows has a German program, WebSite-Watcher, which monitors (and highlights) changes in Web sites as well as any program I’ve ever tried. I don’t think the author is interested in a Linux port, but this is really the level of functionality I would like to have on Linux, to keep me from slipping back to Windows.

    Wataru Tenga, Tokyo
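
The "diff the current version with a saved copy" approach from comment 3 can be sketched with Python's standard difflib. This is only an illustration under assumptions, not that commenter's actual code: the function name and the one-file-per-page layout are made up, and downloading the page text is left to the caller.

```python
import difflib
from pathlib import Path


def diff_against_saved(current_text, saved_path):
    """Return unified-diff lines between the saved copy and current_text.

    If anything changed, the saved copy is overwritten with current_text,
    so the next call only reports newer changes.
    """
    saved_path = Path(saved_path)
    # A page seen for the first time is compared against an empty copy
    old = saved_path.read_text(encoding="utf-8") if saved_path.exists() else ""
    diff = list(difflib.unified_diff(
        old.splitlines(), current_text.splitlines(),
        fromfile="saved", tofile="current", lineterm=""))
    if diff:
        saved_path.write_text(current_text, encoding="utf-8")
    return diff
```

An empty list means the page is unchanged. Otherwise, lines starting with `+` or `-` show what was added or removed, and they read fine in a text MUA or as the body of a feed item. Run from cron, anything the script prints gets mailed, as comment 2 notes.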
