I've recently had some time to look into generating a "blog", or more accurately a feed,
that publishes this site's changes.

With a traditional blog, a content management system stores items in a database,
with logic both to present those items chronologically to web browsers on demand,
and to generate XML files (feeds) that readers poll (subscribe to),
so that they see new or updated items automatically.

Now I really don't want the runtime or administrative overhead of a blogging CMS,
and I want full control of my data, so hosting my blog somewhere else is out of the question.
Anyway, I'm happy with my existing content management system, a.k.a. a Linux filesystem and vim.

So how do we map static HTML content to a feed?
Well, each HTML file corresponds to a "blog" item, and the filesystem
maintains metadata such as the file name and modification time.
There are also de facto standards for HTML which map very nicely
to other feed item attributes, as illustrated by the following elements:
<head>
  <title>Item title</title>
  <meta name="description" content="One line item description">
  <meta name="keywords" content="Item tags">
</head>
These elements also have the advantage of being parsed automatically by search engines.
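
To make the mapping concrete, here is a minimal sketch of how such metadata could be pulled out of a page and turned into a feed item. It is not the bashfeed source: the function names (html_title, html_meta, rss_item) are hypothetical, it assumes each element sits on one line with name= before content=, it copies the text through without any XML escaping, and it assumes GNU date for the -R and -r options.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from bashfeed.
# Assumes each element sits on a single line, with name= before content=.

# Print the text of the first <title> element in file $1.
html_title() {
    sed -n 's:.*<title>\(.*\)</title>.*:\1:p' "$1" | head -n 1
}

# Print the content attribute of the <meta> element named $2 in file $1.
html_meta() {
    sed -n 's:.*<meta name="'"$2"'" content="\([^"]*\)".*:\1:p' "$1" | head -n 1
}

# Print one RSS 2.0 <item> for file $1, linked under base URL $2.
# GNU date is assumed: -r reads the file's mtime, -R prints it RFC 2822 style.
rss_item() {
    printf '<item>\n'
    printf '  <title>%s</title>\n' "$(html_title "$1")"
    printf '  <link>%s/%s</link>\n' "$2" "$1"
    printf '  <description>%s</description>\n' "$(html_meta "$1" description)"
    printf '  <pubDate>%s</pubDate>\n' "$(date -R -r "$1")"
    printf '</item>\n'
}
```

Something like `rss_item index.html http://www.pixelbeat.org` would then print one item, taking the publication date from the file's modification time.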

So I wrote a shell script called bashfeed to extract this info automatically from
my static website and generate the RSS 2.0 feed.
I run bashfeed when I upload the website with the following script:
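site=www.pixelbeat.org
cd "$site" || exit 1    # stop here rather than upload from the wrong directory
./scripts/gen_timeline > timeline.html
./scripts/bashfeed > feed/rss2.xml
rsync -vicaz --exclude="stats/" --delete -e ssh . user@$site:'~/public_html'
# quote the URL so the shell does not treat the ? as a glob character
wget --quiet "http://www.technorati.com/ping.html?url=http://$site/" -O- > /dev/null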
Note also how I ping Technorati automatically so that it picks up my updated feed.
My feed represents tags (HTML "keywords") as RSS 2.0 <category> elements,
which Technorati parses fine.
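
As a rough illustration of the tag mapping, the keywords string can be split into one <category> element per tag. This is only a sketch: keywords_to_categories is a hypothetical name, and it assumes the keywords are comma separated, which is the usual HTML convention but may not match how bashfeed splits them.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, assuming comma-separated keywords.

# Emit one <category> element per keyword in the string $1.
keywords_to_categories() {
    printf '%s\n' "$1" | tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
        -e 's:.*:  <category>&</category>:'
}
```

Surrounding whitespace is trimmed and empty entries are dropped, so a trailing comma in the keywords does not produce an empty category.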
© Jun 19 2006