Well, very cool. Yahoo has an API that’s meant to let you do database-like read operations on anything on the web – you just pass your SELECT… statement and the URL and a few other arguments, and it chucks back a JSON document with your information. I haven’t managed to make it do anything interesting yet, but then, my requirements may be strange. And it doesn’t do SELECT COUNT… or GROUP BY… statements, so there are some fairly strict limits on its usefulness. It’s true, however, that had it existed a while ago it would have made part of the Viktorfeed much easier to build. But, y’know, it’s mine.
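For the curious, here is roughly what that looks like from the client side, as a minimal Python sketch rather than anything authoritative. The endpoint address, the `q` and `format` parameter names, the `html` table in the example statement, and the sample rows are all my assumptions, not details taken from Yahoo's documentation. The sketch only builds the request URL and never calls the service. Since there is no COUNT or GROUP BY, the counting has to happen on your side once the JSON comes back.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumption: a public query endpoint of this shape; check the real address yourself.
ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_query_url(statement, endpoint=ENDPOINT):
    """Return a GET URL carrying the SELECT statement and asking for JSON back."""
    # Parameter names "q" and "format" are assumed for illustration.
    return endpoint + "?" + urlencode({"q": statement, "format": "json"})


def count_by(rows, key):
    """Client-side stand-in for SELECT COUNT ... GROUP BY key."""
    return Counter(row.get(key) for row in rows)


# Hypothetical statement: the table name and target URL are made up.
url = build_query_url('SELECT * FROM html WHERE url="http://example.com/"')

# Hypothetical rows, standing in for the parsed JSON result.
rows = [{"group": "alt.test"}, {"group": "comp.lang"}, {"group": "alt.test"}]
totals = count_by(rows, "group")
```

With the rows above, `totals["alt.test"]` is 2 and `totals["comp.lang"]` is 1, which is the grouping the service won't do for you.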
I’m tempted, once I manage to make it do something useful, to build a sort of Web 2.0 turducken – after all, the query could be applied to a Google search URL, or better, to the Google archive of USENET.
The new version of IBM Many Eyes lets you provide various kinds of URLs as data sources in a visualisation, and I could embed that in the blog. If only WordPress.com would let you do that, the request would be initiated by something running on a WordPress website, using an IBM backend, to slurp data from a Yahoo! URL that points at an SQL emulator somewhere in there, which gets data from an HTML parser in there, which operates on query results from Google, results which originate from some bored and kinky academic shooting the breeze over an underutilised departmental T-1 in 1992. Perhaps I could work in updates on Twitter, too.
But doesn’t this remind you of something – specifically the joke about the convoluted program which involves a document being printed out, placed on a brown table, photographed, and scanned? Think of all those layers of caching servers, app servers, Web frameworks, standard libraries, bureaucracy, operating systems, virtualisation, before you get to an actual computer. No wonder Richard Stallman doesn’t like it.
This tension has always defined the culture of IT; the Big Database, the mainframe, the semiconductor fab on one side, the Lone Hacker and the Garage Startup on the other. Like all good myths, it’s highly flexible in practice; a lot of people started off as the second and made the ancient march to the right as they got old and rich and conservative, and the very origins of the second are in huge state-run research labs.
It’s also highly ambiguous – people who at least think they are on the side of the second are often the archetypal Internet libertarians and the warbloggers yelling for torture, and doesn’t the ordered, white concrete Arthur C. Clarke world of the first sound good now?
In an out-of-the-way corner of Oregon, Amazon is joining Google and Microsoft in building a really enormous data centre, to take advantage of cheap hydroelectricity and water cooling. The power comes from the New Deal. At the end of the 1930s, the US Federal Government built a string of big dams there; their first customer for the power was the aluminium industry as it geared up first to supply the RAF and then to create the USAAF. As a result, Boeing would build the 707, B-47, B-52, 727, 747, 737, 757, 767, and 777 in Seattle.
Today, it’s still the Bonneville Power Administration which Amazon will be paying for the electricity and cool water its IT factory needs. You can’t get around the infrastructure. Decisions we take now will last as long; will there one day be an IT equivalent of the Lochaber smelter, somewhere with fibre in the ground and wind in the sky?