September 11, 2008 at 8:43 pm | Posted in programming | 1 Comment

Last week i found out why it was a piece of cake for JSON to become the X in Ajax: XMLHttpRequest with XML is just a pain. Here’s why.

I wanted to pull data from a feed into a page. The feed is served as application/rss+xml. After a successful XMLHttpRequest, i wanted to retrieve the items from the request object’s responseXML property.

Unfortunately, this didn’t work on IE 6/7. It turns out IE only parses the response into responseXML when it is served as text/xml. Ok, so we just grab responseText and parse it ourselves. Problem solved.

Hm. It doesn’t work in Konqueror either. Fortunately the trouble with IE pointed me in the right direction: while Konqueror does parse responses served as application/xml, my application/rss+xml feed still ended up unparsed.

Of course there’s no cross-browser way of parsing XML. So what i ended up with was:

var items, doc;
try {
  items = req.responseXML.documentElement.getElementsByTagName('item');
} catch (e) {
  try {
    // for IE we have to do the parsing ourselves, because the feed isn't delivered as text/xml ...
    doc = new ActiveXObject("Microsoft.XMLDOM");
    doc.async = false;
    // load the raw response text into the (so far empty) document
    doc.loadXML(req.responseText);
    items = doc.documentElement.getElementsByTagName('item');
  } catch (e) {
    try {
      // ... same for Konqueror
      var p = new DOMParser();
      doc = p.parseFromString(req.responseText, "text/xml");
      items = doc.documentElement.getElementsByTagName('item');
    } catch (e) {
      // well, at least we'll get the title later
      items = [];
    }
  }
}

And don’t get me started on handling namespaced XML …
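For comparison, here is a minimal sketch of the same "parse the response text ourselves" step outside the browser, in python, including one namespaced element. The feed content and the function name feed_items are made up for illustration; only the standard library's xml.etree.ElementTree is used, and the Dublin Core namespace is just one example of what a feed might carry.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map for namespaced lookups (Dublin Core as an example).
NAMESPACES = {"dc": "http://purl.org/dc/elements/1.1/"}


def feed_items(feed_text):
    """Return title and dc:creator of every <item> in an RSS document.

    The text is parsed directly, so the MIME type the feed was served
    with plays no role here.
    """
    root = ET.fromstring(feed_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            # namespaced child: the prefix is resolved through NAMESPACES
            "creator": item.findtext("dc:creator", default="",
                                     namespaces=NAMESPACES),
        })
    return items


SAMPLE_FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example feed</title>
    <item><title>First</title><dc:creator>alice</dc:creator></item>
    <item><title>Second</title></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for entry in feed_items(SAMPLE_FEED):
        print(entry)
```

The namespace has to be addressed by its URI; the dc prefix in the lookup is only a local alias and need not match the prefix used in the feed.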

Article 2.0

September 2, 2008 at 5:00 pm | Posted in library, publishing | 2 Comments

Another post in the category “failed miserably at web 2.0”.

So Elsevier just started a contest to get some ideas about how a scientific article might look in a browser today. Since the motto “WHAT IF YOU WERE THE PUBLISHER?” resonated with me, I thought I might just enter the competition – if only to have a peek at what their article markup looks like.

Unfortunately I couldn’t make the registration. The registration form is a PDF that has to be filled in and emailed back. Looks like they really need help on web 2.0; at least they clearly missed the usability and browser-is-the-platform parts. None of the PDF readers on my machine (ubuntu 7.10), including acroread, would let me fill in a PDF form, although I admit to not trying too hard.

In short, making sharing of ideas that hard is most definitely not web 2.0.

Migrating data from b2evolution to wordpress μ

August 15, 2008 at 4:29 pm | Posted in programming, publishing, wordpress | 8 Comments

At work, we are in the process of migrating blogs from a b2evolution installation to wordpress μ. Shopping around for tools to do this didn’t reveal anything usable. Either the versions wouldn’t match or the functionality was too limited (generic import from RSS for example wouldn’t help with files, users and comments).

So i came up with my own script. I didn’t advertise it at wordpress.org, because obviously, just as most other migration tools, it’s tailored to our situation. In particular, it’s only tested with b2evolution 2.3 and wordpress 2.6; but since it relies mostly on the database schema, it should also be compatible with wordpress >= 2.3.


  • migrates users, categories, posts, comments and files.
  • migrates one blog at a time.
  • works for wordpress and wordpress μ.
  • undo functionality, thus testing imports is easy.


The script is written in python and is meant to be run on the machine hosting the wordpress installation (so that it can copy files into wp-content). It also assumes that the wordpress database is located on this machine.

The b2evolution installation does not have to be on the same machine, though, because the b2evolution data is scraped from an SQL dump of the database.
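To give an idea of what scraping data from an SQL dump can look like, here is a rough sketch. It is not the migration script itself: the table name evo_example, the function names and the sample dump are invented for illustration, and it assumes mysqldump-style output with one INSERT statement per line.

```python
import re

# Matches e.g.: INSERT INTO `evo_example` VALUES (...),(...);
INSERT_RE = re.compile(r"INSERT INTO `?(\w+)`? VALUES\s*(.*);\s*$", re.S)


def parse_rows(values):
    """Split "(1,'a'),(2,'b')" into [['1', 'a'], ['2', 'b']].

    Simplification: every field comes back as a string (NULL stays
    'NULL'), and a backslash escape just yields the escaped character.
    """
    rows, row, field = [], None, []
    in_string = False
    i = 0
    while i < len(values):
        ch = values[i]
        if in_string:
            if ch == "\\" and i + 1 < len(values):
                field.append(values[i + 1])  # keep the escaped character
                i += 2
                continue
            if ch == "'":
                in_string = False
            else:
                field.append(ch)
        elif ch == "'":
            in_string = True
        elif ch == "(" and row is None:
            row = []                         # a new row starts
        elif ch == "," and row is not None:
            row.append("".join(field))       # field separator inside a row
            field = []
        elif ch == ")" and row is not None:
            row.append("".join(field))       # last field, row is complete
            field = []
            rows.append(row)
            row = None
        elif row is not None and not ch.isspace():
            field.append(ch)                 # unquoted value such as a number
        i += 1
    return rows


def rows_for_table(dump_text, table):
    """Collect the rows of all INSERT statements for one table."""
    rows = []
    for line in dump_text.splitlines():
        match = INSERT_RE.match(line.strip())
        if match and match.group(1) == table:
            rows.extend(parse_rows(match.group(2)))
    return rows


SAMPLE_DUMP = (
    "INSERT INTO `evo_example` VALUES (1,'Hello, world','It\\'s'),(2,'x','y');\n"
    "INSERT INTO `other` VALUES (9,'z');\n"
)

if __name__ == "__main__":
    print(rows_for_table(SAMPLE_DUMP, "evo_example"))
```

Working from a dump rather than a live connection means the b2evolution database never has to be reachable from the wordpress machine.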

So in case somebody finds this useful – but not quite perfect – let me know, and i’ll see what i can do to help.

Update: More info about how to use the script and how to do the migration in general is available at https://dev.livingreviews.org/projects/wpmu/wiki/b2evolutionMigration.

so we got wordpress μ …

July 31, 2008 at 6:00 pm | Posted in programming, publishing, wordpress | Leave a comment

… and what i wanted to do most was to create posts automatically, using xmlrpc. alas! i couldn’t get it to work for quite some time.

googling turned up some discussion threads, which told me that xmlrpc support isn’t enabled by default anymore, possibly for good reason. this is already useful information, because the error you may get from your xmlrpc client is about failed authentication, a strange way to signal: not enabled.

anyway, one comment got me almost there. “Site Admin->Blogs->Edit”. but what then? no mention of xmlrpc, API, blogger, you name it. the setting to change turned out to be “Advanced edit”. set it from 0 to 1, and xmlrpc should work.
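for reference, a minimal sketch of what creating a post over xmlrpc can look like from python, using the metaWeblog.newPost call that wordpress supports. the function name, the credentials and the blog id are placeholders; the sketch only builds the request body, so it can be inspected without a server.

```python
import xmlrpc.client


def new_post_request(blog_id, username, password, title, body, publish=True):
    """Build the XML-RPC request body for a metaWeblog.newPost call."""
    content = {"title": title, "description": body}
    params = (blog_id, username, password, content, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


if __name__ == "__main__":
    # placeholder values, not a real account
    print(new_post_request(1, "editor", "secret", "Hello", "First post"))
```

to actually send it, something like xmlrpc.client.ServerProxy(url).metaWeblog.newPost(blog_id, username, password, content, publish) should do, where url is the address of the blog's xmlrpc.php (an assumption about the setup, adjust as needed).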

LaTeX, eh?

May 6, 2008 at 8:15 am | Posted in Uncategorized | Leave a comment


Squid as accelerator for TurboGears

April 16, 2008 at 5:49 am | Posted in http, programming | 3 Comments

I’m trying to use squid as transparent caching proxy between apache httpd and a – largely read-only – TurboGears web application.

Apache already acts as proxy, sending requests to the web app which listens on a different port. But using mod_cache was not an option, because for apache 2.0 it’s still experimental and additionally it doesn’t seem to work well with mod_rewrite.

So the idea was to simply plug in squid.

The main problem so far was to narrow down the monumental squid configuration to the few lines i actually need. This is what i came up with so far:

http_port accel defaultsite=site.served.by.web.app
cache_peer parent 8080 0 no-query originserver
refresh_pattern ^http: 1440 20% 1440 override-expire override-lastmod ignore-no-cache ignore-reload ignore-private
acl all src
acl our_sites dstdomain
http_access allow all
http_access allow our_sites

The http_port directive tells squid to listen on port 8888 in accelerator mode, proxying the default site.

The cache_peer directive specifies the web app as the only cache peer. So whenever squid cannot serve a request from its cache, this is where it will turn to for help. Of the last three tokens, 0 no-query originserver, the 0 sets the ICP port to 0 and no-query disables ICP queries, while originserver says that the peer is the origin server and not another squid proxy.

The refresh_pattern directive specifies the rules according to which an item in the cache is to be treated as fresh or stale. In this case, all items with an http URL will be regarded as fresh for one day (1440 minutes). The options override-expire override-lastmod ignore-no-cache ignore-reload ignore-private basically override whatever either client or web app say about caching – so this setup is NOT an http compliant cache. But that’s alright, since we only cache stuff that we are the producers of, so we should know.
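To make the three numbers in the refresh_pattern line a bit more tangible, here is a simplified sketch of the freshness decision, roughly following the algorithm described in the squid documentation. The function name is made up, and everything the client or the web app might say in headers is left out on purpose, since the options above tell squid to ignore it anyway.

```python
def is_fresh(age_min, conf_min, conf_percent, conf_max, lm_age_min=None):
    """Simplified refresh_pattern check (all times in minutes).

    age_min     -- how long the object has been in the cache
    conf_min    -- first number of refresh_pattern
    conf_percent-- second number (a percentage)
    conf_max    -- third number
    lm_age_min  -- age of the object when it was fetched
                   (fetch time minus Last-Modified), if known
    """
    if age_min <= conf_min:
        return True                  # younger than the minimum: fresh
    if age_min > conf_max:
        return False                 # older than the maximum: stale
    if lm_age_min:
        # in between, the last-modified factor decides
        lm_factor = age_min / lm_age_min
        return lm_factor < conf_percent / 100.0
    return False


if __name__ == "__main__":
    # the pattern from this post: 1440 20% 1440
    print(is_fresh(10, 1440, 20, 1440))      # cached ten minutes ago
    print(is_fresh(1441, 1440, 20, 1440))    # cached a bit over a day ago
```

With minimum and maximum both set to 1440, the percentage never comes into play: anything cached within the last day is fresh, anything older is stale.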

I didn’t spend much time investigating the access control settings, since i figure my setup, with squid only listening on an internal port, already does away with most security concerns.

So this is what the results look like in squid’s access log:

1208325630.099 688 TCP_MISS/200 11103 GET http://localhost/feature/28 - FIRST_UP_PARENT/ text/html
1208325634.274 1 TCP_HIT/200 11109 GET http://localhost/feature/28 - NONE/- text/html

The second token is the number of milliseconds squid needed to process the request: 688 ms for the cache miss versus 1 ms for the hit.
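For longer logs, eyeballing gets tedious. A small sketch for pulling the elapsed time and the result code out of log lines like the two above; the function name is made up, and the field positions are taken from the lines exactly as they appear in this post.

```python
def parse_access_line(line):
    """Extract timestamp, elapsed milliseconds and result code from a log line."""
    fields = line.split()
    return {
        "timestamp": float(fields[0]),       # seconds since the epoch
        "elapsed_ms": int(fields[1]),        # the second token
        "result": fields[2].split("/")[0],   # e.g. TCP_MISS or TCP_HIT
    }


LOG_LINES = [
    "1208325630.099 688 TCP_MISS/200 11103 GET http://localhost/feature/28 - FIRST_UP_PARENT/ text/html",
    "1208325634.274 1 TCP_HIT/200 11109 GET http://localhost/feature/28 - NONE/- text/html",
]

if __name__ == "__main__":
    for log_line in LOG_LINES:
        entry = parse_access_line(log_line)
        print(entry["result"], entry["elapsed_ms"], "ms")
```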

Fun with Alternative Names in Certificates

February 1, 2008 at 7:55 am | Posted in Uncategorized | Leave a comment

Yesterday we eventually got around to putting new certificates on our servers at work. And we tried to do it right. In particular we wanted the certificates to be valid for all DNS names the server can be accessed with.

Easy! Use Alternative Names! So in addition to the Common Name we’ve had before, we put in all the other DNS names as alternative names. Bummer!

The result with Firefox (on various platforms):

When trying the Common Name: The familiar popup

You have attempted to establish a connection with “<Common Name>”. However, the security certificate presented belongs to “<Common Name>”…

Note the funny twist of mentioning the same name twice.

When trying one of the alternative names, it worked well.

So the lesson we learnt: Add the Common Name as Alternative Name, too, and you’ll be happy.
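This matches what RFC 2818 says: if a certificate carries DNS entries in its subjectAltName extension, those are what must be checked, and the Common Name is ignored. A small sketch for catching the mistake before deploying, working on a certificate dictionary in the shape returned by python's ssl.SSLSocket.getpeercert(); the function names and the sample certificate are made up.

```python
def dns_names(cert):
    """All DNS entries from the certificate's subjectAltName."""
    return [value for kind, value in cert.get("subjectAltName", ())
            if kind == "DNS"]


def common_name(cert):
    """The commonName from the certificate's subject, or None."""
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def common_name_missing_from_san(cert):
    """True if there is a Common Name that no alternative name repeats."""
    name = common_name(cert)
    return name is not None and name not in dns_names(cert)


# hypothetical certificate with the mistake described above
SAMPLE_CERT = {
    "subject": ((("commonName", "www.example.org"),),),
    "subjectAltName": (("DNS", "example.org"), ("DNS", "blog.example.org")),
}

if __name__ == "__main__":
    print(common_name_missing_from_san(SAMPLE_CERT))
```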


January 9, 2008 at 10:32 am | Posted in apache, http | Leave a comment

After fiddling around trying to get a proxy rewrite rule to work yet again, it was time for a blog post.

On ubuntu 6.10, things were easy:

a2enmod proxy

and everything is fine.

On SuSE SLES 10, things were harder: Load proxy_http explicitly, too, and in the correct order!

On ubuntu 7.10:

a2enmod proxy

didn’t cut it. After some tearing out of hair, I remembered this proxy_http thing.

a2enmod proxy_http

informs me that proxy_http is already enabled as a dependency of proxy. It still adds a new link to proxy_http.load in mods-enabled. Apparently the line loading proxy_http has moved from proxy.load to its own load config.

web 0.5?

January 2, 2008 at 7:11 pm | Posted in Uncategorized | Leave a comment

Now I don’t want to bash this particular publisher – but looking at the website makes me fear the transition to the digital age will take a long time.

Pretty printing xml

November 16, 2007 at 4:37 pm | Posted in programming, xml | Leave a comment

Simple problems should have simple solutions. As a mathematician I know that’s not always the case; but I still feel it should be. So how simple is pretty printing xml?

The simplest (in terms of usability on a linux machine) I’ve come up with so far:

tidy -i -xml -asxml input.xml
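Where tidy isn't installed but python is, the standard library gets reasonably close. A minimal sketch using xml.dom.minidom; the function name and the two-space indent are my choice.

```python
from xml.dom import minidom


def pretty_print(xml_text, indent="  "):
    """Return xml_text re-serialised with one element per line."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


if __name__ == "__main__":
    print(pretty_print("<a><b>1</b><c/></a>"))
```

Note that input which already contains line breaks and indentation tends to come out with extra blank lines, because the existing whitespace is kept as text nodes.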
