Article 2.0

September 2, 2008 at 5:00 pm | Posted in library, publishing | 2 Comments

Another post in the category “failed miserably at web 2.0”.

So Elsevier just started a contest to get some ideas about how a scientific article might look in a browser today. Since the motto “WHAT IF YOU WERE THE PUBLISHER?” resonated with me, I thought I might just enter the competition – if only to have a peek at what their article markup looks like.

Unfortunately, I couldn’t get through the registration. The registration form is a PDF that has to be filled in and emailed back. Looks like they really do need help with web 2.0; at the very least, they missed the usability and browser-is-the-platform parts. None of the PDF readers on my machine (Ubuntu 7.10) – including acroread – would let me fill in the form, although I admit to not trying too hard.

In short, making sharing of ideas that hard is most definitely not web 2.0.

Migrating data from b2evolution to wordpress μ

August 15, 2008 at 4:29 pm | Posted in programming, publishing, wordpress | 8 Comments

At work, we are just getting under way migrating blogs from a b2evolution installation to wordpress μ. Shopping around for tools to do this didn’t turn up anything usable: either the versions wouldn’t match, or the functionality was too limited (a generic RSS import, for example, doesn’t help with files, users and comments).

So I came up with my own script. I didn’t advertise it at wordpress.org because, just like most other migration tools, it’s obviously tailored to our situation. In particular, it’s only been tested with b2evolution 2.3 and wordpress 2.6; but since it relies mostly on the database schema, it should be compatible with wordpress >= 2.3 as well.

Features

  • migrates users, categories, posts, comments and files.
  • migrates one blog at a time.
  • works for wordpress and wordpress μ.
  • undo functionality, so testing imports is easy.

Limitations

The script is written in Python and is meant to be run on the machine hosting the wordpress installation (so that it can copy files into wp-content). It also assumes that the wordpress database is located on this machine.

The b2evolution installation does not have to be on the same machine, though, because the b2evolution data is scraped from an SQL dump of the database.
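For the curious, the heart of the script is unspectacular. A stripped-down sketch of the approach – the b2evolution table name, the column mapping and the undo bookkeeping below are simplified placeholders and may well differ from your installation:

    # Stripped-down sketch of the approach, not the actual script.
    # The b2evolution table name is an assumption; check your own dump.
    import re
    import MySQLdb

    def rows_from_dump(dump_path, table):
        """Collect the VALUES part of every INSERT statement for one table."""
        sql = open(dump_path).read()
        pattern = re.compile(r"INSERT INTO `%s`.*?VALUES\s*(\(.*?\));" % table,
                             re.DOTALL)
        return pattern.findall(sql)

    wp = MySQLdb.connect(host='localhost', user='wp', passwd='secret',
                         db='wordpress')
    cursor = wp.cursor()

    inserted_ids = []  # remembered, so an undo can simply delete these rows
    for values in rows_from_dump('b2evo_dump.sql', 'evo_items__item'):
        # ... parse `values` and map the b2evolution columns to wp_posts ...
        cursor.execute("INSERT INTO wp_posts (post_title, post_content) "
                       "VALUES (%s, %s)", ('some title', 'some content'))
        inserted_ids.append(cursor.lastrowid)
    wp.commit()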

So in case somebody finds this useful – but not quite perfect – let me know, and I’ll see what I can do to help.

Update: More info about how to use the script and how to do the migration in general is available at https://dev.livingreviews.org/projects/wpmu/wiki/b2evolutionMigration.

so we got wordpress μ …

July 31, 2008 at 6:00 pm | Posted in programming, publishing, wordpress | Leave a comment

… and what i wanted to do most was to create posts automatically via xmlrpc. alas! i couldn’t get it to work for quite some time.

googling turned up some discussion threads which told me that xmlrpc support isn’t enabled by default anymore, possibly for good reason. this is already useful information, because the error you may get from your xmlrpc client is about failed authentication – a strange way of signalling “not enabled”.

anyway, one comment got me almost there: “Site Admin->Blogs->Edit”. but what then? no mention of xmlrpc, API, blogger, you name it. the setting to change turned out to be “Advanced edit”: set it from 0 to 1, and xmlrpc should work.
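for the record, this is roughly the kind of call that failed with the authentication error before the switch and worked afterwards – url, blog id and credentials are placeholders:

    # minimal metaWeblog.newPost with python's xmlrpclib;
    # url, blog id and credentials below are placeholders
    import xmlrpclib

    server = xmlrpclib.ServerProxy('http://blogs.example.org/myblog/xmlrpc.php')
    post = {'title': 'hello from xmlrpc',
            'description': 'this post was created automatically'}
    # arguments: blog id, username, password, content struct, publish flag
    post_id = server.metaWeblog.newPost('1', 'admin', 'secret', post, True)
    print 'created post', post_id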

web 2.0 in the library

September 9, 2007 at 6:48 pm | Posted in library, openaccess, publishing | Leave a comment

Coming back from a workshop titled “web 2.0 applications for librarians”, I thought about what web 2.0 means for me.

To me, web 2.0 is basically defined as the third part of the trinity

  • REST – HTTP done right
  • AJAX – JavaScript done right
  • Web 2.0 – the Web done right

And just like the other two, it’s defined mostly as a correction of earlier mistakes. It’s no longer cool – or even OK – to publish a web site as PDF to browsers while serving some of the data via a WS-*-style web service to other clients. Instead, once you stop assuming that browsers are the only user agents, quite a few of the web 2.0 developments seem very natural.

So the participative aspect of web 2.0 starts well before everyone creates content; it starts with not making restrictive assumptions about who – people or programs – will use your site, and how. Let the web participate in reusing your content.

I think this aspect is particularly important for librarians to embrace. They don’t (necessarily) need to have their own ideas about how their data (catalogs, …) might be used, but they should care about making it accessible:

  • Provide URLs (as stable as possible) for resources.
  • Publish new resources via feeds.
  • Offer open interfaces (SRU, OpenSearch, unAPI, COinS, …).
  • Make sure you publish semantically meaningful HTML (because screen scrapers have become first-class citizens of the web, too).

My guess is that the above recommendations also make sense for intranet services in many scientific institutions, because the mechanisms for mashing up data have become so easy that they are actually available to end users as well.
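As a small illustration of how low the barrier has become: pulling the newest resources out of any such feed takes a few lines with Python’s feedparser library (the catalog URL here is made up):

    # list the newest resources from a (hypothetical) catalog feed
    import feedparser

    feed = feedparser.parse('http://opac.example.org/new-acquisitions.rss')
    for entry in feed.entries[:10]:
        print entry.title, '->', entry.link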

So, let’s open up the catalog and see what happens.

i didn’t expect openaccess, but …

September 3, 2007 at 7:04 am | Posted in library, openaccess, publishing | Leave a comment

So while checking links programmatically, I came across http://npg.nature.com – the Nature Publishing Group’s portal. It’s a publisher, so you’d expect some interest in giving the public access.
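The requests below were sent more or less bare-bones. In Python 2’s httplib, a single hop looks roughly like this – only the host is real; the code is a sketch, not the exact link checker I was running:

    # issue one GET without following redirects, dump status and headers
    import httplib

    conn = httplib.HTTPConnection('npg.nature.com')
    conn.request('GET', '/')
    resp = conn.getresponse()
    print resp.status, resp.reason
    for name, value in resp.getheaders():
        print '%s: %s' % (name, value)
    conn.close()

This is what I got: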

--- request ---
GET / HTTP/1.1
Host: npg.nature.com

--- response ---
HTTP/1.1 302 Moved Temporarily
content-length: 0
set-cookie: ...
server: Apache-Coyote/1.1
connection: close
location: http://npg.nature.com/index.html
date: Mon, 03 Sep 2007 06:40:00 GMT
content-type: text/plain; charset=ISO-8859-1

Ok. We’ll follow:

--- request ---
GET /index.html HTTP/1.1
Host: npg.nature.com

--- response ---
HTTP/1.1 302 Found
content-length: 292
set-cookie: ...
server: Apache/2.0.46 (Red Hat)
connection: close
location: http://www.nature.com/npg
date: Mon, 03 Sep 2007 06:39:56 GMT
content-type: text/html; charset=iso-8859-1

Ah. We’re still not there.

--- request ---
GET /npg HTTP/1.1
Host: www.nature.com

--- response ---
HTTP/1.1 302 Moved Temporarily
content-length: 0
set-cookie: ...
server: Apache-Coyote/1.1
connection: keep-alive
location: http://www.nature.com/npg/
date: Mon, 03 Sep 2007 06:40:34 GMT
content-type: text/plain; charset=UTF-8

Forgot the damn trailing slash …

--- request ---
GET /npg/ HTTP/1.1
Host: www.nature.com

--- response ---
HTTP/1.1 404 Not Found
content-length: 19547
set-cookie: ...
server: Apache-Coyote/1.1
connection: keep-alive
date: Mon, 03 Sep 2007 06:40:35 GMT
content-type: text/html; charset=UTF-8

Oops.

Ok. They’re probably picky about user agents, cookies and so on. So let’s bring in the headers:

--- request ---
GET / HTTP/1.1
Host: npg.nature.com
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Keep-Alive: 300
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20060601 Firefox/2.0.0.4 (Ubuntu-edgy)
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive

--- response ---
HTTP/1.1 302 Moved Temporarily
content-length: 0
set-cookie: BIGipServerJava=302055690.20480.0000; expires=Mon, 03-Sep-2007 06:57:02 GMT; path=/
server: Apache-Coyote/1.1
connection: close
location: http://npg.nature.com/index.html
date: Mon, 03 Sep 2007 06:45:00 GMT
content-type: text/plain; charset=ISO-8859-1

--- request ---
GET /index.html HTTP/1.1
Host: npg.nature.com
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Keep-Alive: 300
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20060601 Firefox/2.0.0.4 (Ubuntu-edgy)
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://npg.nature.com
Cookie: BIGipServerJava=302055690.20480.0000; expires=Mon; 03-Sep-2007 06:57:02 GMT; path=/; JSESSIONID=80AC4030C76B7E28DFE67DC7C2BA5993; Path=/

--- response ---
HTTP/1.1 302 Found
content-length: 292
set-cookie: BIGipServerJava=302055690.20480.0000; expires=Mon, 03-Sep-2007 06:57:03 GMT; path=/
server: Apache/2.0.46 (Red Hat)
connection: close
location: http://www.nature.com/npg
date: Mon, 03 Sep 2007 06:45:01 GMT
content-type: text/html; charset=iso-8859-1

--- request ---
GET /npg HTTP/1.1
Host: www.nature.com
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Keep-Alive: 300
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20060601 Firefox/2.0.0.4 (Ubuntu-edgy)
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://npg.nature.com/index.html
Cookie: BIGipServerJava=302055690.20480.0000; expires=Mon; 03-Sep-2007 06:57:03 GMT; path=/

--- response ---
HTTP/1.1 302 Moved Temporarily
content-length: 0
set-cookie: BIGipServerJava=302055690.20480.0000; expires=Mon, 03-Sep-2007 06:57:04 GMT; path=/
server: Apache-Coyote/1.1
connection: keep-alive
location: http://www.nature.com/npg/
date: Mon, 03 Sep 2007 06:45:23 GMT
content-type: text/plain; charset=ISO-8859-1

--- request ---
GET /npg/ HTTP/1.1
Host: www.nature.com
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Keep-Alive: 300
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20060601 Firefox/2.0.0.4 (Ubuntu-edgy)
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Referer: http://www.nature.com/npg
Cookie: BIGipServerJava=302055690.20480.0000; expires=Mon; 03-Sep-2007 06:57:04 GMT; path=/

--- response ---
HTTP/1.1 404 Not Found
content-length: 19547
set-cookie: BIGipServerJava=302055690.20480.0000; expires=Mon, 03-Sep-2007 06:57:05 GMT; path=/
vary: Accept-Encoding
server: Apache-Coyote/1.1
connection: keep-alive
date: Mon, 03 Sep 2007 06:45:25 GMT
content-type: text/html; charset=ISO-8859-1

Hm. Didn’t really help. But

content-length: 19547

seems pretty big for an error page. http://www.nature.com/npg/ looks fine in the browser, too. So let’s check the response headers Firefox got:

Response Headers - http://www.nature.com/npg/
Server: Apache-Coyote/1.1
Content-Type: text/html; charset=ISO-8859-1
Date: Mon, 03 Sep 2007 06:47:07 GMT
Content-Length: 19547
Connection: keep-alive
Vary: Accept-Encoding
Set-Cookie: BIGipServerJava=302055690.20480.0000; expires=Mon, 03-Sep-2007 06:58:47 GMT; path=/  
404 Not Found

What is this supposed to be good for? I mean, as far as I know, you’ve got to actively do something in terms of server configuration to get this behaviour. This doesn’t inspire much confidence in Nature being ready for the web.
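If you check links programmatically, it may be worth guarding against this kind of mismatch. A crude heuristic along the lines of what I was doing – the byte threshold is arbitrary:

    # flag non-200 responses that arrive with a suspiciously large body
    import httplib

    def looks_fishy(host, path, threshold=10000):
        conn = httplib.HTTPConnection(host)
        conn.request('GET', path)
        resp = conn.getresponse()
        body = resp.read()
        conn.close()
        return resp.status != 200 and len(body) > threshold

    print looks_fishy('www.nature.com', '/npg/')  # True, per the trace above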
