[Pyrex] Pythonic wrapping of libxml2

Martijn Faassen faassen at infrae.com
Wed Sep 29 18:36:42 CEST 2004


vng1 at mac.com wrote:
[snip]
> 2 - I badly need libxml2 in an application I am writing.  The modules I 
> absolutely need are tree and xpath.  If I can get  relaxng and 
> xmlreader, that would be really nice, but mostly I just need tree and 
> xpath.

I'm in a similar position at Infrae, where we use libxml2 in a bunch of 
our applications. Mostly simple parsing, the tree API and XPath, though 
we are starting to use XSLT now too. But the libxml2 bindings are such a 
pain to work with.

> The 'standard' libxml2 bindings are not just un-pythonic, they're just 
> plain broken.  If you are building up a document and you use XPath at 
> the same time, you have no way of managing the memory allocation without 
> leaking memory.
> XPath queries always return new xmlNode wrappers.  xmlNode's do not 
> properly implement object identity so you can't 'know' when to free 
> something from memory.

I am not sure I understand this particular use case, but I think it's 
far too easy to leak memory with the libxml2 bindings as they currently 
stand. I've been thinking an awful lot about how to do memory management 
properly in the context of a writable DOM, though, so that may be 
related. Perhaps you have some ideas. See here for some of my run-on 
thinking; the file ends without a good solution as I haven't found an 
attractive one yet:

http://codespeak.net/svn/lxml/trunk/doc/memorymanagement.txt
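
To make the object identity problem from the quoted text concrete, here 
is a rough sketch of one possible approach: keep a weak registry so that 
every lookup of the same underlying node hands back the same Python 
wrapper. This is not lxml code and not the libxml2 API. CNode, NodeProxy 
and get_proxy are hypothetical names, and CNode is only a pure-Python 
stand-in for a C-level xmlNode.

```python
import weakref


class CNode:
    """Hypothetical stand-in for a C-level xmlNode (not the libxml2 API)."""

    def __init__(self, name):
        self.name = name
        self.children = []


class NodeProxy:
    """Hypothetical Python-side wrapper that keeps its C node alive."""

    def __init__(self, c_node):
        self._c_node = c_node

    @property
    def name(self):
        return self._c_node.name


# Maps the identity of a C node to its live wrapper. Entries disappear
# automatically once no Python code holds the wrapper any more.
_proxies = weakref.WeakValueDictionary()


def get_proxy(c_node):
    """Return the one wrapper for c_node, creating it only if needed."""
    key = id(c_node)  # assumption: stands in for the C pointer value
    proxy = _proxies.get(key)
    if proxy is None:
        proxy = NodeProxy(c_node)
        _proxies[key] = proxy
    return proxy


root = CNode("root")
child = CNode("child")
root.children.append(child)

# Naive wrapping: every call produces a distinct object, so identity
# checks on the Python side tell you nothing about the underlying node.
naive_same = NodeProxy(child) is NodeProxy(child)

# Registry-based wrapping: the same node always yields the same wrapper.
first = get_proxy(child)
second = get_proxy(root.children[0])
cached_same = first is second
```

With something like this, a query result could be routed through 
get_proxy so that "is" comparisons work as expected. It says nothing yet 
about when the C-level tree itself may be freed, which is the hard part.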

> http://mail.gnome.org/archives/xml/2004-June/msg00153.html
> 
> Obviously - that kind of a bug makes using libxml2 useless for a long 
> running process - so I'm trying to rectify that using Pyrex.

So as I was saying, I've been trying to do similar things with lxml, so 
would you like to join forces? Hm, I see you've been doing test postings 
to the lxml mailing list, so I hope that's a "yes". I've done a lot of 
work on wrapping the tree API both DOM-style and ElementTree-style, 
though neither is finished yet.

This invitation also goes to anyone else who is interested in a sane 
Python wrapper for libxml2/libxslt, of course.

Regards,

Martijn



