[Pyrex] lxml's patches against cython

William Stein wstein at gmail.com
Fri Jul 27 21:48:05 CEST 2007


On 7/26/07, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Hi,
>
> here is an updated C-API patch for cython (capi-diff.patch, actually against
> sagex-20070710, hadn't downloaded cython yet while doing offline work).
>
> However, when I compile lxml with the resulting translator, it yields a number
> of errors, some of which are ok (and easily fixed), but some of which should
> be fixed in cython (and some also exist in Pyrex).
>
> The biggest number of errors (note that lxml wraps C libraries) result from
> the fact that Pyrex doesn't handle enums as ints, so you can't |, &, etc. enum
> values. That was a bug in 0.9.5 that is still not fixed in an official
> release. The attached "enum.patch" fixes it (not the first time it goes
> through this list, BTW).
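
To illustrate what "handling enums as ints" means here, a minimal sketch in plain Python rather than Pyrex: `IntEnum` members behave as integers, so they combine with | and & the way C enum values do. The `ParseOption` names and values are hypothetical stand-ins, not constants taken from any wrapped C library.

```python
from enum import IntEnum


class ParseOption(IntEnum):
    # Hypothetical flag values, chosen only for illustration.
    RECOVER = 1
    NO_NETWORK = 2
    STRIP_BLANKS = 4


def has_option(options, flag):
    """Return True if the bit for `flag` is set in the `options` mask."""
    return (options & flag) != 0


# Enum members are ints, so bitwise operators apply to them directly.
options = ParseOption.RECOVER | ParseOption.STRIP_BLANKS
```

A translator that treats enums as a distinct non-integer type would reject the last line, which is the class of error described above.
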
>
> A cython-specific feature seems to be that it knows about (C-implemented)
> builtins and requires them to obey a specific signature. However, the
> "unicode" function can be used without argument in Python, so the easiest way
> to create an empty unicode object in Pyrex is to say "unicode()". cython maps
> this to PyObject_Unicode, which requires an argument. Easy to work around in
> the code with unicode('') or a direct call to the C-function, though.
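
The two spellings are equivalent at the Python level, which is why the workaround is harmless. A small sketch, assuming plain Python semantics; the fallback to `str` is only there so that the snippet also runs on interpreters without a `unicode` builtin.

```python
try:
    unicode  # builtin on Python 2
except NameError:
    unicode = str  # assumption: on Python 3, str is the unicode type

# Zero-argument form: valid Python, but it needs an argument once the
# call is mapped to a C function with a fixed signature.
empty_no_arg = unicode()

# Workaround from the message: pass an empty string explicitly.
empty_with_arg = unicode('')
```
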
>
> I think cython should support unicode literals, preferably following the
> source encoding PEPs for Py3k (defaulting to UTF-8 etc.), but allowing only
> ASCII escapes would be fine for the beginning. It would at least allow you to
> create unicode strings straight away in the source, without explicitly
> wrapping it in unicode("literal", encoding).
>
> Although I was opposed at the beginning, I'm actually quite happy with the
> compile-time globals/builtins detection now. cython found two long-standing
> typos in never-tested-corner-case-code of lxml.etree. :)
>
> I also added a patch that allows switching off assertions at compile time
> based on a compiler define (nicely supported by distutils).
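
A sketch of how such a define could be passed through distutils-style build options. The macro name `WITHOUT_ASSERTIONS`, the module name, and the source path are hypothetical, since the patch's actual define is not named in this message. `define_macros` is the standard `Extension` keyword for compiler defines.

```python
def extension_options(disable_assertions):
    """Build keyword arguments for a distutils/setuptools Extension."""
    macros = []
    if disable_assertions:
        # A (name, None) pair defines the macro without a value (-DNAME).
        macros.append(("WITHOUT_ASSERTIONS", None))  # hypothetical name
    return {
        "name": "example.module",             # hypothetical module name
        "sources": ["src/example/module.c"],  # hypothetical generated C file
        "define_macros": macros,
    }
```

In a setup.py the result would be used as `Extension(**extension_options(True))`, so that a release build compiles the assertions out and a debug build keeps them.
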
>
> One remaining problem is that the module is now named "src.lxml.etree"
> internally (visible in exceptions, especially doctests). But "src" is not a
> package, just the main source directory. What is the best way to fix that?

I'm not sure yet, but we'll figure it out.

> Apart from that, the C-API implementation is working nicely with cython, so
> now the trunk of lxml can be compiled with cython plus the attached patches.

Excellent.  Many thanks for the patches!

> In case cython wants a bug tracker for project management, I'm currently
> getting a very good impression of Ubuntu's launchpad. Simple, non-intrusive
> sign-up, pretty good features and a close-to-intuitive interface.

Great idea, we definitely need such a thing for cython soon.
Could you set up a launchpad bug/feature tracker for Cython and post a link?
You could start by including all the issues you mention above, plus some
of the ones listed here under "goals": http://cython.org/

William


