[Pyrex] intern(str) fails if string is not a C string

Robert Bradshaw robertwb at math.washington.edu
Thu Oct 22 06:27:33 CEST 2009


On Oct 21, 2009, at 1:10 PM, John Arbash Meinel wrote:

> This bug seems to affect both Cython and Pyrex.
>
> Namely, I'm parsing a string that has NUL characters in it (known
> width), which is also likely to be redundant within the data stream.
>
> I'm doing something like:
>
> mystr = PyString_FromStringAndSize(NULL, count+other_count)
> memcpy(PyString_AS_STRING(mystr), some_bytes, count)
> memcpy(PyString_AS_STRING(mystr)+count, more_bytes, other_count)
>
> mystr = intern(mystr)
>
> This fails because Pyrex and Cython both effectively translate this code
> into:
>
> char *temp;
> temp = PyString_AsString(mystr);
> mystr = PyString_InternFromString(temp);
>
> With, of course, appropriate error checking and incref/decref handling.
>
> I would, of course, like to use PyString_InternInPlace(PyObject **),
> however that fails for other reasons ("taking address of a non l-value")
> if you try to do:
>
> cdef extern from "Python.h":
>  ctypedef struct PyObject:
>    pass
>  void PyString_InternInPlace(PyObject **)
>
>
> st = 'my string'
> PyString_InternInPlace(&<PyObject *>st)
>
>
> Now I can probably do some trickery with
>
> cdef PyObject *as_ptr
>
> as_ptr = <PyObject *>st
> PyString_InternInPlace(&as_ptr)
> st = <object>as_ptr
>
> However, because InternInPlace may destroy 'st', and that final
> assignment will be doing a DECREF on the 'st' object, I'm pretty sure it
> will blow up.
>
> It feels like the only thing left to do is define a macro in a header
> with something like:
>
> #define INTERN_STRING(obj) (PyString_InternInPlace(&(obj)))
>
> and then
>
> cdef extern from "myheader.h":
>  INTERN_STRING(object)
>
> Is this true?
>
> John

Good catch. I've disabled optimizing the intern builtin in Cython for
now. We could re-enable it for char* only if someone finds interning
strings to be a bottleneck.

- Robert
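
A minimal sketch of the failure mode discussed in this thread, for anyone who wants to see it without building an extension. This is not the code Pyrex or Cython generate. It assumes Python 3, uses ctypes.c_char_p as a stand-in for the char* round trip through PyString_AsString / PyString_InternFromString, and uses sys.intern in place of the Python 2 intern builtin. The helper name intern_via_c_string and the sample payload are hypothetical.

```python
import ctypes
import sys


def intern_via_c_string(data: bytes) -> bytes:
    """Mimic a char* round trip (hypothetical helper, not a real API).

    c_char_p.value reads a NUL-terminated C string, so anything after
    the first NUL byte is invisible, much like it would be to a C
    function that only receives a char*.
    """
    return ctypes.c_char_p(data).value


# Known-width data with an embedded NUL, as in the original report.
payload = b"key\x00value"

# The C-string path silently drops everything after the first NUL.
truncated = intern_via_c_string(payload)
print(len(payload), len(truncated))

# Interning the object itself uses the stored length, so nothing is lost.
text = payload.decode("latin-1")
interned = sys.intern(text)
print(len(interned))
```

The C-string path only ever sees the bytes before the first NUL, which is why interning has to work on the string object (which carries its own length) rather than on a bare char*.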



