[Pyrex] intern(str) fails if string is not a C string

John Arbash Meinel john at arbash-meinel.com
Wed Oct 21 22:10:53 CEST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This bug seems to affect both Cython and Pyrex.

Namely, I'm parsing a fixed-width string that contains NUL characters,
and the same value is likely to recur many times within the data stream.

I'm doing something like:

cdef char *buf
mystr = PyString_FromStringAndSize(NULL, count+other_count)
buf = PyString_AS_STRING(mystr)  # writable buffer of the new string
memcpy(buf, some_bytes, count)
memcpy(buf+count, more_bytes, other_count)

mystr = intern(mystr)

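This fails because Pyrex and Cython both effectively translate the
intern() call into the following, and PyString_InternFromString() takes
a NUL-terminated char *, so everything after the first NUL is lost: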

char *temp;
temp = PyString_AsString(mystr);
mystr = PyString_InternFromString(temp);

With, of course, appropriate error checking and incref/decref handling.
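
The truncation can be reproduced without Pyrex. The following is a minimal sketch in plain Python 3 using ctypes, not the code Pyrex generates: build_buffer() and intern_sized() are hypothetical helper names, the sample bytes are made up, and sys.intern() on a latin-1 decoded str stands in for the Python 2 intern(). It copies two byte runs into one fixed-width buffer, then reads the buffer back both as a NUL-terminated C string and by its known width.

```python
import ctypes
import sys


def build_buffer(some_bytes, more_bytes):
    """Copy two byte runs back to back into one fixed-width C buffer."""
    count, other_count = len(some_bytes), len(more_bytes)
    buf = ctypes.create_string_buffer(count + other_count)
    base = ctypes.addressof(buf)
    ctypes.memmove(base, some_bytes, count)
    ctypes.memmove(base + count, more_bytes, other_count)
    return buf


def intern_sized(buf):
    """Intern using the full known width (Python 3 interns str, so decode 1:1)."""
    return sys.intern(buf.raw.decode("latin-1"))


# Made-up sample data with embedded NUL bytes.
buf = build_buffer(b"key\x00", b"\x00\x01val")
c_view = buf.value    # stops at the first NUL, as strlen()-based APIs do
full_view = buf.raw   # all count + other_count bytes
```

Here c_view holds only the bytes before the first NUL, while full_view keeps the whole buffer. Any interning path that goes through a char * sees only the former.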

I would, of course, prefer to use PyString_InternInPlace(PyObject **),
but that fails for a different reason. The compiler reports "taking
address of a non l-value" if you try to do:

cdef extern from "Python.h":
  ctypedef struct PyObject:
    pass
  void PyString_InternInPlace(PyObject **)


st = 'my string'
PyString_InternInPlace(&<PyObject *>st)


Now I can probably do some trickery with a temporary pointer:

cdef PyObject *as_ptr

as_ptr = <PyObject *>st
PyString_InternInPlace(&as_ptr)
st = <object>as_ptr

However, because PyString_InternInPlace may DECREF (and so destroy) the
original 'st' when an equal string is already interned, and that final
assignment will do another DECREF on the old 'st' object, I'm pretty
sure it will blow up.

It feels like the only thing left to do is define a macro in a header
with something like:

#define INTERN_STRING(obj) (PyString_InternInPlace(&(obj)))

and then

cdef extern from "myheader.h":
  void INTERN_STRING(object)

Is this true?

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrfas0ACgkQJdeBCYSNAAMagACgqc/z4fwv7/XnJDsNrH01AE2K
6KcAoLJiUjwe8dcO2AGvha9OBX7k+o5r
=Ejlk
-----END PGP SIGNATURE-----


