[Pyrex] C-API implementation in Pyrex 0.9.6

Stefan Behnel stefan_ml at behnel.de
Sat Oct 13 09:14:54 CEST 2007


Hi Greg,

Greg Ewing wrote:
> However, I can see that you might want to export some
> types for use by Pyrex but not for C, so I'll give
> this some more thought.

Not even to Pyrex. What I would like to do is select which C-level attributes
of which types I want to export. And if such a type references another type
that I do not want to export, I do not want to be required to export that
private referenced type either.

Here is a real-life example from the current Cython-compiled lxml. I have two
public types and a public function in a .pyx file called etree.pyx:

------------------
cimport tree

cdef public class _Document [ type LxmlDocumentType, object LxmlDocument ]:
    cdef unsigned int _ns_counter
    cdef object _prefix_format
    cdef tree.xmlDoc* _c_doc
    cdef _BaseParser _parser

cdef public class _Element [ type LxmlElementType, object LxmlElement ]:
    cdef python.PyObject* _gc_doc
    cdef _Document _doc
    cdef tree.xmlNode* _c_node
    cdef object _tag
    cdef object _attrib

cdef public object getAttributeValue(_Element element, key, default):
    return "" # whatever
------------------

This is the public C-API definition in a separate .pxd file called
etreepublic.pxd:

------------------
cimport tree      # also C-includes relevant headers

cdef extern from "etree.h":
    cdef int import_etree(etree_module) except -1

    cdef class lxml.etree._Document [ object LxmlDocument ]:
        cdef tree.xmlDoc* _c_doc

    cdef class lxml.etree._Element [ object LxmlElement ]:
        cdef _Document _doc
        cdef tree.xmlNode* _c_node

    cdef object getAttributeValue(_Element element, key, default)
------------------

"etree.h" is the header file generated from etree.pyx that has the complete
type definitions:

------------------
__PYX_EXTERN_C DL_IMPORT(PyTypeObject) LxmlDocumentType;

struct LxmlDocument {
  PyObject_HEAD
  struct __pyx_vtabstruct_5etree__Document *__pyx_vtab;
  unsigned int _ns_counter;
  PyObject *_prefix_format;
  xmlDoc (*_c_doc);
  struct __pyx_obj_5etree__BaseParser *_parser;
};
__PYX_EXTERN_C DL_IMPORT(PyTypeObject) LxmlElementType;

struct LxmlElement {
  PyObject_HEAD
  PyObject (*_gc_doc);
  struct LxmlDocument *_doc;
  xmlNode (*_c_node);
  PyObject *_tag;
  PyObject *_attrib;
};
[...]
static PyObject *((*getAttributeValue)(struct LxmlElement *,PyObject *,PyObject *));
[...]
[plus C-API import code]
------------------

Note that "__pyx_obj_5etree__BaseParser" is undefined in _Document - it's not
public and it's not written into the header file. But the C compiler accepts
this, since the field is only a pointer to an incomplete struct type, and no
user should care as the public parts of the type are defined in the .pxd file.
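
To illustrate why the undefined struct is harmless, here is a minimal C
sketch. It is not taken from the generated etree.h; the names
"private_parser", "public_document" and "document_has_parser" are made up for
this example. It only assumes standard C rules for pointers to incomplete
types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private type: declared, but never defined in this file. */
struct private_parser;

/* Hypothetical public struct, analogous to LxmlDocument. It stores only a
   pointer to the private type, so the compiler needs the size of a pointer,
   not the layout of the struct behind it. */
struct public_document {
    unsigned int ns_counter;
    struct private_parser *parser;   /* pointer to an incomplete type */
};

/* Code that sees only the declarations above can read the public fields and
   compare or pass on the private pointer, but it cannot dereference it. */
static int document_has_parser(const struct public_document *doc)
{
    assert(doc != NULL);
    return doc->parser != NULL;
}
```

Any attempt to access a member through "parser" would fail to compile here,
which is the kind of separation between public and private parts described
above.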

In current Cython, I can cimport and use the C-API as follows:

------------------
from etreepublic cimport _Element, _Document
from etreepublic cimport getAttributeValue, import_etree

cdef object etree
from lxml import etree
# initialize C-API of lxml.etree (Pyrex does this behind the scenes)
import_etree(etree)

# now do something
cdef _Element element
element = etree.Element("tagname")                   # Python
print getAttributeValue(element, "attribute", None)  # C
print element._c_node.name                           # C
------------------

Pyrex 0.9.6 does not allow me to do this. The main reason is that the public
API must be defined in the .pxd file of the same name, which must define the
type completely.

So I would like to see these things changed in Pyrex:

1) Only generate public types into the module header file to avoid code
dependencies on module internals and irrelevant external header files.

2) Support incomplete type definitions in .pxd files to let users choose what
is public and what isn't. (Alternatively, do not require the public API to be
defined in the module .pxd file.)

3) Avoid the Python module namespace pollution by accessing the C-API through
a single object (preferably underscore-prefixed), instead of one per function.
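
As a rough sketch of what point 3 could look like on the C side: all exported
functions sit in one table, and the importing side fetches that one table
instead of one object per function. This is only an illustration under my own
assumptions, not what Pyrex or Cython generates; "demo_capi", "demo_add",
"demo_name_length" and "demo_import_capi" are hypothetical names, and the
run-time lookup in the module is replaced by a plain function call:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical API table: one slot per exported C function. */
struct demo_capi {
    int (*add)(int, int);
    size_t (*name_length)(const char *);
};

static int demo_add(int a, int b)
{
    return a + b;
}

static size_t demo_name_length(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}

/* The exporting module would publish exactly one object wrapping this table,
   for example under a single underscore-prefixed module attribute. */
static const struct demo_capi demo_capi_table = {
    demo_add,
    demo_name_length
};

/* Stand-in for the import step. A real implementation would look the table
   up in the Python module at run time; here it is simply returned. */
static const struct demo_capi *demo_import_capi(void)
{
    return &demo_capi_table;
}
```

A user module would then call "demo_import_capi()" once and go through the
returned pointer, e.g. "api->add(2, 3)". Adding a function to the API means
adding a slot to the table, not another name in the module namespace.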

I hope this makes my criticism of Pyrex's current implementation a bit clearer.

Stefan


