[Pyrex] cdef'd classes initialization

Daniele Varrazzo daniele.varrazzo at gmail.com
Fri Aug 11 13:09:38 UTC 2006


Greg Ewing wrote:
> Daniele Varrazzo wrote:
> 
>> I stumbled against the Pyrex __new__ peculiar behavior: all the 
>> superclasses' __new__ get called and no communication is available in 
>> the calls chain.
> 
> I realise that this is an awkward restriction at times,
> and I'm thinking about ways of removing it, but it's
> tricky.
> 
>> Which can be read "In __new__ method, virtual methods are not virtual",
>> just as in C++ constructors.
> 
> That behaviour mostly fell out of the implementation.
> Setting the vtable pointer is done by the tp_new code
> that Pyrex generates, just before calling the object's
> __new__ method. This means that it's initially set to
> the vtable of the root class, its __new__ is executed,
> then it's re-set to the vtable of the next class down
> the chain, its __new__ is executed, etc.
> 
> However, this seemed to be not-unreasonable behaviour,
> since calling the methods of a given type before the
> object has been fully initialised as an object of that
> type could lead to surprises. So I didn't worry about
> trying to do it any differently.

The behaviour is indeed reasonable. In C++, however, you can use
protected constructors to customize how subclasses are initialized, so
the need to call virtual methods can be avoided.

Virtual methods called at that point would have to deal with an instance
that has not been initialized yet, so in general they are doomed to fail.
But not always: a method specifically crafted for initialization (i.e.
one that does not access the instance state) could be used to customize
the __new__ method, for example by supplying it a different constant value.
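
As a rough illustration of such an initialization-only hook, here is a
sketch in plain Python (not Pyrex), where methods are already virtual
while the object is being constructed. The names Base, Derived and
default_size are hypothetical and appear only in this example.

```python
class Base(object):
    def __init__(self):
        # The hook runs before any instance state exists,
        # so it must not read attributes of self.
        self.size = self.default_size()

    def default_size(self):
        # Returns a constant only; safe to call on a half-built object.
        return 10


class Derived(Base):
    def default_size(self):
        # The subclass customizes initialization by overriding the constant.
        return 20


base_size = Base().size
derived_size = Derived().size
print("Base size:", base_size)
print("Derived size:", derived_size)
```

The hook never touches self, which is what makes it safe to call before
the subclass part of the object has been set up.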

> However, one of the tricky things I mentioned in the
> first paragraph concerns where and how to set the
> vtable pointer, so this might change one day.

I tried the following implementation. I wrote:

----[ testinh.pyx ]----
cdef class A:
     def __new__(self):
         print "A.__new__()"
         print "size:", self.getSize()

     cdef int getSize(self):
         return 10

cdef class B(A):
     def __new__(self):
         print "B.__new__()"

     cdef int getSize(self):
         return 20

cdef class C(B):
     def __new__(self):
         print "C.__new__()"

     cdef int getSize(self):
         return 30
----[ testinh.pyx ]----

the output of

     python -c "import testinh; testinh.C()"

is

     A.__new__()
     size: 10
     B.__new__()
     C.__new__()

If testinh.c (generated with Pyrex version 0.9.4.1) is patched as follows:

----[ testinh.c patch ]----
@@ -375,10 +375,13 @@
  static struct __pyx_vtabstruct_7testinh_B __pyx_vtable_7testinh_B;

  static PyObject *__pyx_tp_new_7testinh_B(PyTypeObject *t, PyObject *a, PyObject *k) {
-  PyObject *o = __pyx_ptype_7testinh_A->tp_new(t, a, k);
+  PyObject *o = (*t->tp_alloc)(t, 0);
    struct __pyx_obj_7testinh_B *p = (struct __pyx_obj_7testinh_B *)o;
    *(struct __pyx_vtabstruct_7testinh_B **)&p->__pyx_base.__pyx_vtab = __pyx_vtabptr_7testinh_B;
-  if (__pyx_f_7testinh_1B___new__(o, a, k) < 0) {
+  if (__pyx_f_7testinh_1A___new__(o, a, k) < 0) {
+    Py_DECREF(o); o = 0;
+  }
+  else if (__pyx_f_7testinh_1B___new__(o, a, k) < 0) {
      Py_DECREF(o); o = 0;
    }
    return o;
@@ -524,10 +527,16 @@
  static struct __pyx_vtabstruct_7testinh_C __pyx_vtable_7testinh_C;

  static PyObject *__pyx_tp_new_7testinh_C(PyTypeObject *t, PyObject *a, PyObject *k) {
-  PyObject *o = __pyx_ptype_7testinh_B->tp_new(t, a, k);
+  PyObject *o = (*t->tp_alloc)(t, 0);
    struct __pyx_obj_7testinh_C *p = (struct __pyx_obj_7testinh_C *)o;
    *(struct __pyx_vtabstruct_7testinh_C **)&p->__pyx_base.__pyx_base.__pyx_vtab = __pyx_vtabptr_7testinh_C;
-  if (__pyx_f_7testinh_1C___new__(o, a, k) < 0) {
+  if (__pyx_f_7testinh_1A___new__(o, a, k) < 0) {
+    Py_DECREF(o); o = 0;
+  }
+  else if (__pyx_f_7testinh_1B___new__(o, a, k) < 0) {
+    Py_DECREF(o); o = 0;
+  }
+  else if (__pyx_f_7testinh_1C___new__(o, a, k) < 0) {
      Py_DECREF(o); o = 0;
    }
    return o;
----[ testinh.c patch ]----

the output becomes

     $ python -c "import testinh; testinh.A()"
     A.__new__()
     size: 10

     $ python -c "import testinh; testinh.B()"
     A.__new__()
     size: 20
     B.__new__()

     $ python -c "import testinh; testinh.C()"
     A.__new__()
     size: 30
     B.__new__()
     C.__new__()

This would enable people to customize a base class behaviour using 
virtual methods. In the patched source, each class doesn't call its 
superclass to receive an object, but instead it creates one using 
tp_alloc, sets its own vtable and then calls the __new__ methods in the 
proper order.
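
The patched ordering can be modelled the same way in pure Python. Again
this is only a sketch under assumptions: Obj, TYPES and patched_tp_new
are hypothetical names, and the real work happens in the generated C.

```python
log = []


class Obj(object):
    """Stand-in for the allocated object; vtab models the vtable pointer."""
    vtab = None


def a_new(o):
    log.append("A.__new__()")
    log.append("size: %d" % o.vtab["getSize"](o))


def b_new(o):
    log.append("B.__new__()")


def c_new(o):
    log.append("C.__new__()")


# type name -> (vtable, __new__ chain ordered root first)
TYPES = {
    "A": ({"getSize": lambda o: 10}, [a_new]),
    "B": ({"getSize": lambda o: 20}, [a_new, b_new]),
    "C": ({"getSize": lambda o: 30}, [a_new, b_new, c_new]),
}


def patched_tp_new(name):
    vtab, chain = TYPES[name]
    o = Obj()           # models (*t->tp_alloc)(t, 0)
    o.vtab = vtab       # most-derived vtable set once, up front
    for new in chain:   # then every __new__ runs, root class first
        new(o)
    return o


for name in ("A", "B", "C"):
    patched_tp_new(name)
print("\n".join(log))
```

Since the most-derived vtable is installed before any __new__ runs, the
call made from A's __new__ already dispatches to the subclass method.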

Of course the example is minimal: each subclass should perform all the 
work currently only done in the base class (such as setting the 
__weakref__ slot).

I took a look at the Pyrex code, and ModuleNode.generate_new_function()
seems really easy to patch: if you'd like to have virtual functions at
__new__ time (with the proposed implementation), I guess I could provide
a patch for it.

>> would it be
>> possible to hide CRoot from the module dict?
> 
> You can do something like this:
> 
>   # foo.pyx
> 
>   cdef class C:
>     ...
>     
>   import foo
>   del foo.C
> 
> However, keep in mind that Python code can always
> get at a class given an instance of that class,
> so at best this reduces the chance of accidental
> misuse.

Nice trick :) That is enough to hide implementation details that would
otherwise clutter the module interface.
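
For reference, the effect of the trick and the caveat Greg mentions can
be seen in plain Python. This is a sketch only: the module object is
built by hand with types.ModuleType instead of importing a compiled
extension, and the make factory is a hypothetical public entry point.

```python
import types

# Hypothetical stand-in for the compiled extension module "foo".
foo = types.ModuleType("foo")


class C(object):
    pass


foo.C = C
foo.make = lambda: C()   # hypothetical factory the module keeps public

del foo.C                # hide the class from the module namespace

hidden = not hasattr(foo, "C")
recovered = type(foo.make())   # the class is still reachable via an instance
print("hidden:", hidden)
print("recovered is C:", recovered is C)
```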

Daniele



