[Pyrex] experiences with numpy array import/export in pyrex

Wed May 3 21:28:27 CEST 2006

Hi,

> I was curious if there are people who have tried numpy/pyrex in such a way
> that they could dynamically export arrays into python.

> What I had in mind was a pyrex function which takes a number of numpy arrays
> as input (images of a video sequence). Then manipulate the array data (fast)
> by use of a pointer to the raw array data. And output a number of newly
> allocated arrays (filled and allocated in C of course, but returning python
> numpy objects).

Not with video sequences, but otherwise exactly like this.

> This means creating new numpy arrays and associated buffers by use of malloc
> and then exporting these into numpy python arrays.. quite difficult...

Fortunately, there are simpler ways to access numpy array data and to
expose your own data.

1. Use the Python buffer interface (http://docs.python.org/api/abstract-buffer.html),
   but I don't think this is what you are looking for, so go to point 2.

2. Declare ndarray as an external extension type and take advantage of
   the Pyrex language. This is the simplest and most common way of
   directly reading and writing numpy arrays (not only those given as
   input, but also those created in your code). Here is a standard example:

# compile as normal Pyrex module
# assuming a recent numpy (>=0.9.6)
cdef extern from "numpy/arrayobject.h":
    ctypedef int intp

    ctypedef extern class numpy.dtype [object PyArray_Descr]:
        cdef int type_num, elsize, alignment
        cdef char type, kind, byteorder, hasobject
        cdef object fields, typeobj

    ctypedef extern class numpy.ndarray [object PyArrayObject]:
        cdef char *data
        cdef int nd
        cdef intp *dimensions
        cdef intp *strides
        cdef object base
        cdef dtype descr
        cdef int flags
    void import_array()

import numpy
import_array()

def fun(ndarray input_array):
    ## do some basic checking
    if input_array.descr.type != C"d":
        raise TypeError("Double array required")
    if input_array.nd != 2:
        raise ValueError("2 dimensional array required")
    ## also check input_array.flags here

    cdef int x, y
    cdef double *input_data, *output_data
    cdef ndarray output_array

    x = input_array.dimensions[0]
    y = input_array.dimensions[1]
    input_data = <double*> input_array.data

    ## create some output array
    ## this can be done in Python or with the numpy C-API (look for PyArray_SimpleNew)
    output_array = numpy.zeros((10,10), 'd')
    output_data = <double*> output_array.data

    ## do the stuff

    return output_array

## end

Here, the numpy array is created with a standard numpy function. There are
also dedicated functions in the C-API for creating array objects.
The one thing you probably shouldn't do is mix array objects with custom
malloc'ed data. I don't.

3. Use the array interface (my favourite) to directly access the data of
   numpy/Numeric/... arrays and/or return data wrapped in Python extensions.
   This approach is very flexible because your code and other people's
   code can work with the data of any object that exposes this interface.

   There are some nice examples of this in the scipy cookbook "Using
   NumPy With Other Languages" section at http://scipy.org/Wiki/Cookbook/,
   especially
   http://scipy.org/Wiki/Cookbook/A_Numerical_Agnostic_Pyrex_Class and
   http://scipy.org/Wiki/Cookbook/ArrayStruct_and_Pyrex.

   For the specification, see http://numeric.scipy.org/array_interface.html.
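
As a rough illustration of the consumer side of point 3, here is a minimal
pure-Python sketch. It assumes the dictionary form of the protocol
(`__array_interface__`, version 3) rather than the `__array_struct__` form
used in the cookbook pages. `Exporter` and `total` are hypothetical names
made up for this example, and ctypes only stands in for the pointer access
that Pyrex code would do directly.

```python
import ctypes
import sys

# typestr for a native-endian 8-byte float, e.g. "<f8" on little-endian machines
NATIVE_DOUBLE = ("<" if sys.byteorder == "little" else ">") + "f8"


class Exporter:
    """Hypothetical stand-in for any object that exposes the interface."""

    def __init__(self, values):
        # a ctypes array plays the role of the raw C buffer
        self._buf = (ctypes.c_double * len(values))(*values)
        self.__array_interface__ = {
            "version": 3,
            "shape": (len(values),),
            "typestr": NATIVE_DOUBLE,
            "data": (ctypes.addressof(self._buf), False),  # (address, read-only flag)
        }


def total(obj):
    """Sum the doubles of any object exposing __array_interface__."""
    iface = obj.__array_interface__
    if iface["typestr"] != NATIVE_DOUBLE:
        raise TypeError("native-endian double array required")
    if iface.get("strides") is not None:
        raise ValueError("contiguous data required")
    count = 1
    for n in iface["shape"]:
        count *= n
    address, _readonly = iface["data"]
    # view the exporter's memory in place, without copying
    view = (ctypes.c_double * count).from_address(address)
    return sum(view)
```

The point is that `total` never asks what type `obj` is. It only reads the
interface dictionary, so a numpy array of doubles should work with it just
as well as the stand-in class.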

To sum up: if you are looking for a way to expose custom malloc'ed
data, use the array interface to wrap it. If you only want to allocate
a numpy array and modify its contents, use numpy functions to create
the arrays and one of the methods above (2. or 3.) to access them.
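
To make the first recommendation a bit more concrete, here is a hedged
pure-Python sketch of wrapping a block that was allocated outside numpy.
`MallocedDoubles` is a hypothetical class. `PyMem_Malloc` and `PyMem_Free`,
reached through ctypes, stand in for the malloc/free calls that a Pyrex
extension type would make when it is created and deallocated. The
dictionary form of the array interface (version 3) is assumed.

```python
import ctypes
import sys

# allocator pair standing in for malloc/free in a real extension type
_malloc = ctypes.pythonapi.PyMem_Malloc
_malloc.restype = ctypes.c_void_p
_malloc.argtypes = [ctypes.c_size_t]
_free = ctypes.pythonapi.PyMem_Free
_free.restype = None
_free.argtypes = [ctypes.c_void_p]


class MallocedDoubles:
    """Hypothetical owner of a custom-allocated block of doubles."""

    def __init__(self, rows, cols):
        self._address = None
        self.shape = (rows, cols)
        nbytes = rows * cols * ctypes.sizeof(ctypes.c_double)
        self._address = _malloc(nbytes)
        if not self._address:
            raise MemoryError("allocation failed")
        ctypes.memset(self._address, 0, nbytes)  # all-zero bits are 0.0

    @property
    def __array_interface__(self):
        return {
            "version": 3,
            "shape": self.shape,
            "typestr": ("<" if sys.byteorder == "little" else ">") + "f8",
            "data": (self._address, False),  # (address, read-only flag)
        }

    def __del__(self):
        # the wrapper owns the block, so it must outlive every consumer
        if self._address:
            _free(self._address)
            self._address = None


block = MallocedDoubles(2, 3)
address = block.__array_interface__["data"][0]
cells = (ctypes.c_double * 6).from_address(address)
cells[4] = 1.5  # written straight into the wrapped memory
```

With numpy installed, `numpy.asarray(block)` should give an array that
shares this memory instead of copying it. Whoever consumes the data has to
keep a reference to the wrapper for as long as the data is in use, because
the block is freed when the wrapper goes away.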

Filip