[Pyrex] Pyrex v Psyco

Philip Smith philipasmith at blueyonder.co.uk
Sun Jun 18 22:59:01 UTC 2006


Hi

First of all thanks to Greg for Pyrex (which I have come to very recently and which will certainly make wrapping libraries easier)

I've been looking for an alternative to Psyco for distributing Python apps to other platforms than Win32 - however (while not being in general a 'speed' junkie) I don't want to sacrifice performance.

I've been trying Pyrex out on a small text search programme which uses the Boyer-Moore algorithm and which I have adapted to search for all occurrences of a string in the given text.

With algorithmic improvements I managed something like a 10 fold increase in speed performance and Psyco gives me another 10 fold increase.  I've read all the tips and tricks re: Pyrex I can find and tried to implement them but I can't seem to replicate the performance enhancement over Psyco which the Pyrex test suite demonstrates - in fact Psyco wins hands down every time.

Am I (as a newcomer) doing soething obviously wrong or is there something about this code that means one wouldn't expect particularly fast execution???

Thanks

Phil

This is my pyx file:

-----------------------------------

cdef extern from "Python.h":

    cdef object PySequence_GetItem(object o, int i)

    cdef int PyList_Append(object list, object item)

    #cdef object PySequence_GetSlice(object o, int i1, int i2)

    cdef int PyList_SET_ITEM(object PyList, int idx, object obj) except -1

    cdef void *PyList_GET_ITEM(object PyList, int idx) # except NULL

    cdef void Py_INCREF(void *)

    cdef int PySequence_Size(object o)

    ctypedef struct PyObject

cdef int setListItem(object PyList, int idx, object obj) except -1:

    Py_INCREF(<PyObject *>obj)

    PyList_SET_ITEM(PyList, idx, obj)

cdef object getListItem(object PyList, int idx):

    return <object>PyList_GET_ITEM(PyList, idx)

def findall(pattern, text): #adapted to find all occurences of pattern in text

    cdef int m,j

    cdef long n,k,i

    m=len(pattern) #once only

    n=len(text) #once only

    skip=[] #once only

    for k from 0<=k<256:

    #for k in range(256):

        PyList_Append(skip,m)

        #skip.append(m) #once only

    for k from 0<=k<m-1:

    #for k in range(m-1):

        setListItem(skip,ord(PySequence_GetItem(pattern,k)),m-k-1)

        #skip[ord(pattern[k])]=m-k-1 #once only

    result=[] #once only

    while True:

        if m>n: #if pattern is longer than (remaining) text

            return result #can't match more so return matches found

        k=m-1 #need to reset each time - k is offset into text

        while k<n:

            j=m-1 #j is offset to last char of pattern

            i=k #i offset into text where backward search commences

            while j >= 0 and PySequence_GetItem(text,i) == PySequence_GetItem(pattern,j):

            #while j>=0 and text[i]==pattern[j]: #keep matching chars backwards

                j=j-1

                i=i-1

            if j==-1: #whole pattern was matched

                PyList_Append(result,i+1)

                #result.append(i+1) #save the location of this match

            k=k+getListItem(skip,ord(PySequence_GetItem(text,k)))

            #k=k+skip[ord(text[k])]

        n=n-k #update n to number of chars remaining in text

    return result
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.copyleft.no/pipermail/pyrex/attachments/20060618/dea1d3fa/attachment.html 


More information about the Pyrex mailing list