[Pyrex] any ideas to speed this up?
Brian Blais
bblais at bryant.edu
Thu Feb 23 03:20:23 CET 2006
Hello,
I am a bit new to python and pyrex, and was wondering if anyone can help me here. I
am trying to translate some Matlab/mex code to Python, for doing neural simulations.
This application is definitely computing-time limited, and I need to optimize at
least one inner loop of the code, or perhaps even rethink the algorithm. I've coded
it into Pyrex, because it seems straightforward and efficient. In doing so, I have
made sure to
1) take any python objects, and extract C datatypes from them outside of the main
loop, like:
cdef double eta
eta=params['eta']
2) all of the variables in the loop are C datatypes, either ints or pointers.
Is there something else that I could do to make it speed up?
For my algorithm, the procedure is very simple, after initializing any variables:
1) select a random input vector, which I will call "x". right now I have it as an
array, and I choose columns from that array randomly. in other cases, I may need to
take an image, select a patch, and then make that a column vector.
2) calculate an output value, which is the dot product of the "x" and a weight
vector, "w", so
y=dot(x,w)
3) modify the weight vector based on a matrix equation, like:
w=w+ eta * (y*x - y**2*w)
^
|
+---- learning rate constant
4) repeat steps 1-3 many times
I've organized it like:
for e in 100: # outer loop
for i in 1000: # inner loop
(steps 1-3)
display things.
so that the bulk of the computation is in the inner loop, and is amenable to
converting to a faster language. This is my issue:
straight python, in the example posted below for 250000 inner-loop steps, takes 20
seconds for each outer-loop step. I tried Pyrex, which should work very fast on such
a problem, takes about 8.5 seconds per outer-loop step. The same code as a C-mex
file in matlab takes 1.5 seconds per outer-loop step.
Given the huge difference between the Pyrex and the Mex, I feel that there is
something I am doing wrong, because the C-code for both should run comparably.
Perhaps the approach is wrong? I'm willing to take any suggestions!
The code is below.
thanks,
Brian Blais
#--------------------------------
# main.py
from dohebb import *
import pylab as p
from Numeric import *
from RandomArray import *
import time
import numpy
x=random((100,1000)) # 1000 input vectors
numpats=x.shape[0]
w=random((numpats,1));
th=random((1,1))
params={}
params['eta']=0.001;
params['tau']=100.0;
old_mx=0;
for e in range(100):
rnd=randint(0,numpats,250000)
t1=time.time()
if 0: # straight python
for i in range(len(rnd)):
pat=rnd[i]
xx=reshape(x[:,pat],(1,-1))
y=matrixmultiply(xx,w)
w+=params['eta']*(y*transpose(xx)-y**2*w);
th=th+(1.0/params['tau'])*(y**2-th);
elif 1: # pyrex with Numeric
dohebb(params,w,th,x,rnd)
print time.time()-t1
p.plot(w,'o-')
p.xlabel('weights')
p.show()
#----------------------------------------------------
# dohebb.pyx
cdef extern from "Numeric/arrayobject.h":
struct PyArray_Descr:
int type_num, elsize
char type
ctypedef class Numeric.ArrayType [object PyArrayObject]:
cdef char *data
cdef int nd
cdef int *dimensions, *strides
cdef object base
cdef PyArray_Descr *descr
cdef int flags
def dohebb(params,ArrayType w,ArrayType th,ArrayType X,ArrayType rnd):
cdef int num_iterations
cdef int num_inputs
cdef int offset
cdef double *wp,*xp,*thp
cdef int *rndp
cdef double eta,tau
eta=params['eta'] # learning rate
tau=params['tau'] # used for variance estimate
cdef double y
num_iterations=rnd.dimensions[0]
num_inputs=w.dimensions[0]
# get the pointers
wp=<double *>w.data
xp=<double *>X.data
rndp=<int *>rnd.data
thp=<double *>th.data
for it from 0 <= it < num_iterations:
offset=rndp[it]*num_inputs
# calculate the output
y=0.0
for i from 0 <= i < num_inputs:
y=y+wp[i]*xp[i+offset]
# change in the weights
for i from 0 <= i < num_inputs:
wp[i]=wp[i]+eta*(y*xp[i+offset] - y*y*wp[i])
# estimate the variance
thp[0]=thp[0]+(1.0/tau)*(y**2-thp[0])
--
-----------------
bblais at bryant.edu
http://web.bryant.edu/~bblais
More information about the Pyrex
mailing list