[Pyrex] Question, plus suggestions for improvement?

Giovanni Bajo rasky at develer.com
Tue Aug 1 23:31:54 UTC 2006


Josiah Carlson wrote:

>> Depends. What if you have an unbounded size, and it's often going to
>> be larger than Python standard pools so PyMem_Malloc won't help?
>
> PyMem_Malloc isn't any better than malloc.

Yeah, maybe I got the API wrong. What I meant is that if you have an
unbounded size your *only* way to go is to use malloc(). You can't even try
to somehow reuse Python pool-based allocator which is faster.


> What I meant by 'stack
> allocation', and "a maximum fixed size" is something like...
>
> int fcn(...) {
>     int stack_memory[maximum_fixed_size];
>     ...
> }

Yes, this is why I said "what if you have unbounded size". I'm not sure of
what algorithms you are thinking of, but the ones I think can't really have
some hardcoded limit. One strategy is to get the best of the two worlds: if
the memory is smaller than a certain amount, use the stack, otherwise go
through malloc/free.

>> It also depends on how many times you access the list. If you do it
>> many times, it's much better to pay for the penalty of malloc once,
>> rather than going through PyList_GetItem + PyFloat_AsFloat thousands
>> of times.
>
> I'm not familliar with the algorithm that the OP has chosen to
> implement for permutations, but I wasn't speaking directly to the
> OP's particular application with regards to performance.  I was
> stating that generally malloc/free tends to have some performance
> gotchas that most people
> don't realize until they stop using it.

I agree that, generically speaking, incrementing a register (the stack
pointer) is faster than going through a complex library routine.
Nonetheless, in this specific context (and in that of converting arbitrary
Python structure to C sturctures in Pyrex for speed reasons), I believe that
malloc() is the only possible compromise, with some possible cheating like
proposed above.

I also would like to reiterate that the overhead of malloc, even if larger
than a stack-based allocation, can still be in the noise, compared with the
whole execution of the algorithm. Before even implementing the simple
fallback "use the stack if size < N", I would suggest some profiling.
-- 
Giovanni Bajo




More information about the Pyrex mailing list