[Pyrex] array.array vs long ints for bit vectors

John Machin sjmachin at lexicon.net
Sun Oct 30 20:56:19 CET 2005


Nitin Madnani wrote:
> Thanks for the detailed reply, John. Let me answer some of your 
> questions and ask some more of mine. 
> 
>> For a start, what operations do you want to perform on this "bit vector"?
> 
> The only operations I need are setting bits, getting the bits that are 
> not set etc.

What does "getting the bits that are not set" mean? Is this a loose 
description of x = avector[k], or something else? Have you contemplated 
designing the API you think you will need?

> 
>> "grow" and "needed" are a little contradictory. Do you want an *exact* 
>> size, so that you can do this:
>>
>> zero = Bitvector(size=20, contents=[0])
>> all_ones = ~zero
>>
>> and know exactly how long the result will be?
>>
>> What makes you think a long int would grow unconscionably longer than 
>> needed? How long is your longest map going to be? In any case, just 
>> think int, which morphs seamlessly into long as and when required. A 
>> 20-bit map would fit into an int.
>>
>> Another thing to worth contemplating: given the amount of going-over 
>> by Tim Peters that the long int code has had, if long ints waste more 
>> memory on leading zeroes than n bits [where n is < 32 and could be 
>> determined by inspecting the code], then the end of the world is nigh.
>>
> 
> My bit-vector sizes can vary from 10-bits to 60-bits with an average 
> length of around 25-30 bits.

Any chance of using a 64-bit machine?

> So, would a long int be the best way then ? 

Note Josiah Carlson's comment re immutability -- which probably isn't a 
problem if your bit vectors will always fit in an int, but "always" 
sometimes never happens. Design your API, implement a prototype, RTF 
long int source code to see what happens with longint & (1 << k) ... you 
did mention "research", didn't you?

> I was just worried that I would be wasting too much memory. 

Stop worrying and do some rough calculations! How many bit vectors? How 
much memory on your machine?

> 
>> "This" wouldn't be doing the storing. *You* would be [mis]using a byte 
>> array as a bit array.
>>
> 
> Hmm. So, is there a correct way to use array.array for bit vectors?

Correct in what sense? [Rhetorical question]

Use shifts, ands, and ors to pack multiple bits into a storage container.





More information about the Pyrex mailing list