[Pyrex] array.array vs long ints for bit vectors
John Machin
sjmachin at lexicon.net
Sun Oct 30 20:56:19 CET 2005
Nitin Madnani wrote:
> Thanks for the detailed reply, John. Let me answer some of your
> questions and ask some more of mine.
>
>> For a start, what operations do you want to perform on this "bit vector"?
>
> The only operations I need are setting bits, getting the bits that are
> not set etc.
What does "getting the bits that are not set" mean? Is this a loose
description of x = avector[k], or something else? Have you contemplated
designing the API you think you will need?
>
>> "grow" and "needed" are a little contradictory. Do you want an *exact*
>> size, so that you can do this:
>>
>> zero = Bitvector(size=20, contents=[0])
>> all_ones = ~zero
>>
>> and know exactly how long the result will be?
>>
>> What makes you think a long int would grow unconscionably longer than
>> needed? How long is your longest map going to be? In any case, just
>> think int, which morphs seamlessly into long as and when required. A
>> 20-bit map would fit into an int.
>>
>> Another thing to worth contemplating: given the amount of going-over
>> by Tim Peters that the long int code has had, if long ints waste more
>> memory on leading zeroes than n bits [where n is < 32 and could be
>> determined by inspecting the code], then the end of the world is nigh.
>>
>
> My bit-vector sizes can vary from 10-bits to 60-bits with an average
> length of around 25-30 bits.
Any chance of using a 64-bit machine?
> So, would a long int be the best way then ?
Note Josiah Carlson's comment re immutability -- which probably isn't a
problem if your bit vectors will always fit in an int, but "always"
sometimes never happens. Design your API, implement a prototype, RTF
long int source code to see what happens with longint & (1 << k) ... you
did mention "research", didn't you?
> I was just worried that I would be wasting too much memory.
Stop worrying and do some rough calculations! How many bit vectors? How
much memory on your machine?
>
>> "This" wouldn't be doing the storing. *You* would be [mis]using a byte
>> array as a bit array.
>>
>
> Hmm. So, is there a correct way to use array.array for bit vectors?
Correct in what sense? [Rhetorical question]
Use shifts, ands, and ors to pack multiple bits into a storage container.
More information about the Pyrex
mailing list