[Pyrex] Speeding up custom string lowercasing with Pyrex
Yannick Versley
versley at sfs.uni-tuebingen.de
Wed Oct 31 14:29:18 CET 2007
Hi,
In your place, I'd try using string.translate:
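def rulower_ord(n):
    # Assumes cp1251 (Windows-1251) input.
    # Russian caps: 0xC0-0xDF, lowercase letters sit 32 higher
    if 0xC0 <= n < 0xE0:
        return chr(n + 32)
    elif n == 0xA8:
        # In cp1251 the lowercase of 0xA8 is 0xB8 (offset 16, not 32)
        return chr(0xB8)
    else:
        return chr(n)

# 256-character translation table, built once at import time
rulower_map = ''.join([rulower_ord(n) for n in xrange(256)])

def rulower(s):
    return s.translate(rulower_map)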
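
For anyone on Python 3, the same table lookup can be sketched roughly as
below. This is only an illustration under a few assumptions: the input is a
bytes object encoded as cp1251, and the names RULOWER_TABLE and rulower_bytes
are made up for the example. It uses bytes.maketrans to build the table and
maps 0xA8 to 0xB8, which is the lowercase pair cp1251 defines.

```python
# Python 3 sketch; assumes cp1251-encoded bytes as input.
# RULOWER_TABLE and rulower_bytes are illustrative names.
UPPER = bytes(range(0xC0, 0xE0)) + b"\xa8"   # capitals 0xC0-0xDF plus 0xA8
LOWER = bytes(range(0xE0, 0x100)) + b"\xb8"  # lowercase 0xE0-0xFF plus 0xB8

# Translation table built once; every byte not in UPPER maps to itself.
RULOWER_TABLE = bytes.maketrans(UPPER, LOWER)


def rulower_bytes(data):
    """Lowercase Russian capitals in cp1251 bytes, leaving other bytes alone."""
    return data.translate(RULOWER_TABLE)
```

The per-character work happens inside translate, so no Python-level loop runs
over the string.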
Admittedly this has nothing to do with Pyrex, but translate is probably a lot
faster than building the string one character at a time in Python. If you
need something more flexible, a starting point would probably be cStringIO or
something like that, since the "to += c" step unnecessarily creates lots of
temporary string objects.
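
A buffer-based variant might look like the sketch below. It is written for
Python 3, where io.BytesIO takes the place of cStringIO, and it again assumes
cp1251-encoded bytes; rulower_buffered is a hypothetical name. The per-byte
logic is ordinary Python, so it is easy to extend, but the output is written
into one buffer instead of being rebuilt with "+=" on every step.

```python
import io


def rulower_buffered(data):
    """Lowercase cp1251 Russian capitals, collecting output in one buffer."""
    out = io.BytesIO()
    for n in data:                 # iterating bytes yields ints in Python 3
        if 0xC0 <= n < 0xE0:       # capitals 0xC0-0xDF: lowercase is +32
            n += 32
        elif n == 0xA8:            # cp1251 pairs 0xA8 with 0xB8
            n = 0xB8
        out.write(bytes((n,)))     # append one byte to the buffer
    return out.getvalue()
```

This will still be much slower than translate, since the loop runs in the
interpreter; it is only worth it when the mapping cannot be expressed as a
fixed 256-entry table.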
Best,
Yannick
> The pure Python lowercase function looked like,
>
> def rulower(s): # pure Python
>     to = ""
>     for c in s:
>         n = ord(c)
>         # Russian Caps: 0xc0-0xdf
>         if n in xrange(0xC0, 0xE0) or n == 0xA8:
>             n += 32
>             c = chr(n)
>         to += c
>     return to
>
> -- execution time was 15 minutes.
--
Yannick Versley
Seminar für Sprachwissenschaft, Abt. Computerlinguistik
Wilhelmstr. 19, 72074 Tübingen
Tel.: (07071) 29 77352