[Pyrex] Speeding up custom string lowercasing with Pyrex
Stefan Behnel
stefan_ml at behnel.de
Wed Oct 31 14:25:30 CET 2007
Alexy Khrabrov wrote:
> Greetings -- I'm counting Russian words, encoded in Cyrillics, and
> found that lowercasing them in Python slows down the program 5-10
> times.
> ngram = ' '.join(words[i:(i+self.opts.n)]) # mashed
> if self.opts.lower:
>     # x = ngram.lower() # TODO does nothin'!
Are you using a unicode string or a byte string in some 8-bit encoding?
Unicode objects know how to lower-case non-ASCII characters, whereas plain
8-bit strings generally do not, which would explain why lower() appears to
do nothing.
>>> print "ÄÜÖ".lower()
ÄÜÖ
>>> print u"ÄÜÖ".lower()
äüö
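
A minimal sketch of the decode-then-lower approach, assuming the words arrive
as UTF-8 encoded byte strings. The encoding and the sample word are
assumptions for illustration, not taken from the original program, and the
b"..." literal needs Python 2.6 or later:

```python
# Assumption: the input is a UTF-8 encoded byte string.
# These bytes spell the Russian word "ПРИВЕТ" in upper case.
raw = b"\xd0\x9f\xd0\xa0\xd0\x98\xd0\x92\xd0\x95\xd0\xa2"

# Lower-casing the byte string only affects ASCII letters,
# so the Cyrillic bytes come back unchanged.
unchanged = raw.lower()

# Decode to a unicode string first, then lower-case it.
lowered = raw.decode("utf-8").lower()

# Encode again only if the rest of the program expects bytes.
lowered_bytes = lowered.encode("utf-8")
```

Decoding once when the text is read in, and working with unicode objects from
there on, avoids repeating the conversion for every n-gram.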
Using Pyrex here is not a good idea, IMHO.
Stefan