[Pyrex] Speeding up custom string lowercasing with Pyrex

Yannick Versley versley at sfs.uni-tuebingen.de
Wed Oct 31 14:29:18 CET 2007


Hi,

In your place, I'd try using string.translate:

def rulower_ord(n):
	# Russian caps: 0xC0-0xDF, plus 0xA8
	if 0xC0 <= n < 0xE0 or n == 0xA8:
		return chr(n + 32)
	else:
		return chr(n)

# 256-character translation table, built once at import time
rulower_map = ''.join([rulower_ord(n) for n in xrange(256)])

def rulower(s):
	return s.translate(rulower_map)
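
For readers on Python 3, where xrange is gone and translate on byte strings takes a table built with bytes.maketrans, the same idea might be sketched as follows. This is only a sketch under stated assumptions: it assumes the input is Windows-1251 (CP1251) encoded bytes, which is what the 0xC0-0xDF range suggests, and the names RU_UPPER, RU_LOWER and RULOWER_TABLE are made up for illustration. It also differs from the function above in one place: in CP1251 the lowercase counterpart of 0xA8 is 0xB8 (an offset of 16, not 32), so that byte is mapped explicitly.

```python
# Python 3 sketch; assumes Windows-1251 (CP1251) encoded bytes as input.
# RU_UPPER, RU_LOWER and RULOWER_TABLE are illustrative names.

# Capital letters 0xC0-0xDF, followed by 0xA8 (capital IO).
RU_UPPER = bytes(range(0xC0, 0xE0)) + b"\xa8"
# Small letters 0xE0-0xFF, followed by 0xB8 (small io).
RU_LOWER = bytes(range(0xE0, 0x100)) + b"\xb8"

# 256-byte translation table, built once at import time.
RULOWER_TABLE = bytes.maketrans(RU_UPPER, RU_LOWER)


def rulower(data: bytes) -> bytes:
    """Lowercase Russian capitals; every other byte is left unchanged."""
    return data.translate(RULOWER_TABLE)
```

As in the original, ASCII letters are deliberately left alone; only the Russian capitals are remapped.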

Admittedly this has nothing to do with Pyrex, but translate does the whole
mapping in a single pass in C, so it is probably a lot faster than building
the result character by character in Python.  If you need something more
flexible, a reasonable starting point would be cStringIO or something like
that, since the "to += c" step unnecessarily creates lots of temporary
string objects.
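
To illustrate the buffer-based alternative, here is a minimal Python 3 sketch. cStringIO does not exist in Python 3, so io.BytesIO is used as its closest counterpart. The function name rulower_buffered is hypothetical, and the input is again assumed to be Windows-1251 (CP1251) bytes, with 0xA8 mapped to 0xB8 as that code page defines it.

```python
import io


def rulower_buffered(data: bytes) -> bytes:
    """Per-byte lowercasing that writes into a buffer instead of using +=.

    Illustrative sketch; assumes Windows-1251 (CP1251) encoded input.
    """
    out = io.BytesIO()
    for n in data:  # iterating over bytes yields ints in Python 3
        if 0xC0 <= n < 0xE0:
            n += 32       # Russian capitals -> small letters
        elif n == 0xA8:
            n = 0xB8      # capital IO -> small io in CP1251
        out.write(bytes((n,)))
    return out.getvalue()
```

The loop body is where arbitrary per-character logic can go, which is the flexibility a fixed translation table does not offer. For a plain one-to-one byte mapping, translate remains the simpler choice.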

Best,
Yannick
> The pure Python lowercase function looked like,
>
> def rulower(s): # pure Python
> 	to = ""
> 	for c in s:
> 		n = ord(c)
> 		# Russian Caps: 0xc0-0xdf
> 		if n in xrange(0xC0, 0xE0) or n == 0xA8:
> 			n += 32
> 			c = chr(n)
> 		to += c
> 	return to
>
> -- execution time was 15 minutes.

-- 
Yannick Versley
Seminar für Sprachwissenschaft, Abt. Computerlinguistik
Wilhelmstr. 19, 72074 Tübingen
Tel.: (07071) 29 77352


