I was graphing letter frequency in some large academic documents. As part of this process, I was sorting the letters from large clippings of those documents into alphabetical order. I was using Python's built-in sorted function, and I started to wonder if I could make it faster. I then wrote the following function:
def count_sort(l):
    items = {'a':0,'b':0,'c':0,'d':0,'e':0,'f':0,'g':0,'h':0,'i':0,'j':0,'k':0,'l':0,'m':0,
             'n':0,'o':0,'p':0,'q':0,'r':0,'s':0,'t':0,'u':0,'v':0,'w':0,'x':0,'y':0,'z':0}
    for item in l:
        items[item] += 1
    sort_l = []
    for key in items:
        sort_l += key*items[key]
    return sort_l
When I tested this code against sorted on a 10,000-letter string of text, it was almost 20 times faster.
With such a performance boost, why isn't this sorting algorithm in the standard library?
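For anyone who wants to reproduce the comparison, here is a rough sketch of how the timing could be set up with timeit. It is not the exact test I ran: make_sample and compare are hypothetical helper names, and the random lowercase string is an assumption standing in for the real document text. The count_sort here is the same counting approach as above, rewritten only so the block runs on its own.

```python
import random
import string
import timeit


def count_sort(letters):
    # Same counting approach as the function above. It relies on dicts
    # keeping insertion order, which is guaranteed from Python 3.7 on.
    counts = dict.fromkeys(string.ascii_lowercase, 0)
    for letter in letters:
        counts[letter] += 1  # raises KeyError for anything outside a-z
    result = []
    for letter, n in counts.items():
        result += letter * n  # extends the list with n copies of letter
    return result


def make_sample(length=10_000, seed=0):
    # Hypothetical stand-in for a clipping of real text:
    # uniformly random lowercase letters.
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def compare(text, repeats=100):
    # Total seconds for `repeats` runs of each approach on the same input.
    t_sorted = timeit.timeit(lambda: sorted(text), number=repeats)
    t_count = timeit.timeit(lambda: count_sort(text), number=repeats)
    return t_sorted, t_count


if __name__ == "__main__":
    sample = make_sample()
    # Both approaches should produce the same list of letters.
    assert count_sort(sample) == sorted(sample)
    t_sorted, t_count = compare(sample)
    print(f"sorted:     {t_sorted:.4f} s")
    print(f"count_sort: {t_count:.4f} s")
```

The measured ratio will depend on the machine, the Python version, and the input, so the numbers this prints should not be expected to match the figure quoted above.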