[Haskell-beginners] How would you improve this program?
Lorenzo Bolla
lbolla at gmail.com
Mon Oct 10 10:13:45 CEST 2011
Hi,
On Sun, Oct 9, 2011 at 11:08 PM, Daniel Fischer <
daniel.is.fischer at googlemail.com> wrote:
>
> A few problems:
> 1) your input cleaning is not correct, running it over the source file, I
> get (among others)
>
Thanks, I'll fix it asap.
> 2) If you have a looong but rare word in the text, you will get too many
> spaces between the words and the markers, that's not good.
>
Thanks.
> 3) you calculate the number of '#'s using the total line length and then
> subtract the length of (word + spaces), skewing your output. You should
> calculate (spaceForHashes = lineWidth - maxLength - 1) and the use that to
> determine the number of hashes.
> [Although it should be more complicated really, maxLength should only
> include those words that are actually output.]
>
Fixed, thanks.
Style is okay, although like Michael Xavier, I'm more of a where-guy.
>
A part from style, is there any reason why one should prefer "where" over
"let..in"?
> You build the map via fromListWith (+). That's bad if you have input with
> large counts. fromListWith (+) gives you thunks of the form
> (...(1+1)+1)...+1) as map-entries. That costs a lot of space and time.
> Better build the map using insertWith',
>
> mp = foldl' ins empty words
> where
> ins m w = insertWith' (+) w 1 m
>
Please find the "strict" version here:
https://github.com/lbolla/stanford-cs240h/blob/master/lab1/lab1.hs.
Plot of times is now:
https://github.com/lbolla/stanford-cs240h/blob/master/lab1/times.png
Python is still slightly faster, but I'm getting there.
> 3. a Python version using defaultdict.
> sort is O(n*log n), map and python are O(n*log d), where n is the number of
> words and d the number of distinct words.
I believe Python's defaultdict have (optimistically) O(1) lookup and insert,
because it is implemented as a hashmap.
Thanks everybody for your tips. They are most welcome.
L.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.haskell.org/pipermail/beginners/attachments/20111010/73b0cdb3/attachment.htm>
More information about the Beginners
mailing list