[Haskell-cafe] [Newbie] What to improve in my code

Dougal Stanton dougal at dougalstanton.net
Mon Jul 19 06:21:39 EDT 2010


On Mon, Jul 19, 2010 at 9:24 AM, David Virebayre
<dav.vire+haskell at gmail.com> wrote:

> A minor point: instead of removing the punctuation, you maybe should
> convert it to whitespace.
>
> Otherwise in texts like "there was a quick,brown fox" (notice the
> missing space after the comma) you'll have the word "quickbrown"
> instead of 2 words "quick" and "brown".

If you remove punctuation you

- run the risk of joining two valid words into one invalid word:
  "quick,brown" -> "quickbrown"

- run the risk of converting one word into a different word:
  "can't" -> "cant"
  "won't" -> "wont"

If you split at punctuation you create more semi-words:
  "can't" -> "can", "t"
  "shouldn't" -> "shouldn", "t"

It might be better to treat in-word apostrophes as letters in this case.
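
A rough sketch of the three options, to make the trade-off concrete. The
function names are made up for illustration and are not taken from the
original poster's code. It assumes plain ASCII apostrophes and that splitting
on whitespace with `words` is acceptable.

```haskell
import Data.Char (isAlpha, isPunctuation)

-- Option 1 (hypothetical name): delete punctuation, then split on whitespace.
-- Joins "quick,brown" into one word and turns "can't" into "cant".
removePunct :: String -> [String]
removePunct = words . filter (not . isPunctuation)

-- Option 2 (hypothetical name): turn punctuation into spaces, then split.
-- Separates "quick,brown" correctly but breaks "can't" into "can" and "t".
splitPunct :: String -> [String]
splitPunct = words . map (\c -> if isPunctuation c then ' ' else c)

-- Option 3 (hypothetical name): like option 2, but an apostrophe with a
-- letter directly on both sides is kept as part of the word.
keepApostrophes :: String -> [String]
keepApostrophes = words . go ' '
  where
    -- 'prev' is the previous input character (a space at the start).
    go _ [] = []
    go prev (c:rest)
      | c == '\'' && isAlpha prev && startsWithLetter rest = c : go c rest
      | isPunctuation c = ' ' : go c rest
      | otherwise       = c : go c rest

    startsWithLetter (n:_) = isAlpha n
    startsWithLetter []    = False
```

With this sketch, `keepApostrophes "a quick,brown fox can't"` keeps "can't"
whole while still separating "quick" and "brown". Apostrophes used as quote
marks or at the end of a word (as in "dogs'") are still treated as
separators, which may or may not be what is wanted.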

-- 
Dougal Stanton
dougal at dougalstanton.net // http://www.dougalstanton.net

