[Haskell-beginners] Simplifying code
Daniel Fischer
daniel.is.fischer at web.de
Tue Feb 9 16:49:18 EST 2010
Am Dienstag 09 Februar 2010 21:11:55 schrieb edgar klerks:
> Hi All,
>
> I wrote a program which permutates a list words with some rules. For
> example if I feed it a list like:
>
> banana
> waterloo
> vraag
>
> It gives back the list:
>
> banana
> b at nana
> ban at na
> b at n@na
> banan@
> b at nan@
> ban at n@
> b at n@n@
> waterloo
> wa+erloo
> water|oo
> waterl0o
> etc
>
> However I have the feeling I am doing things to complicated. I am still
> a beginner. Would someone like to help me simplify somethings.
Sure. If you don't mind that the mutations come in a different order, one
thing that works wonders is "sequence",
sequence :: Monad m => [m a] -> m [a]
In particular, for m = [], sequence :: [[a]] -> [[a]]. Then, knowing what
sequence does, we can write
import Control.Monad (sequence)
generateAll :: String -> [String]
generateAll word = sequence (map f word)
where
f c = case lookup c leat of
Just r -> [c,r]
Nothing -> [c]
For each letter in the word, we generate the list of all possible
substitutions (map f),
"woot" ~> [['w'],['o','0'],['o','0'],['t','+']]
and then sequence them, choosing each combination of substitutions once.
A little more efficient than sequence is
generateAll :: String -> [String]
generateAll word = allCombs (map f word)
where
f c = case lookup c leat of
Just r -> [c,r]
Nothing -> [c]
allCombs :: [[a]] -> [[a]]
allCombs (l:ls) = [h:t | t <- allCombs ls, h <- l]
allCombs [] = [[]]
-- sequence (l:ls) = [h:t | h <- l, t <- sequence ls]
-- with the generators in reverse order, cf.
-- http://www.haskell.org/pipermail/haskell-cafe/2009-December/070149.html
> If you think this is inappropriate please state also.
How could it be? This list is for helping people understand Haskell better,
exactly what you're after.
> I am not offended then. I understand you are offering your
> spare time to help me.
If one thinks one's time isn't worth it, one can just ignore the post.
>
> The first thing I don't get is this. I recognize some things could be
> rewritten with a bind operator (because of the concat $ fmap), but I am
> puzzled how:
>
> mutateWords :: [Char] -> [[Char]]
> mutateWords word = nub.concat $ fmap snd <$> fmap unzip <$> ( foldr(\x z
> ->
>
> let char = snd x
> nm = number word
> lst = fst x
> in (insertAt char nm <$> lst) : z
> ) [[]] $ mw word )
>
>
>
>
> Here is the full code:
>
>
> import Data.List
> import System
> import System.IO
> import Control.Applicative
>
>
> ---CONFIG section
>
> leat = ['s' ==> '$', 't' ==> '+', 'l' ==> '|', 'o' ==> '0','e' ==> '3',
> 'a' ==> '@', 'v' ==> '^']
>
> leata = fst.unzip $ leat
leata = map fst leat
leatb = map snd leat
> leatb = snd.unzip $ leat
>
> -- Perl like assoc lists
> infixl 1 ==>
> a ==> b = (a, b)
>
>
> -- Flipped fmap sometimes nicer
> infixl 4 <$$>
>
> xs <$$> f = f <$> xs
>
>
> -- first I need to find the positions of the mutatable charachters.
No, you don't need to do that, it's in general more efficient to not care
about positions when dealing with lists.
> findPositions :: [Char] -> [[Int]]
> findPositions xs = take (length index) $ index <*> [xs]
> where index = elemIndices <$> leata
[f1, ..., fm] <*> [x1, ..., xn]
produces a list of length m*n, so
length (index <*> [xs]) == length index * length [xs] == length index
~> remove "take (length index) $"
>
> -- And generate all subsequences
> findSubSeq :: [Char] -> [[[Int]]]
> findSubSeq = fmap subsequences <$> findPositions
>
>
> -- Only change elements which needs to be changed
> insertAt :: Char -> [(Int, Char)] -> [Int] -> [(Int,Char)]
> insertAt c xs ps = xs <$$> (\x ->
> if (fst x) `elem` ps
> then (fst x , c)
> else x
> )
> -- Couples character to mutable positions
> mw word = (findSubSeq word) `zip` leatb
>
> number = zip [0..]
>
> mutateWords :: [Char] -> [[Char]]
> mutateWords word = nub.concat $ fmap snd <$> fmap unzip <$> ( foldr(\x z
> ->
>
> let char = snd x
> nm = number word
> lst = fst x
> in (insertAt char nm <$> lst) : z
> ) [[]] $ mw word )
Okay, I give up, that's too complicated :)
One general remark.
When you have an Ord instance, "nub" is an extremely bad idea (unless your
lists are really short), as it's quadratic in the length of the list.
map head . group . sort
or
import Data.Set
toList . fromList
are much better [O(l * log l) where l = length xs]
>
> generateAll :: [Char] -> [[Char]]
> generateAll word = g lea $ mutateWords word
> where g 0 words = words
> g n words = g (n - 1) (nub $ words >>= mutateWords )
> lea = length leata
> main = do
> filename <- getArgs
> wordlist <- readFile $ filename !! 0
> let a = (words wordlist) >>= generateAll
> mapM_ putStrLn a
More information about the Beginners
mailing list