[Haskell-cafe] Function to detect duplicates

Tue Feb 23 06:57:34 EST 2010

Hi Rafael,

I assume you will perform this operation on some very large lists, or
performance would not be an issue. Have you tested if your optimized
version is better than your initial one?

You should compare your implementation against something like this:

import qualified Data.Set as Set
noneRepeated :: (Ord a) => [a] -> Bool
noneRepeated = accum Set.empty where
  accum _ [] = True
  accum s (x:xs)
    | Set.member x s = False
    | otherwise      = accum (Set.insert x s) xs

Also there is some discussion about the nub function that relates to
this topic, e.g. http://buffered.io/2008/07/28/a-better-nub/.

/Jonas

On 23 February 2010 12:30, Rafael Gustavo da Cunha Pereira Pinto
<RafaelGCPP.Linux at gmail.com> wrote:
>
> Hi folks,
>
> While solving a puzzle, I was posed the problem of finding if there was no
> duplicates on a list.
>
> First I used:
>
> noneRepeated=null.(filter (>1)).(map length).group.sort
>
>
> But this seemed very unneficient, so I thought that I could detect the
> duplicates while sorting, and devised this:
>
> import Control.Monad
> import Data.Maybe
>
> noneRepeated=isNothing . (foldl merge (Just [])) . (map sort) . pairs
>
> pairs []=[]
> pairs [x]=[[x]]
> pairs (x:y:xs)=[x,y]:pairs xs
>
> sort []=Just []
> sort [x]=Just [x]
> sort [x,y] | x>y=Just [y,x]
>            | y>x=Just [x,y]
>            | x==y=Nothing
>
> merge::(Eq a, Ord a) => Maybe [a]->Maybe [a]->Maybe[a]
> merge _ Nothing = Nothing
> merge Nothing _ = Nothing
> merge (Just []) (Just xs)=Just xs
> merge (Just xs) (Just [])=Just xs
> merge (Just (x:xs)) (Just (y:ys)) | x==y = Nothing
>                                   | x>y  = (Just y) +? (merge (Just (x:xs))
> (Just ys))
>                                   | x<y  = (Just x) +? (merge (Just xs)
> (Just (y:ys)))
>
> (+?) = liftM2 (:)
>
>
>
> My version of the merge sort returns Nothing whenever it finds two equal
> entries, aborting all subsequent comparisons.
>
> I have a few questions for the friendly people at this cafe:
>
> 1) Is there any improvement I can make?
> 2) Can it be parallelized (par, pseq)?
>
>
> Best regards,
>
> Rafael
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>