[Haskell-beginners] Processing a list of files the Haskell way

Lorenzo Bolla lbolla at gmail.com
Wed Mar 21 18:13:39 CET 2012


What about something as simple as this?


import           Control.Monad (forM)
import           System.Directory (doesDirectoryExist, getDirectoryContents)
import           System.FilePath ((</>))
import qualified Data.ByteString as B
import           Data.Digest.OpenSSL.MD5 (md5sum)
import qualified Data.Map as M

getRecursiveContents :: FilePath -> IO [FilePath]
getRecursiveContents topdir = do
        names <- getDirectoryContents topdir
        let properNames = filter (`notElem` [".", ".."]) names
        paths <- forM properNames $ \name -> do
                let path = topdir </> name
                isDirectory <- doesDirectoryExist path
                if isDirectory
                        then getRecursiveContents path
                        else return [path]
        return (concat paths)

getMD5 :: FilePath -> IO String
getMD5 file = md5sum `fmap` B.readFile file

main :: IO ()
main = do
        files <- getRecursiveContents "."
        md5s <- mapM getMD5 files
        let m = M.fromListWith (++) $ zip md5s [[f] | f <- files]
        putStrLn $ M.showTree m


The biggest part is "getRecursiveContents", shamelessly stolen from RWH
(Real World Haskell).
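
To list only the duplicates, the map built in main can be filtered down to
the digests shared by more than one file. A minimal sketch, assuming only
Data.Map from containers; the names groupByKey and duplicatesOnly are made up
for illustration and do not come from RWH or any library:

```haskell
import qualified Data.Map as M

-- | Group file paths by a key (for example an MD5 digest), using the same
-- fromListWith (++) idea as in main. (Hypothetical helper name.)
groupByKey :: Ord k => [(k, FilePath)] -> M.Map k [FilePath]
groupByKey pairs = M.fromListWith (++) [(k, [f]) | (k, f) <- pairs]

-- | Keep only the keys that are shared by at least two files.
-- (Hypothetical helper name.)
duplicatesOnly :: M.Map k [FilePath] -> M.Map k [FilePath]
duplicatesOnly = M.filter (\fs -> length fs > 1)
```

In main, something like "putStrLn . M.showTree . duplicatesOnly $ m" should
then print only the digests that occur more than once.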

L.




On Sun, Mar 18, 2012 at 5:43 PM, Yawar Amin <yawar.amin at gmail.com> wrote:

> Hi Michael,
>
> Michael Schober <Micha-Schober <at> web.de> writes:
>
> > [...]
> > I took the liberty to modify the output a little bit to my needs - maybe
> > a future reader will find it helpful, too. It's attached below.
>
> I kind of played around with your example a little bit and wondered if it
> could be implemented in terms of just the basic Haskell Platform
> modules and functions. So as an exercise I rolled my own directory
> traversal and duplicate finder functions. This is what I came up with:
>
> - walkDirWith: walks a given directory, applying a given function to a
> Handle for each file. The function may return a value of any type r; the
> result is an association list of (value, path) pairs.
>
> - filePathMap: I think roughly analogous to your duplicates function.
>
> - main: In the third line of the main function, I use hFileSize as an
> example of a function that takes a Handle and returns an IO value, in this
> case IO Integer. A hash function could easily be put in here. The last line
> pretty-prints the Map in a tree-like format.
>
> import System.IO
> import System.Environment (getArgs)
> import System.Directory ( doesDirectoryExist
>                        , getDirectoryContents)
> import Control.Monad (mapM)
> import Control.Applicative ((<$>))
> import System.FilePath ((</>))
> import qualified Data.Map as M
>
> walkDirWith :: FilePath -> (Handle -> IO r) -> IO [(r, FilePath)] ->
>               IO [(r, FilePath)]
> walkDirWith path f walkList = do
>  isDir <- doesDirectoryExist path
>  if isDir
>    then do
>      paths <- getDirectoryContents path
>      concat <$> mapM (\p -> walkDirWith (path </> p) f walkList)
>                      [p | p <- paths, p /= ".", p /= ".."]
>    else do
>      rValue <- withFile path ReadMode f
>      ((:) (rValue, path)) <$> walkList
>
> filePathMap :: Ord r => [(r, FilePath)] -> M.Map r [FilePath]
> filePathMap pathPairs =
>  foldl (\theMap (r, path) -> M.insertWith (++) r [path] theMap)
>        M.empty
>        pathPairs
>
> main :: IO ()
> main = do
>  [dir] <- getArgs
>  fileSizes <- walkDirWith dir hFileSize $ return []
>  putStr . M.showTree $ filePathMap fileSizes
>
> Obviously there's no right or wrong way to do it, but I'm wondering
> what you think.
>
> Regards,
>
> Yawar