[Haskell-beginners] Processing a list of files the Haskell way

Yawar Amin yawar.amin at gmail.com
Sun Mar 18 18:43:50 CET 2012


Hi Michael,

Michael Schober <Micha-Schober <at> web.de> writes:

> [...]
> I took the liberty to modify the output a little bit to my needs - maybe 
> a future reader will find it helpful, too. It's attached below.

I kind of played around with your example a little bit and wondered if it
could be implemented in terms of just the basic Haskell Platform
modules and functions. So as an exercise I rolled my own directory
traversal and duplicate finder functions. This is what I came up with:

- walkDirWith: walks a given directory with a given function that takes a
Handle to any (unknown type) value, and returns association lists of
paths and the unknown type values.

- filePathMap: I think roughly analogous to your duplicates function.

- main: In the third line of the main function, I use hFileSize as an
example of a function that takes a Handle to an IO value, in this case IO
Integer. A hash function could easily be put in here. The last line
pretty-prints the Map in a tree-like format.

import System.IO
import System.Environment (getArgs)
import System.Directory ( doesDirectoryExist
                        , getDirectoryContents)
import Control.Monad (mapM)
import Control.Applicative ((<$>))
import System.FilePath ((</>))
import qualified Data.Map as M

walkDirWith :: FilePath -> (Handle -> IO r) -> IO [(r, FilePath)] ->
               IO [(r, FilePath)]
walkDirWith path f walkList = do
  isDir <- doesDirectoryExist path
  if isDir
    then do
      paths <- getDirectoryContents path
      concat <$> mapM (\p -> walkDirWith (path </> p) f walkList)
                      [p | p <- paths, p /= ".", p /= ".."]
    else do
      rValue <- withFile path ReadMode f
      ((:) (rValue, path)) <$> walkList

filePathMap :: Ord r => [(r, FilePath)] -> M.Map r [FilePath]
filePathMap pathPairs =
  foldl (\theMap (r, path) -> M.insertWith' (++) r [path] theMap)
        M.empty
        pathPairs

main :: IO ()
main = do
  [dir] <- getArgs
  fileSizes <- walkDirWith dir hFileSize $ return []
  putStr . M.showTree $ filePathMap fileSizes

Obviously there's no right or wrong way to do it, but I'm wondering
what you think.

Regards,

Yawar





More information about the Beginners mailing list