[Haskell-beginners] Processing a list of files the Haskell way
Lorenzo Bolla
lbolla at gmail.com
Wed Mar 21 18:13:39 CET 2012
What about something as simple as this?
import Control.Monad (forM)
import System.Directory (doesDirectoryExist, getDirectoryContents)
import System.FilePath ((</>))
import qualified Data.ByteString as B
import Data.Digest.OpenSSL.MD5 (md5sum)
import qualified Data.Map as M

getRecursiveContents :: FilePath -> IO [FilePath]
getRecursiveContents topdir = do
  names <- getDirectoryContents topdir
  let properNames = filter (`notElem` [".", ".."]) names
  paths <- forM properNames $ \name -> do
    let path = topdir </> name
    isDirectory <- doesDirectoryExist path
    if isDirectory
      then getRecursiveContents path
      else return [path]
  return (concat paths)

getMD5 :: FilePath -> IO String
getMD5 file = md5sum `fmap` B.readFile file

main :: IO ()
main = do
  files <- getRecursiveContents "."
  md5s <- mapM getMD5 files
  let m = M.fromListWith (++) $ zip md5s [[f] | f <- files]
  putStrLn $ M.showTree m
The biggest part is "getRecursiveContents", shamelessly stolen from RWH (Real World Haskell).
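
If only the duplicates are of interest, the map built in main can be narrowed to the digests shared by more than one file. A minimal sketch, assuming the map has the shape produced above (digest mapped to a list of paths); the name duplicatesOnly is hypothetical and not part of the original program:

```haskell
import qualified Data.Map as M

-- Keep only the keys (e.g. MD5 digests) that are associated with
-- two or more entries (e.g. file paths).
-- NOTE: duplicatesOnly is an illustrative helper, not from the original post.
duplicatesOnly :: M.Map k [a] -> M.Map k [a]
duplicatesOnly = M.filter ((> 1) . length)
```

In main, one could then pass `duplicatesOnly m` to the printing step instead of `m`.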
L.
On Sun, Mar 18, 2012 at 5:43 PM, Yawar Amin <yawar.amin at gmail.com> wrote:
> Hi Michael,
>
> Michael Schober <Micha-Schober <at> web.de> writes:
>
> > [...]
> > I took the liberty of modifying the output a little bit to suit my needs -
> > maybe a future reader will find it helpful, too. It's attached below.
>
> I kind of played around with your example a little bit and wondered if it
> could be implemented in terms of just the basic Haskell Platform
> modules and functions. So as an exercise I rolled my own directory
> traversal and duplicate finder functions. This is what I came up with:
>
> - walkDirWith: walks a given directory, applying a given function of type
>   Handle -> IO r (for any result type r) to each file, and returns an
>   association list pairing each result with the file's path.
>
> - filePathMap: I think roughly analogous to your duplicates function.
>
> - main: In the third line of the main function, I use hFileSize as an
>   example of such a Handle function; here its result type is IO Integer.
>   A hash function could easily be put in here. The last line
>   pretty-prints the Map in a tree-like format.
>
> import System.IO
> import System.Environment (getArgs)
> import System.Directory ( doesDirectoryExist
>                         , getDirectoryContents)
> import Control.Monad (mapM)
> import Control.Applicative ((<$>))
> import System.FilePath ((</>))
> import qualified Data.Map as M
>
> walkDirWith :: FilePath -> (Handle -> IO r) -> IO [(r, FilePath)] ->
>                IO [(r, FilePath)]
> walkDirWith path f walkList = do
>   isDir <- doesDirectoryExist path
>   if isDir
>     then do
>       paths <- getDirectoryContents path
>       concat <$> mapM (\p -> walkDirWith (path </> p) f walkList)
>                       [p | p <- paths, p /= ".", p /= ".."]
>     else do
>       rValue <- withFile path ReadMode f
>       ((:) (rValue, path)) <$> walkList
>
> filePathMap :: Ord r => [(r, FilePath)] -> M.Map r [FilePath]
> filePathMap pathPairs =
>   foldl (\theMap (r, path) -> M.insertWith' (++) r [path] theMap)
>         M.empty
>         pathPairs
>
> main :: IO ()
> main = do
>   [dir] <- getArgs
>   fileSizes <- walkDirWith dir hFileSize $ return []
>   putStr . M.showTree $ filePathMap fileSizes
>
> Obviously there's no right or wrong way to do it, but I'm wondering
> what you think.
>
> Regards,
>
> Yawar
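
A note on the quoted filePathMap: it relies on M.insertWith', which, as far as I know, is no longer usable in recent releases of the containers package. A minimal sketch of the same grouping using Data.Map.Strict.insertWith and a strict left fold, assuming a reasonably current containers; the name filePathMapStrict is hypothetical:

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

-- Group paths by their associated value (a size, a hash, ...).
-- Each new path is prepended, so paths for a key appear in reverse
-- order of their position in the input list.
-- NOTE: filePathMapStrict is an illustrative name, not from the quoted post.
filePathMapStrict :: Ord r => [(r, FilePath)] -> M.Map r [FilePath]
filePathMapStrict =
  foldl' (\theMap (r, path) -> M.insertWith (++) r [path] theMap) M.empty
```

It takes the same association list that walkDirWith returns, so it should slot into the quoted main in place of filePathMap.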