[Haskell-cafe] How to optimize a directory scanning?

Viktor Dukhovni ietf-dane at dukhovni.org
Sun May 12 06:03:01 UTC 2019


On Sat, May 11, 2019 at 03:52:38AM +0200, Niklas Hambüchen wrote:

> we made the `posix-paths` package for fast directory traversals:
> 
>     https://hackage.haskell.org/package/posix-paths
> 
> You can find benchmarks in
> 
>     https://github.com/JohnLato/posix-paths#benchmarks

It should perhaps be noted that a large fraction of the additional
overhead encountered by the String FilePath traversals in that
benchmark occurs in the output code that prints all the paths to
stdout.  The corresponding ByteString listing is noticeably faster.

If one instead just stats and counts all the files, the performance
difference is somewhat more modest (IIRC around a factor of 2
rather than 5 or 6).

And the directory traversal of course needs to use 'getSymbolicLinkStatus'
rather than 'getFileStatus', since recursive directory traversals
should almost never follow symlinks.
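
To make the stat-and-count variant concrete, here is a minimal sketch
(not the benchmark's code).  The name 'countFiles' is made up for
illustration, and the sketch assumes the String FilePath APIs from the
'unix', 'directory' and 'filepath' packages.  It counts every
non-directory entry and never descends through a symlink, because
'isDirectory' is applied to the lstat-style result:

```haskell
module Main (main) where

import Control.Monad (foldM)
import System.Directory (listDirectory)
import System.Environment (getArgs)
import System.FilePath ((</>))
import System.Posix.Files (getSymbolicLinkStatus, isDirectory)

-- | Count the non-directory entries below a directory (hypothetical helper).
-- Uses getSymbolicLinkStatus (lstat), so a symlink that points at a
-- directory is counted as a single entry and is not followed.
countFiles :: FilePath -> IO Int
countFiles dir = do
  names <- listDirectory dir          -- excludes "." and ".."
  foldM step 0 names
  where
    step acc name = do
      let path = dir </> name
      st <- getSymbolicLinkStatus path
      if isDirectory st
        then do
          n <- countFiles path        -- recurse into real directories only
          return $! acc + n
        else return $! acc + 1        -- files, symlinks, devices, ...

main :: IO ()
main = do
  args <- getArgs
  let root = case args of
        (d:_) -> d
        []    -> "."
  n <- countFiles root
  print n
```

Swapping in 'getFileStatus' here would make the traversal follow
symlinks to directories, which can count files twice or loop forever
on a cyclic link.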

-- 
	Viktor.

