[Haskell-cafe] Efficient way to edit a file
Donald Bruce Stewart
dons at cse.unsw.edu.au
Thu Jun 1 22:34:51 EDT 2006
> > Hi,
> > I need to edit big text files (5 to 500 Mb). But I just need to
> > change one or two small lines, and save it. What is the best way to do
> > that in Haskell, without creating copies of the whole files?
Thinking further, since you want to avoid copying on the disk, you need
to be able to keep the edited version in memory. So the strict
bytestring would be best, for example:
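
    import qualified Data.ByteString.Char8 as B
    import System.Environment (getArgs)

    main :: IO ()
    main = do
        [f] <- getArgs
        -- B.readFile is strict: the file is fully read and closed before
        -- B.writeFile reopens the same path for writing.
        B.writeFile f . B.unlines . map edit . B.lines =<< B.readFile f

    -- Replace any line that starts with "Instances"; keep all other lines.
    edit :: B.ByteString -> B.ByteString
    edit s | B.pack "Instances" `B.isPrefixOf` s = B.pack "EDIT"
           | otherwise                           = s
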
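To see what the line-by-line pipeline does without touching a real file, here is a minimal sketch of its pure part. The name editAll and the sample input are made up for illustration; edit is the same rule as in the program above.

```haskell
import qualified Data.ByteString.Char8 as B

-- Same rule as in the post: replace lines that start with "Instances".
edit :: B.ByteString -> B.ByteString
edit s | B.pack "Instances" `B.isPrefixOf` s = B.pack "EDIT"
       | otherwise                           = s

-- The pure part of the pipeline, separated from file I/O.
-- (editAll is a hypothetical helper name, not from the original post.)
editAll :: B.ByteString -> B.ByteString
editAll = B.unlines . map edit . B.lines

main :: IO ()
main = do
    -- Hypothetical sample input; note that it has no final newline.
    let input = B.pack "Header\nInstances of Foo\nFooter"
    B.putStr (editAll input)
```

This should print the three lines Header, EDIT and Footer. Note that B.unlines ends every line with a newline, so a file that did not end in one will gain one after the edit.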
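This edits a 100M file in about 17 seconds of wall-clock time, most of it apparently spent waiting on I/O, judging by the 13% cpu figure:

    $ ghc -O -funbox-strict-fields A.hs -package fps
    $ time ./a.out /home/dons/data/100M
    ./a.out /home/dons/data/100M 1.54s user 0.76s system 13% cpu 17.371 total
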
You could probably tune this further.