[Haskell-cafe] Efficient way to edit a file
Donald Bruce Stewart
dons at cse.unsw.edu.au
Thu Jun 1 22:34:51 EDT 2006
dons:
> briqueabraque:
> > Hi,
> >
> > I need to edit big text files (5 to 500 Mb). But I just need to
> > change one or two small lines, and save it. What is the best way to do
> > that in Haskell, without creating copies of the whole files?
> >
Thinking further, since you want to avoid making a copy on disk, you need
to be able to keep the edited version in memory. So a strict
ByteString would be best, for example:

    import System.Environment
    import qualified Data.ByteString.Char8 as B

    main = do
        [f] <- getArgs
        B.writeFile f . B.unlines . map edit . B.lines =<< B.readFile f

        where
            edit :: B.ByteString -> B.ByteString
            edit s | (B.pack "Instances") `B.isPrefixOf` s = B.pack "EDIT"
                   | otherwise                             = s

This edits a 100M file in about 17 seconds of wall-clock time:

    $ ghc -O -funbox-strict-fields A.hs -package fps
    $ time ./a.out /home/dons/data/100M
    ./a.out /home/dons/data/100M 1.54s user 0.76s system 13% cpu 17.371 total

You could probably tune this further.
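
As a rough sketch only, the line-editing step can be pulled out into a pure function so it can be tried on a small in-memory string before pointing it at a big file. The names editLine and editContents are made up for illustration and are not part of the program above; the "Instances" prefix and "EDIT" replacement are simply carried over from it.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical helper: replace a line with `replacement` when it
-- starts with `prefix`, otherwise return the line unchanged.
editLine :: B.ByteString -> B.ByteString -> B.ByteString -> B.ByteString
editLine prefix replacement s
    | prefix `B.isPrefixOf` s = replacement
    | otherwise               = s

-- Hypothetical helper: apply the edit to every line of a strict
-- ByteString, using the same prefix and replacement as the program above.
editContents :: B.ByteString -> B.ByteString
editContents = B.unlines . map (editLine (B.pack "Instances") (B.pack "EDIT")) . B.lines

main :: IO ()
main = B.putStr (editContents (B.pack "Header\nInstances of Foo\nFooter\n"))
```

One thing to keep in mind with this shape: B.unlines appends a newline after every line, so an input that does not end in a newline comes back with one added.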
-- Don