Data.PackedString.hGetPS

Simon Marlow simonmar at microsoft.com
Mon Mar 1 11:21:19 EST 2004


 
> If I have
> 
> \begin{code}
> import Data.PackedString (hGetPS)
> import System.IO (openBinaryFile, hClose, IOMode(ReadMode))
> 
> main :: IO ()
> main = do h <- openBinaryFile "1000000x" ReadMode
>           p <- hGetPS h 1000000
>           hClose h
>           return ()
> \end{code}
> 
> (1000000x is a file containing 1000000 'x' characters) then:
> 
> and compile with ghc -O2 (either 6.2 or reasonably recent CVS) then I
> get:
> 
> $ ./foo 
> Stack space overflow: current size 1048576 bytes.
> Use `+RTS -Ksize' to increase it.
> $

I noticed a while back that Data.PackedString isn't being optimised as
well as it should.  Most of the code in there is written with the
assumption that lots of Deforestation is going to happen, and in several
of the cases I looked at it isn't.

This is on our list of things to investigate (along with performance of
the overloaded array support).

> -- | Read a 'PackedString' directly from the specified 'Handle'.
> -- This is far more efficient than reading the characters 
> into a 'String'
> -- and then using 'packString'.  
> -- NOTE: as with 'hPutPS', the string representation in the file is 
> -- assumed to be ISO-8859-1.
> hGetPS :: Handle -> Int -> IO PackedString
> hGetPS h i = do
>   arr <- newArray_ (0, i-1)
>   l <- hGetArray h arr i
>   chars <- mapM (\i -> readArray arr i >>= 
> return.chr.fromIntegral) [0..l-1]
>   return (packString chars)
> 
> (is the efficiency comment accurate? This definition is calling
> packString on a String it makes, so it's not intuitively 
> obvious to me).
> 
> I think the last line should be replaced with
> 
>   return (packNChars l chars)

Indeed that would make a small improvement, yes.

Cheers,
	Simon


More information about the Libraries mailing list