hackage-server memory requirements

Max Bolingbroke batterseapower at hotmail.com
Wed Oct 26 19:36:04 CEST 2011


On 26 October 2011 13:46, Bas van Dijk <v.dijk.bas at gmail.com> wrote:
> According to the profile most space is used by ARR_WORDS (which is the
> internal name for a ByteArray# if I remember correctly).

Interesting. There are a lot of ByteStrings in use in the server, so
candidates for a leak might be:
 1. The cached cabal file in the package information
 2. A StringTable such as the one within a TarIndex
 3. The cached index.tar.gz
 4. Perhaps the mirroring feature is not strict enough in the
    ByteString it accepts

Regarding the fourth candidate, it looks to me as though the lazy
ByteString returned from Unpack.unpackPackageRaw (and stored in the
pkgData field of PkgInfo) may never be forced. I'm not sure, but
depending on how the Tar package is implemented, the unevaluated value
might keep the whole decompressed contents of the tarball reachable,
so the garbage collector cannot free it, when all we want to keep is
the decompressed Cabal file.
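
To make the suspected mechanism concrete, here is a minimal sketch.
PkgSketch, sliceCabal and mkLazy are hypothetical stand-ins (they are
not the real PkgInfo or Unpack.unpackPackageRaw); the only assumption
carried over is that the record field is lazy. Building the record
does not evaluate the slice, so the field is a thunk that still refers
to the whole input:

```haskell
import qualified Data.ByteString.Lazy as BS

-- Hypothetical stand-in for PkgInfo; only the lazy field matters here.
data PkgSketch = PkgSketch { sketchData :: BS.ByteString }

-- Hypothetical stand-in for the unpacking step: pick a small slice
-- out of a much larger input.
sliceCabal :: BS.ByteString -> BS.ByteString
sliceCabal = BS.take 16 . BS.drop 1024

-- The field is stored as the unevaluated expression
-- `sliceCabal tarball`, which keeps `tarball` reachable until
-- something forces it.
mkLazy :: BS.ByteString -> PkgSketch
mkLazy tarball = PkgSketch (sliceCabal tarball)

-- Evaluating the record to weak head normal form never inspects the
-- input, which shows that the slice has not been computed yet.
fieldIsDeferred :: Bool
fieldIsDeferred = mkLazy (error "input was inspected") `seq` True
```

Whether this is what actually happens in the server depends on how the
real unpacking code is written, so treat it as an illustration only.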

So this is almost pure speculation, but perhaps adding (BS.length
pkgStr `seq`) just before the liftIO on Mirror.hs:122 would reduce the
memory usage significantly. Worth a try?
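
A minimal sketch of that idiom, assuming pkgStr is a lazy ByteString.
The IORef and the names storeLazy, storeForced and demo are
hypothetical and only stand in for whatever Mirror.hs does with the
value. The input in demo has a tail that throws when evaluated, which
is just a way to observe whether the store step forces the string:

```haskell
import Control.Exception (ErrorCall, try)
import qualified Data.ByteString.Lazy as BS
import Data.Either (isLeft, isRight)
import Data.IORef (IORef, newIORef, writeIORef)

-- Stores the value as-is; nothing here forces the ByteString.
storeLazy :: IORef BS.ByteString -> BS.ByteString -> IO ()
storeLazy ref pkgStr = writeIORef ref pkgStr

-- BS.length walks every chunk, so the whole string is evaluated
-- before the write happens.
storeForced :: IORef BS.ByteString -> BS.ByteString -> IO ()
storeForced ref pkgStr = BS.length pkgStr `seq` writeIORef ref pkgStr

-- Returns (lazy store succeeded, forced store hit the deferred tail).
demo :: IO (Bool, Bool)
demo = do
  ref <- newIORef BS.empty
  -- The tail stands in for work that has not been done yet.
  let deferred = BS.pack [1, 2, 3] `BS.append` error "unevaluated tail"
  lazyResult   <- try (storeLazy ref deferred)   :: IO (Either ErrorCall ())
  forcedResult <- try (storeForced ref deferred) :: IO (Either ErrorCall ())
  return (isRight lazyResult, isLeft forcedResult)
```

If the sketch behaves as intended, storeLazy completes without
touching the tail, while storeForced evaluates it up front. That is
the behaviour the proposed seq is meant to give at the real call site.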

Max



More information about the cabal-devel mailing list