[Haskell-cafe] Data.Binary stack overflow with Data.Sequence String

Gwern Branwen gwern0 at gmail.com
Wed Mar 4 21:33:15 EST 2009


On Tue, Mar 3, 2009 at 11:50 PM, Spencer Janssen
<spencerjanssen at gmail.com> wrote:
> On Tue, Mar 3, 2009 at 10:30 PM, Gwern Branwen <gwern0 at gmail.com> wrote:
>> So recently I've been having issues with Data.Binary & Data.Sequence;
>> I serialize a 'Seq String'.
>>
>> You can see the file here: http://code.haskell.org/yi/Yi/IReader.hs
>>
>> The relevant function seems to be:
>>
>> -- | Read in database from 'dbLocation' and then parse it into an 'ArticleDB'.
>> readDB :: YiM ArticleDB
>> readDB = io $ (dbLocation >>= r) `catch` (\_ -> return empty)
>>          where r x = fmap (decode . BL.fromChunks . return) $ B.readFile x
>>                -- We read in with strict bytestrings to guarantee the file is closed,
>>                -- and then we convert it to the lazy bytestring data.binary expects.
>>                -- This is inefficient, but alas...
>>
>> My current serialized file is about 9.4M. I originally thought that
>> the issue might be the recent upgrade in Yi to binary 0.5, but I
>> unpulled patches back to past that, and the problem still manifested.
>>
>> Whenever Yi tries to read the articles.db file, it stack overflows. It
>> actually stack-overflowed on even smaller files, but I managed to raise
>> the limit, it seems, with the strict-ByteString trick.
>> Unfortunately, my personal file has since passed whatever that limit
>> was.
>>
>> I've read carefully the previous threads on Data.Binary and Data.Map
>> stack-overflows, but none of them seem to help; hacking some $!s or
>> seqs into readDB seems to make no difference, and Seq is supposed to
>> be a strict datastructure already! Doing things in GHCi has been
>> tedious, and hasn't enlightened me much: sometimes things overflow and
>> sometimes they don't. It's all very frustrating and I'm seriously
>> considering going back to using the original read/show code unless
>> anyone knows how to fix this - that approach may be many times slower,
>> but I know it will work.
>>
>> --
>> gwern
>
> Have you tried the darcs version of binary?  It has a new instance
> which looks more efficient than the old.
>
>
> Cheers,
> Spencer Janssen

I have. It still stack-overflows on my 9.8 meg file. (The magic number
seems to be somewhere between 9 and 10 megabytes.)
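
For anyone who wants to try this outside Yi, here is a minimal sketch of the
same read path as a standalone program. Only decode, BL.fromChunks and
B.readFile come from the readDB quoted above; the ArticleDB alias, the names
writeDB and readDBFrom, and the file name articles-test.db are assumptions
made up for illustration. With data this small it will not overflow; the
element count would have to be raised toward the 9 to 10 megabyte range, and
even then the result may depend on the GHC and binary versions in use.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.Binary (decode, encode)
import Data.Foldable (foldl')
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq

-- Hypothetical stand-in for Yi's ArticleDB; the thread only says it is a Seq String.
type ArticleDB = Seq String

-- Serialize the database with Data.Binary and write it out.
writeDB :: FilePath -> ArticleDB -> IO ()
writeDB path = BL.writeFile path . encode

-- Read the whole file strictly (so the handle is closed), wrap it as a
-- one-chunk lazy ByteString, and decode it.
readDBFrom :: FilePath -> IO ArticleDB
readDBFrom path = do
  strict <- B.readFile path
  let db = decode (BL.fromChunks [strict])
      -- Walk every element and the spine of every string with a strict
      -- left fold, so any overflow in decoding shows up here and not later.
      total = foldl' (\n s -> n + length s) 0 db
  total `seq` return db

main :: IO ()
main = do
  -- Assumed test data; increase the upper bound to grow the file.
  let db = Seq.fromList ["article " ++ show n | n <- [1 :: Int .. 1000]]
  writeDB "articles-test.db" db
  db' <- readDBFrom "articles-test.db"
  print (db' == db)
```

Forcing the result inside readDBFrom is a deliberate choice: decode is lazy, so
without it the failure can surface far from the read, which may be why the
behaviour in GHCi looked inconsistent.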

-- 
gwern

