[Haskell-cafe] ByteString.getContents fails for files >2GB on OS X

Shaun Jackman sjackman at gmail.com
Fri Jun 8 20:23:57 CEST 2012


Hi Erik, Serge,

I have a 64-bit build of GHC:
http://www.haskell.org/ghc/dist/7.4.1/ghc-7.4.1-x86_64-apple-darwin.tar.bz2

I think it's fundamentally an OS X issue. The system call read(2)
fails for reads >2 GB with EINVAL, even though I have a 64-bit OS X
kernel. GHC would need to hack around this issue.
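
One way a program could hack around this in the meantime is to split the
read into chunks well below 2 GB and join them afterwards. The following is
only a minimal sketch, assuming that each B.hGet call results in read(2)
requests no larger than the requested byte count. The names
hGetContentsChunked and chunkSize are made up for illustration and are not
part of the bytestring package:

```haskell
import qualified Data.ByteString as B
import System.IO (Handle, stdin)

-- Hypothetical chunk size: 1 GiB, chosen to stay below the 2 GB limit.
chunkSize :: Int
chunkSize = 1024 * 1024 * 1024

-- Read a handle to EOF in bounded chunks, then join the chunks into a
-- single strict ByteString. B.hGet returns an empty string at EOF.
hGetContentsChunked :: Int -> Handle -> IO B.ByteString
hGetContentsChunked n h = go []
  where
    go acc = do
      chunk <- B.hGet h n
      if B.null chunk
        then return (B.concat (reverse acc))  -- restore original order
        else go (chunk : acc)

main :: IO ()
main = do
  contents <- hGetContentsChunked chunkSize stdin
  print (B.length contents)
```

Joining the chunks with B.concat copies them into a new buffer, so peak
memory use is roughly twice the size of the input.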

Cheers,
Shaun

On 8 June 2012 05:08, Serge Le Huitouze <serge.lehuitouze at gmail.com> wrote:
> Isn't it more likely to be due to the garbage collector's (copying) strategy?
>
> --Serge
>
> On Fri, Jun 8, 2012 at 10:29 AM, Erik Hesselink <hesselink at gmail.com> wrote:
>> Do you have a 32-bit or 64-bit GHC build? That might have something to
>> do with it, if you're nearing 2^32 (or 2^31) bytes.
>>
>> Erik
>>
>> On Fri, Jun 8, 2012 at 2:25 AM, Shaun Jackman <sjackman at gmail.com> wrote:
>>> Hi,
>>>
>>> Data.ByteString.Char8.getContents fails for files >2GB on OS X. Is
>>> there a fix for this?
>>>
>>> $ cat getContents.hs
>>> import qualified Data.ByteString.Char8 as B
>>> main = B.getContents
>>> $ ./getContents <smallFile
>>> $ ./getContents <bigFile
>>> getContents: <stdin>: hGetBuf: invalid argument (Invalid argument)
>>> $ ghc --version
>>> The Glorious Glasgow Haskell Compilation System, version 7.4.1
>>>
>>> Mac OS X 10.7.4 64-bit
>>>
>>> As a workaround, I used ByteString.Lazy instead of the strict
>>> ByteString, which worked, but found it was ~4 times slower for my
>>> program, so I'd like to get the strict ByteString working with large
>>> files.
>>>
>>> Cheers,
>>> Shaun
>>>
>>> _______________________________________________
>>> Haskell-Cafe mailing list
>>> Haskell-Cafe at haskell.org
>>> http://www.haskell.org/mailman/listinfo/haskell-cafe
