[Haskell-cafe] mapping large structures into memory

Fri Sep 25 15:41:48 EDT 2009

warrensomebody:
>
> On Sep 25, 2009, at 12:14 PM, Don Stewart wrote:
>
>>
>> It is entirely possible to use mmap to map structures into memory.
>> Thanks to the foreign function interface, there are well-defined
>> semantics for calling to and from C.
>>
>> The key questions would be:
>>
>> * what is the type and representation of the data you wish to map
>> * what operations on them
>
> Right... my question relates more to how well the intrinsic type system 
> integrates with foreign/mapped structures. For instance, I wouldn't want 
> to create my own foreign arrays, and have to replicate all sorts of 
> library code that only works on haskell's intrinsic arrays.

Well, nothing is really 'intrinsic', but the fundamental distinction is
between unpinned GC-managed memory and pinned memory.

The 'array' package illustrates GC-managed memory, while
Data.ByteString and the 'carray' and 'hmatrix' libraries illustrate
pinned memory that can be manipulated with foreign operations.

For your mmapped data, you'll need to assign (coerce) the pointers to
that data to a type that describes pinned memory.
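
As a rough sketch of what that coercion can look like: the code below treats
a raw byte pointer as a pointer to Word32 values and reads them through
Storable. The name sumWords is hypothetical, and allocaBytes only stands in
for a region that would really come from mmap, so this needs nothing beyond
base. It also assumes the region is suitably aligned for Word32.

```haskell
import Data.Word (Word8, Word32)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Hypothetical helper: view n Word32 values behind a raw byte pointer
-- and add them up. The cast changes only the type, not the address.
sumWords :: Ptr Word8 -> Int -> IO Word32
sumWords raw n = sum <$> mapM (peekElemOff typed) [0 .. n - 1]
  where
    typed = castPtr raw :: Ptr Word32

main :: IO ()
main =
  -- Assumption: allocaBytes stands in for the mapped region here.
  allocaBytes 16 $ \raw -> do
    let typed = castPtr raw :: Ptr Word32
    -- Fill the region with 1, 2, 3, 4 so there is something to read back.
    mapM_ (\i -> pokeElemOff typed i (fromIntegral i + 1)) [0 .. 3]
    total <- sumWords raw 4
    print total
```

Once the pointer has a Storable element type, the usual Foreign.* peek and
poke operations apply to it directly.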

> I'm assuming here that all this mapped data is self-contained, and  
> doesn't point to heap-allocated structures, although that's a related  
> question -- is it possible to inform the gc about heap pointers stored  
> (temporarily) in these structures (and later identify them in order to  
> swizzle them out when flushing the mapped file to disk).

You can associate a ForeignPtr with mmapped data, and have the GC unmap
the data for you once it is no longer referenced.
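
A minimal sketch of the idea, assuming only base: Foreign.Concurrent's
newForeignPtr attaches an arbitrary IO action as the finalizer. The name
acquireRegion is hypothetical, and mallocBytes/free stand in for mmap/munmap
so that the example is self-contained. With a real mapping the finalizer
would call munmap instead.

```haskell
import Data.Word (Word8)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical helper: obtain a region and let the GC release it.
-- Assumption: mallocBytes/free stand in for mmap/munmap.
acquireRegion :: Int -> IO (ForeignPtr Word8)
acquireRegion size = do
  p <- mallocBytes size
  -- The finalizer runs some time after the ForeignPtr becomes unreachable.
  newForeignPtr p (free p)

main :: IO ()
main = do
  region <- acquireRegion 4096
  -- withForeignPtr keeps the region alive while the raw pointer is in use.
  b <- withForeignPtr region $ \p -> do
    pokeByteOff p 0 (42 :: Word8)
    peekByteOff p 0 :: IO Word8
  print b
```

Finalizers are not run at a predictable time, so this suits releasing the
mapping, not anything that must happen promptly.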

Simple example:

    - Data.ByteString

A fast Haskell type that can be allocated and manipulated by C or Haskell.
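
A small sketch of wrapping foreign memory as a ByteString without copying,
using unsafePackCStringFinalizer from Data.ByteString.Unsafe. The name
wrapRegion is hypothetical, and again mallocBytes/free stand in for
mmap/munmap. The wrapped memory must not be modified afterwards, since
ByteString operations assume immutable contents.

```haskell
import qualified Data.ByteString as B
import Data.ByteString.Unsafe (unsafePackCStringFinalizer)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Marshal.Array (pokeArray)
import Foreign.Ptr (Ptr)

-- Hypothetical helper: wrap len bytes at p as a ByteString, no copy made.
-- Assumption: free stands in for the munmap call a real mapping would need.
wrapRegion :: Ptr Word8 -> Int -> IO B.ByteString
wrapRegion p len = unsafePackCStringFinalizer p len (free p)

main :: IO ()
main = do
  p <- mallocBytes 5
  pokeArray p [104, 101, 108, 108, 111]  -- the bytes of "hello"
  bs <- wrapRegion p 5
  -- Ordinary ByteString functions now work on the foreign memory.
  print (B.length bs, B.unpack (B.take 2 bs))
```

From here on, library code written against ByteString sees the region as
an ordinary value.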

-- Don