[GHC] #13630: ByteString pinned memory can be leaky

Sun Apr 30 01:17:07 UTC 2017

#13630: ByteString pinned memory can be leaky
-------------------------------------+-------------------------------------
           Reporter:  nh2            |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Runtime        |           Version:  8.0.1
  System                             |
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 My question on IRC:

 {{{
 How does memory allocation for pinned blocks work?

 Let's say pinned blocks are 4KB in size, and I allocate first a 3 KB
 ByteString A and then an 8-byte ByteString B.

 Now I GC A, no longer need it.

 Then according to
   https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/GC/Pinned
 "a single pinned object keeps alive the whole block in which it resides",
 my small ByteString B keeps the entire block alive.

 But what happens with the 3KB in the front of that block?
 Will it be re-used by the next ByteString allocation (say 1KB)?
 In other words, how smart is allocatePinned as an allocator?
 }}}

 The answer:

 {{{
 slyfox: nh2: allocatePinned it quite dump. it only allocated from free
 tail space
 }}}

 {{{
 nh2: slyfox: that sounds like a huge potential for memory leak then
 slyfox: yes, fragmentation is quite bad for bytestrings
 }}}

 So it seems that I can get into the unfortunate situation where a very
 short `ByteString` of only a few bytes keeps an entire 4 KB block of
 memory alive; some might call this a leak.
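
 To make the scenario from the IRC question concrete, here is a toy model
 of a tail-only allocator. It is only a sketch of the behaviour described
 above, not GHC's actual `allocatePinned`: the class and method names are
 made up for illustration, the 4 KB block size is the assumption from the
 question, and alignment and object headers are ignored.

```python
BLOCK_SIZE = 4096  # assumed pinned block size, taken from the question above


class TailOnlyAllocator:
    """Toy model: new objects are only ever placed in the free tail of the
    most recent block; space freed earlier in a block is never reused."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = []   # each block: {"tail": bytes bumped so far, "live": {id: size}}
        self.where = {}    # object id -> the block it lives in
        self.next_id = 0

    def alloc(self, size):
        if not 0 < size <= self.block_size:
            raise ValueError("size must fit into one block")
        # Start a new block when the tail of the current one is too small.
        if not self.blocks or self.blocks[-1]["tail"] + size > self.block_size:
            self.blocks.append({"tail": 0, "live": {}})
        block = self.blocks[-1]
        obj = self.next_id
        self.next_id += 1
        block["live"][obj] = size
        block["tail"] += size
        self.where[obj] = block
        return obj

    def free(self, obj):
        # The object dies, but the tail pointer of its block does not move back.
        block = self.where.pop(obj)
        del block["live"][obj]

    def live_bytes(self):
        return sum(sum(b["live"].values()) for b in self.blocks)

    def retained_bytes(self):
        # A block stays alive as long as it holds at least one live object.
        return sum(self.block_size for b in self.blocks if b["live"])


def demo():
    heap = TailOnlyAllocator()
    a = heap.alloc(3 * 1024)  # ByteString A, 3 KB
    heap.alloc(8)             # ByteString B, 8 bytes
    heap.free(a)              # A is garbage collected
    heap.alloc(1024)          # next ByteString, 1 KB: does not fit the tail
    return heap.live_bytes(), heap.retained_bytes()


if __name__ == "__main__":
    print(demo())  # (1032, 8192)
```

 In this model the 1 KB allocation lands in a second block, so 1032 live
 bytes keep 8192 bytes of pinned memory alive.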

 One idea to solve this is to make standard `ByteString`s unpinned and to
 allocate pinned ones explicitly when needed. This seems to be an
 often-discussed topic, and it is not trivial because many `ByteString`
 functions are implemented using libc FFI functions.

 However, it seems there will always be _some_ need for pinned memory, so
 we should have an efficient way to manage it in any case.

 Efficient here means, for example, re-using freed memory inside a block
 instead of only using free tail space.
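
 As a rough illustration of what re-using freed memory inside a block
 could look like, the sketch below keeps a list of free ranges per block
 and serves allocations first-fit. This is an assumption about one
 possible design, not a proposal for the actual RTS data structures; the
 names are hypothetical and alignment is ignored. Since pinned objects
 must not move, holes can only be filled by new allocations, not closed
 by compaction.

```python
BLOCK_SIZE = 4096  # assumed pinned block size, as in the question above


class ReusingBlock:
    """Toy model of one pinned block whose freed ranges can be handed out
    again (first fit), instead of only bumping a tail pointer."""

    def __init__(self, size=BLOCK_SIZE):
        self.free_ranges = [(0, size)]  # sorted list of (offset, length)
        self.live = {}                  # offset -> length of live object

    def alloc(self, size):
        """Return the offset of the new object, or None if nothing fits."""
        for i, (off, length) in enumerate(self.free_ranges):
            if length >= size:
                if length == size:
                    del self.free_ranges[i]
                else:
                    # Carve the object out of the front of this free range.
                    self.free_ranges[i] = (off + size, length - size)
                self.live[off] = size
                return off
        return None

    def free(self, off):
        size = self.live.pop(off)
        self.free_ranges.append((off, size))
        self.free_ranges.sort()
        # Merge adjacent free ranges so large requests can be served again.
        merged = []
        for o, n in self.free_ranges:
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + n)
            else:
                merged.append((o, n))
        self.free_ranges = merged


def demo():
    block = ReusingBlock()
    a = block.alloc(3 * 1024)  # ByteString A at offset 0
    block.alloc(8)             # ByteString B right behind it
    block.free(a)              # A is garbage collected, leaving a 3 KB hole
    return block.alloc(1024)   # the 1 KB ByteString goes into that hole


if __name__ == "__main__":
    print(demo())  # 0
```

 With the same sequence of allocations as in the IRC question, the 1 KB
 object is placed at offset 0 of the existing block, so no second block
 is needed.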

 It seems that `jemalloc` can be given blocks of memory to manage and
 provide `malloc()`-style allocation inside them:

 https://stackoverflow.com/questions/30836359/jemalloc-mmap-and-shared-memory

 Perhaps this could be used to provide GHC with a simple method to use
 pinned memory more efficiently?

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/13630>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler