<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Comic Sans MS";
panose-1:3 15 7 2 3 3 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.Code, li.Code, div.Code
{mso-style-name:Code;
mso-style-link:"Code Char";
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Courier New";}
span.CodeChar
{mso-style-name:"Code Char";
mso-style-link:Code;
font-family:"Courier New";}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
.MsoPapDefault
{mso-style-type:export-only;
margin-top:6.0pt;
margin-right:0cm;
margin-bottom:6.0pt;
margin-left:0cm;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:36.0pt">
At the very least I'll take this email and turn it into a short article.<o:p></o:p></p>
<p class="MsoNormal"><span style="font-family:"Calibri",sans-serif;mso-fareast-language:EN-US">Yes, please do make it into a wiki page on the GHC Trac, and maybe make a ticket for it.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri",sans-serif;mso-fareast-language:EN-US"><br>
Thanks<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri",sans-serif;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri",sans-serif;mso-fareast-language:EN-US">Simon<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri",sans-serif;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Edward Kmett [mailto:ekmett@gmail.com]
<br>
<b>Sent:</b> 27 August 2015 16:54<br>
<b>To:</b> Simon Peyton Jones<br>
<b>Cc:</b> Manuel M T Chakravarty; Simon Marlow; ghc-devs<br>
<b>Subject:</b> Re: ArrayArrays<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
An ArrayArray# is just an Array# with a modified invariant. It points directly to other unlifted ArrayArray#'s or ByteArray#'s.<o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
While those live in #, they are garbage collected objects, so this all lives on the heap.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
They were added to make some of the DPH stuff fast when it has to deal with nested arrays.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
I'm currently abusing them as a placeholder for a better thing.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
The Problem<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
-----------------<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Consider the scenario where you write a classic doubly-linked list in Haskell.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Chasing from one DLL to the next requires following 3 pointers on the heap.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Comic Sans MS"">DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL</span><o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
That is 3 levels of indirection.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
We can trim one by simply unpacking the IORef with -funbox-strict-fields or UNPACK<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
We can trim another by adding a 'Nil' constructor for DLL and worsening our representation.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil</span><o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
but now we're still stuck with a level of indirection <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Comic Sans MS"">DLL ~> MutVar# RealWorld DLL ~> DLL</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
This means that every operation we perform on this structure will be about half of the speed of an implementation in most other languages assuming we're memory bound on loading things into cache!<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Making Progress<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
----------------------<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
I have been working on a number of data structures where the indirection of going from something in * out to an object in # which contains the real pointer to my target and coming back effectively doubles my runtime.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
We go out to the MutVar# because we are allowed to put the MutVar# onto the mutable list when we dirty it. There is a well defined write-barrier.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
I could change out the representation to use<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">data DLL = DLL (MutableArray# RealWorld DLL) | Nil</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
I can just store two pointers in the MutableArray# every time, but this doesn't help _much_ directly. It has reduced the amount of distinct addresses in memory I touch on a walk of the DLL from 3 per object to 2.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
I still have to go out to the heap from my DLL and get to the array object and then chase it to the next DLL and chase that to the next array. I do get my two pointers together in memory though. I'm paying for a card marking table as well, which I don't particularly
need with just two pointers, but we can shed that with the "SmallMutableArray#" machinery added back in 7.10, which is just the old array code a a new data type, which can speed things up a bit when you don't have very big arrays:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil</span><o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
But what if I wanted my object itself to live in # and have two mutable fields and be able to share the sme write barrier?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
An ArrayArray# points directly to other unlifted array types. What if we have one # -> * wrapper on the outside to deal with the impedence mismatch between the imperative world and Haskell, and then just let the ArrayArray#'s hold other arrayarrays.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">data DLL = DLL (MutableArrayArray# RealWorld)</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
now I need to make up a new Nil, which I can just make be a special MutableArrayArray# I allocate on program startup. I can even abuse pattern synonyms. Alternately I can exploit the internals further to make this cheaper.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Then I can use the readMutableArrayArray# and writeMutableArrayArray# calls to directly access the preceding and next entry in the linked list.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
So now we have one DLL wrapper which just 'bootstraps me' into a strict world, and everything there lives in #.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">next :: DLL -> IO DLL</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">next (DLL m) = IO $ \s -> case readMutableArrayArray# s of </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New""> (# s', n #) -> (# s', DLL n #)</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
It turns out GHC is quite happy to optimize all of that code to keep things unboxed. The 'DLL' wrappers get removed pretty easily when they are known strict and you chain operations of this sort!<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Cleaning it Up<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
------------------<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Now I have one outermost indirection pointing to an array that points directly to other arrays.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
I'm stuck paying for a card marking table per object, but I can fix that by duplicating the code for MutableArrayArray# and using a SmallMutableArray#. I can hack up primops that let me store a mixture of SmallMutableArray# fields and normal ones in the data
structure. Operationally, I can even do so by just unsafeCoercing the existing SmallMutableArray# primitives to change the kind of one of the arguments it takes.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
This is almost ideal, but not quite. I often have fields that would be best left unboxed.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<span style="font-family:"Courier New"">data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
was able to unpack the Int, but we lost that. We can currently at best point one of the entries of the SmallMutableArray# at a boxed or at a MutableByteArray# for all of our misc. data and shove the int in question in there.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
e.g. if I were to implement a hash-array-mapped-trie I need to store masks and administrivia as I walk down the tree. Having to go off to the side costs me the entire win from avoiding the first pointer chase.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
But, if like Ryan suggested, we had a heap object we could construct that had n words with unsafe access and m pointers to other heap objects, one that could put itself on the mutable list when any of those pointers changed then I could shed this last factor
of two in all circumstances.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Prototype<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
-------------<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Over the last few days I've put together a small prototype implementation with a few non-trivial imperative data structures for things like Tarjan's link-cut trees, the list labeling problem and order-maintenance.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<a href="https://github.com/ekmett/structs">https://github.com/ekmett/structs</a><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Notable bits:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<a href="https://github.com/ekmett/structs/blob/9ff2818f888aff4789b7a41077a674a10d15e6ee/src/Data/Struct/Internal/LinkCut.hs">Data.Struct.Internal.LinkCut</a> provides an implementation of link-cut trees in this style.<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<a href="https://github.com/ekmett/structs/blob/9ff2818f888aff4789b7a41077a674a10d15e6ee/src/Data/Struct/Internal.hs">Data.Struct.Internal</a> provides the rather horrifying guts that make it go fast.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
Once compiled with -O or -O2, if you look at the core, almost all the references to the LinkCut or Object data constructor get optimized away, and we're left with beautiful strict code directly mutating out underlying representation.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
At the very least I'll take this email and turn it into a short article.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
-Edward<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
<o:p> </o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:6.0pt;margin-right:0cm;margin-bottom:6.0pt;margin-left:0cm">
On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones <<a href="mailto:simonpj@microsoft.com" target="_blank">simonpj@microsoft.com</a>> wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-right:0cm">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Calibri",sans-serif">Just to say that I have no idea what is going on in this thread. What is ArrayArray? What is the issue in general? Is there a ticket?
Is there a wiki page?</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Calibri",sans-serif"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Calibri",sans-serif">If it’s important, an ab-initio wiki page + ticket would be a good thing.</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Calibri",sans-serif"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Calibri",sans-serif">Simon</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-family:"Calibri",sans-serif"> </span><o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> ghc-devs
[mailto:<a href="mailto:ghc-devs-bounces@haskell.org" target="_blank">ghc-devs-bounces@haskell.org</a>]
<b>On Behalf Of </b>Edward Kmett<br>
<b>Sent:</b> 21 August 2015 05:25<br>
<b>To:</b> Manuel M T Chakravarty<br>
<b>Cc:</b> Simon Marlow; ghc-devs<br>
<b>Subject:</b> Re: ArrayArrays</span><o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">When (ab)using them for this purpose, SmallArrayArray's would be very handy as well.<o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">Consider right now if I have something like an order-maintenance structure I have:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} !(MutVar s (Lower s))<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">The former contains, logically, a mutable integer and two pointers, one for forward and one for backwards. The latter is basically the same thing with a mutable reference up pointing at
the structure above.<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">On the heap this is an object that points to a structure for the bytearray, and points to another structure for each mutvar which each point to the other 'Upper' structure. So there is
a level of indirection smeared over everything.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">So this is a pair of doubly linked lists with an upward link from the structure below to the structure above.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">Converted into ArrayArray#s I'd get<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">data Upper s = Upper (MutableArrayArray# s)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">w/ the first slot being a pointer to a MutableByteArray#, and the next 2 slots pointing to the previous and next previous objects, represented just as their MutableArrayArray#s. I can
use sameMutableArrayArray# on these for object identity, which lets me check for the ends of the lists by tying things back on themselves.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">and below that<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">data Lower s = Lower (MutableArrayArray# s)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">is similar, with an extra MutableArrayArray slot pointing up to an upper structure.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">I can then write a handful of combinators for getting out the slots in question, while it has gained a level of indirection between the wrapper to put it in * and the MutableArrayArray#
s in #, that one can be basically erased by ghc.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">Unlike before I don't have several separate objects on the heap for each thing. I only have 2 now. The MutableArrayArray# for the object itself, and the MutableByteArray# that it references
to carry around the mutable int.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">The only pain points are <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">1.) the aforementioned limitation that currently prevents me from stuffing normal boxed data through a SmallArray or Array into an ArrayArray leaving me in a little ghetto disconnected
from the rest of Haskell, <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">and<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">2.) the lack of SmallArrayArray's, which could let us avoid the card marking overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't help.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">Alternately I could just try to do really evil things and convert the whole mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, stuffing the #'d references to
the other arrays directly into the SmallArray as slots, removing the limitation we see here by aping the MutableArrayArray# s API, but that gets really really dangerous!<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">I'm pretty much willing to sacrifice almost anything on the altar of speed here, but I'd like to be able to let the GC move them and collect them which rules out simpler Ptr and Addr based
solutions.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">-Edward<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty <<a href="mailto:chak@cse.unsw.edu.au" target="_blank">chak@cse.unsw.edu.au</a>> wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0cm 0cm 0cm 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt">
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">That’s an interesting idea.<br>
<br>
Manuel<br>
<br>
> Edward Kmett <<a href="mailto:ekmett@gmail.com" target="_blank">ekmett@gmail.com</a>>:<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt">><br>
> Would it be possible to add unsafe primops to add Array# and SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# entries are all directly unlifted avoiding a level of indirection for the containing structure is amazing, but I can only currently
use it if my leaf level data can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be able to have the ability to put SmallArray# a stuff down at the leaves to hold lifted contents.<br>
><br>
> I accept fully that if I name the wrong type when I go to access one of the fields it'll lie to me, but I suppose it'd do that if i tried to use one of the members that held a nested ArrayArray# as a ByteArray# anyways, so it isn't like there is a safety
story preventing this.<br>
><br>
> I've been hunting for ways to try to kill the indirection problems I get with Haskell and mutable structures, and I could shoehorn a number of them into ArrayArrays if this worked.<br>
><br>
> Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection compared to c/java and this could reduce that pain to just 1 level of unnecessary indirection.<br>
><br>
> -Edward<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:12.0pt">> _______________________________________________<br>
> ghc-devs mailing list<br>
> <a href="mailto:ghc-devs@haskell.org" target="_blank">ghc-devs@haskell.org</a><br>
> <a href="http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs" target="_blank">
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs</a><o:p></o:p></p>
</blockquote>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:6.0pt"> <o:p></o:p></p>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</div>
</body>
</html>