[Haskell-cafe] Data.IArray rant

Sat Sep 3 04:04:40 CEST 2011

On 3 September 2011 11:38, Evan Laforge <qdunkan at gmail.com> wrote:
> I never liked Data.IArray.  Recently I had to deal with it again and
> it kind of touched off a bit of a rant:
>
> Array bounds are an inclusive range, which runs counter to almost all
> other uses of ranges, and for good reason.  Length of an array is
> (high - low + 1).  If you want to convert a list to an array it's
> 'IArray.listArray (0, length xs - 1) xs'.  And if you forget to
> subtract one or add one it's likely to be a runtime crash.  Off-by-one
> errors and runtime crashes... I've gotten accustomed to not having
> those in haskell!
>
> The vast majority of data types have their own set of monomorphic
> fuctions, and yet array all of the sudden wants to be an interface.
> That's nice and all, but who actually uses that interface?  I recall
> there used to be a DiffArray and some other variants, but they were
> decried as being slow and appear to be gone now.
>
> So it appears that there's a complexity price paid to support
> different array types under the same interface, namely giant class
> contexts that can't be abstracted away and dependence on MPTCs... and
> appears to be unused.  And then another price paid to support
> arbitrary indices... which I at least have never used.  Maybe there is
> a field where arrays with strange index ranges like 5--20 are useful,
> but I always just want 0--length-1.

I've used this a fair amount: e.g. 2D grids going from (-n,-n) to
(n,n).  This is one of the things I liked about Haskell arrays, in
that you _weren't_ limited to the [0,n-1] bounds found in C-based
languages (and that annoyed me to no end when I was using them).  Even
_Fortran_ had arrays where you can specify the index bounds you want
to use rather than being forced to translate from the bounds that make
sense to your code to the bounds imposed by the array implementation.
Admittedly, I don't use arrays much anymore, but that's because I tend
to play with graphs nowadays...

>  Meanwhile, they lack basic features like concatenation.

I obviously don't know what you're doing with arrays, but this sounds
a bit like using the wrong datatype for the job: if you're
concatenating arrays, wouldn't you have to explicitly create or extend
one of those arrays to get the contiguous memory (unless you're using
a DiffArray)?

> No Monoid instance.  In fact, hardly any useful instances.

Maybe someone needs to suggest + implement such instances then.

>  The result is that my first contact with haskell
> arrays left me with the impression that they were complicated, hard to
> use, and designed for someone with different priorities than me.  Of
> course, Data.Vector didn't exist back then, but if someone were new to
> haskell now I would recommend they skip Data.IArray and head straight
> for vector.

To an extent, I wonder how much of this has been that arrays were
considered to be bad in Haskell, so no-one used them and no-one
bothered to try and improve the API much (and instead went and created
Vector, etc.).

> If they wanted 2 dimensional arrays then I know there are some matrix libraries.

We now have repa, but hmatrix (the only other matrix library that I
know of on Hackage) limits you to only 2D matrices.

> Of course, sometimes IArray is hard to avoid, e.g. Data.Graph is just
> a type synonym for an array.  I wrote a small library of graph
> utilities, and dealing with the arrays was back to awkward land.

Any particular reason you used Data.Graph rather than fgl, etc.?

> So... am I alone in this dislike of IArray?  Or maybe someone can
> explain why it's actually a very useful interface?  Admittedly vector
> is a mere 'cabal install' away, but array has a blessed status as a
> ghc bundled library and is likely to be the first place people will
> look, as well as the natural base for other data types that want an
> array, like Data.Graph.
>
> I know it's not as simple as "just debundle IArray" because it's a
> bootlib and presumably used in ghc itself, though I know there is
> support for reducing the set of bootlibs.  There's also the whole
> mutable interface, which I haven't used, but maybe I'd like it more.
> I know the array stuff winds up integrated fairly deeply with ghc's
> unboxed arrays and whatnot, so there's a lot of detail I don't
> understand.  I just know I dread going back to array using code while
> I tend to enjoy working with vector, bytestring, text, storablevector,
> etc.

I see there are two possible sources of your "problems" here:

1. The underlying data structure (index types and bounds, etc.)
2. The API

You mainly seem to have an issue with the API itself.  My guess is
that the API for array hasn't changed much because it was already
there and a boot library (so incremental changes, etc.) whereas when
vector, etc. was being written the opportunity was being taken to
create an API better suited for the task, taking into account people's
usage of array.

I don't disagree that IArray's API isn't that great and could be
improved, but I quite like the ability to have custom index types and
bounds.  That said, if I was going to be doing anything using arrays,
I would probably be looking at vector or repa (for the touted
efficiency if nothing else) first, and using mappings between my
bounds and the bounds on the array.

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
IvanMiljenovic.wordpress.com