[Haskell-cafe] Array, Vector, Bytestring

Mon Jun 3 21:16:08 CEST 2013

Hi everyone,

Every time I want to use an array in Haskell, I find myself having to 
look up in the docs how they are used and exactly which modules I 
have to import ... and I am a bit tired of staring at type signatures 
for 10 minutes to figure out how these arrays work every time I use them 
(it's even worse when you have to write the signatures). I wonder how 
other people perceive this issue and what possible solutions there could be.

E.g., look at the type signatures for changing a single entry in an array:

* vector package:
     write :: PrimMonad m => MVector (PrimState m) a -> Int -> a -> m ()
* array package:
     writeArray :: (MArray a e m, Ix i) => a i e -> i -> e -> m ()
* bytestring package:
     not available
* a reasonable type signature would be:
     write :: MVector a -> Int -> a -> ST s a
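
For comparison, here is a minimal sketch of how the two existing signatures above are used once they are specialised to ST. It assumes the vector and array packages are installed; the names vectorDemo and arrayDemo are made up for illustration only.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)
import qualified Data.Vector.Unboxed.Mutable as MV

-- vector: here PrimMonad m is instantiated to ST s, so the
-- MVector (PrimState m) a in the signature becomes MVector s Int.
vectorDemo :: Int
vectorDemo = runST $ do
  v <- MV.replicate 3 0   -- three elements, all 0
  MV.write v 1 42         -- overwrite index 1
  MV.read v 1

-- array: the MArray class is instantiated to STUArray s, and the
-- index type i is Int with explicit bounds (0, 2).
arrayDemo :: Int
arrayDemo = runST $ do
  arr <- newArray (0, 2) 0 :: ST s (STUArray s Int Int)
  writeArray arr 1 42     -- overwrite index 1
  readArray arr 1

main :: IO ()
main = print (vectorDemo, arrayDemo)   -- prints (42,42)
```

Both calls do the same job; the difference is in how much has to be pinned down by annotations (the array version needs an explicit type for newArray, the vector version gets away without one here).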

# Too many different implementations

Why do we need so many different implementations of the same thing? In 
the GHC libraries alone we have vector, array and bytestring packages, 
all of which do the same thing, as demonstrated for instance by the 
vector-bytestring package. To make matters worse, the Haskell 2010 
standard includes a watered-down version of array.

I personally prefer the vector package, but if I want to use libraries 
for strings, I am forced back to bytestring. And now that Array was 
added to Haskell 2010, it will start to pop up more and more.

The multitude of already existing libraries is also why I am writing 
this to haskell-cafe and not spinning yet another implementation.

# Index

I don't really see a reason for having an index of a type other than Int, 
or one that starts anywhere other than 0.

While there might be corner cases where such a thing is useful, it 
is only confusing and irritating for (presumably) the rest of us.

# Storable vs Unboxed

Is there really a difference between Storable and Unboxed arrays and, if 
so, can't this be fixed in the compiler rather than having to expose this 
problem to the programmer?

# ST s vs IO

This is probably the hardest issue to resolve. The easiest solution is 
probably to just have a module for each of them, as in the array package.
I find the PrimState approach a bit complicated and circuitous.

The ideal solution would be to have

   type IO a = ST RealWorld# a

in the next Haskell standard. If I understand it correctly, this would 
let you declare the types as follows and still use them in an IO context.

   write :: MVector a -> Int -> a -> ST s a

Would this work? I know it would probably screw up all instances

   instance ... IO where

On the other hand, this is not only an issue with arrays but all 
procedurally accelerated data structures like hash-tables.
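
As a point of reference for the ST s vs IO question, base already exports stToIO :: ST RealWorld a -> IO a from Control.Monad.ST, which lets code written once against ST s run in an IO context. The sketch below uses only base and the array package; bump and demo are hypothetical names, not part of any library, and this is an illustration of the existing bridge rather than of the proposed type synonym.

```haskell
import Control.Monad.ST (ST, stToIO)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)

-- Hypothetical helper, written only against ST s:
-- increment the element at index i in place.
bump :: STUArray s Int Int -> Int -> ST s ()
bump arr i = do
  x <- readArray arr i
  writeArray arr i (x + 1)

-- The same helper used from IO: stToIO fixes s to RealWorld.
demo :: IO Int
demo = do
  arr <- stToIO (newArray (0, 2) 0)   -- indices 0..2, all 0
  stToIO (bump arr 1)
  stToIO (bump arr 1)
  stToIO (readArray arr 1)

main :: IO ()
main = demo >>= print   -- prints 2
```

The explicit stToIO calls are the price of keeping IO and ST as separate types; with the synonym suggested above they would presumably disappear.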

Silvio