[Haskell-cafe] A numpy equivalent for Haskell

yairchu at gmail.com yairchu at gmail.com
Mon Jan 18 09:48:29 EST 2010


Hi Cafe,

I've created a numpy equivalent for Haskell. (Numpy is a python
library for multi-dimensional arrays and operations on them)

Code at http://github.com/yairchu/numkell
(not yet on hackage because it needs better names)

A numkell array is a pair of a function from integer inputs and a
range for its inputs (size).
This allows for easy memoizing into in-memory arrays, and
additionally, numkell arrays also support useful operations like
numpy's newaxis and folding axes away.
As the "Array" name was already taken, numkell's array is currently
called "Funk" (name suggestions very appreciated).

An example:
Given an bunch of vectors as a 2d array, compute the distance between
each pair of vectors

{-# LANGUAGE DeriveDataTypeable, GeneralizedNewtypeDeriving,
TypeOperators #-}

import Data.HList
import Data.NumKell
import Data.Typeable

newtype PersonIdx = PersonIdx Int
  deriving (Enum, Eq, Integral, Num, Ord, Real, Show, Typeable)

newtype FeatureIdx = FeatureIdx Int
  deriving (Enum, Eq, Integral, Num, Ord, Real, Show, Typeable)

let personProps = fFromList [[5,3,2],[4,8,1],[2,6,9],[5,3,0]] :: Funk
(HJust PersonIdx :*: HJust FeatureIdx :*: HNil) Double

> personProps
          FeatureIdx   0   1   2
PersonIdx          +   -   -   -
        0          | 5.0 3.0 2.0
        1          | 4.0 8.0 1.0
        2          | 2.0 6.0 9.0
        3          | 5.0 3.0 0.0

> sumAxes (fmap (** 2) (liftF2 (-) (personProps !/ (SNewAxis .*. HNil)) (personProps !/ (SAll .*. SNewAxis .*. HNil)))) (TFalse .*. TFalse .*. TTrue .*. HNil)

          PersonIdx    0    1    2    3
PersonIdx         +    -    -    -    -
        0         |  0.0 27.0 67.0  4.0
        1         | 27.0  0.0 72.0 27.0
        2         | 67.0 72.0  0.0 99.0
        3         |  4.0 27.0 99.0  0.0

In Python the last line looks shorter:

>>> ((personProps[newaxis] - personProps[:,newAxis]) ** 2).sum(2)

Mostly due to Python's slicing syntax sugar.
Still, numkell has one large benefit over numpy (apart from being for
Haskell): With numpy this example creates a temporary 3d array in
memory. In numkell the array is not allocated in memory unless "fMemo"
is called.

If anyone has comments, suggestions, naming suggestions, complaints,
etc, I would very much like to hear.

cheers,
Yair


More information about the Haskell-Cafe mailing list