[Haskell-cafe] Python?

Wed May 18 07:06:16 EDT 2005

Hi all,

Finally I found some time to reply to this posting. A couple of years ago we
did something called "Data Field Haskell", which is Haskell extended with a
generalized form of arrays called data fields. Much of the purpose was to
investigate convenient and general syntax for array constructions.

Data fields are semantically pairs of functions and bounds, where bounds
represent sets of indices. Bounds can be pairs of integers representing
intervals, but also general finite sets. Certain bounds represent infinite
sets, which makes sense in a lazy language.

Besides direct notation for creating data fields, Data Field Haskell has a
construct "forall x -> t" which defines a data field in a similar way to how
lambda-abstraction \x -> t defines a function. In addition, the data field
defined by a forall-construct has a bound which is derived from the body of
the expression. The semantics is derived from the view that data fields (and
arrays) are partial functions whose domains do not exceed the index sets
defined by their bounds.

The forall-construct is intended to provide a simple and powerful syntax for
operations on data fields. Let's see below how it matches Jerzy's desired
features of an array language:

Jerzy:
>Tim Rowe writes: 
>
>> On 5/11/05, Jerzy Karczmarczuk <karczma at info.unicaen.fr> wrote: 
>> 
>>> Give me one single language where [3-d arrays are] natural and immediate.
>> 
>> I don't know how Matlab does it, but I find the C++ standard library
>> vector<vector<vector<float> > >
>> entirely intuitive (apart, perhaps, for the need for those two spaces)!
>
>Let's precise what I consider to be important in order to call it natural
>and immediate. And, in general, useful. 
>
>1. The definition of a concrete object, not just its type, but, say,
>  the initialization with constants. Or/and, global initialization with
>  zeros. 

Since Data Field Haskell is declarative, initialization and definition is
the same. Constant data fields are easy to define, viz.

forall x -> 1, a data field filled with ones
forall (i,j) -> if i == j then 1 else 0, a unit matrix
forall (i,j) -> i + j, a Vandermonde matrix

All these data fields are infinite, and the first is also
dimension-polymorphic (through the usual polymorphism in Haskell). Data
Field Haskell has an operator for explicit restriction of data fields, and
a data field may also be implicitly restricted from the context where it
occurs.

>2. Easy synthesis of multi-dim matrices out of "planes", of submatrices
>  of lesser dimensions;

Not directly expressible, but it is possible to define a binary operator to,
say, concatenate a 2-D matrix on some given "side" of a 3-D matrix.

>  it can be an 'overlay', like, say, making  a colour image out of three
>  R/G/B planes, or making a 3D surfaces, or aking tensors through
>  external products. 

Say, an outer product of two vectors a, b: forall (i,j) -> a!i * b!j

>3. Easy indexing, and not only A[i][j][k], etc., but slicing, the extraction
>  of sub-dimensional matrices, e.g., one column vector out of a 2D matrix
>  in Matlab:  A(3,:). Also, extracting parts (e.g. sub-images). 

A(3,:) is expressed as forall j -> a!(3,j). Or, selecting a plane out of a
3D-matrix: forall (i,j) -> a!(i,3,j). Or, selecting the main diagonal of a
matrix: forall i -> a!(i,i).

Extracting parts is done with the explicit restriction operator <\>:

a <\> (3,3)<:>(10,20), selecting submatrix of a
a <\> predicate (\(i,j) -> a!(i,j) > 0), selecting the subfield of a where
it is strictly greater than zero (this is a sparse data field)

>  Also, in mathematical context, "intelligent" indexing, e.g. treating
>  a matrix as implicitly anti-symmetric. Here the CAS systems as Maple
>  or Mathematica provide the adequate tools. C++ of course doesn't,
>  unless you overload [] yourself. 

All representation issues are hidden in Data Field Haskell, so there is no
direct counterpart. However, it is of course possible to define a data
structure of your own and then wrap it up as a data field by providing a
bound and a function from indices to values.

>4. Readable iterators, perhaps something more compact than insipid do-loops. 

There are folds for data fields. These can be used for reduction operations
over data fields. In a sense, forall-abstraction is also a kind of iterator
(although "parallel" since it does not impose any order of evaluation
between the elements of the data field).

>5. If those matrices are used as mathematical objects: tensors, etc.,
>  I want to have some simple notation for inner multiplications/
>  contractions, etc. This is not just the syntax problem, but the
>  existence of good libraries as well... 

You can use folds "inside" a matrix to, e.g., sum all rows in a matrix. Here
is an example of a function that does one iteration in Jacobi's method for
iteratively solving the linear equation system ax = b:

jacobi_iter a b x = 
  forall i -> ((b!i) - dfSum ((forall j -> a!(i,j)*(x!j))
                              <\> predicate (\j -> j /= i)))/a!(i,i)

>6. Reshaping of those arrays. I thought that Matlab 'reshape' (or something
>  similar in Numeric Python) is a baroque, rarely used construction. Now
>  I use it quite often... 

A general reshape function, that takes a data field a (to be reshaped), a
function f mapping new indices to old indices, and a bound b as argument:

reshape a f b = forall x -> a!(f x) <\> b

>So, plenty of things. That's why this is not so trivial... 
>
>Jerzy Karczmarczuk 

Björn Lisper