[Haskell-cafe] QuickCheck testing of AST transformers

Mon Apr 23 17:46:32 EDT 2007

My previous post did not receive any replies so I thought I might try  
generalizing the problem a bit...

Suppose I'm parsing a language into a syntax tree and then  
transforming that tree into another AST representing a "core  
language". The core language is a more general AST that should help  
with compiling to other languages.

My problem is how to best structure my AST transformations to be able  
to test them with QuickCheck. I suspect that I'm not going about it  
in the most optimal way so I thought I should ask for suggestions.

The transformation into the core AST applies operations to simplify,  
or desugar, the AST of the original language. Here's sample code in  
the source language which, incidentally, was recently highlighted at  
Lambda the Ultimate [1].

Array: MyArray[10](10 + 2);
Value1 = MyArray[5][10];

This declares an array of 10 elements and initializes each element to  
12. Value1 (a built-in variable) is then initialized to the value of  
element #5 as of 10 bars ago. A bar is, basically, a stock quote. The  
code is invoked on every bar and so "5 bars ago" can be treated as 5  
invocations ago.

The syntax tree of the above code is a 1-1 mapping. We declare an  
array of integers of 10 elements. Initialize it to the sum of two  
integers and then assign to Value1.

[ ArrayDecs [ VarDecl (VarIdent "MyArray") TyInt [Int 10]
                           (Op Plus (Int 10) (Int 2)) ]
, Assign (VarIdent "Value1") [] (Var (VarIdent "MyArray") [Int 5]
                                          (BarsBack (Int 10))) ]

The "desugared" version does away with the array declaration  
statement and declares MyArray to be a variable of array type. Arrays  
in the "core language" do not remember values from one invocation to  
another but there's a data series type, so we declare a series  
variable to hold the value of element #5.

We must manually store the value of the array element in the data  
series and can then refer to the value of the series 10 data points ago.

vars = [ ("MyArray", VarDecl (TyArray TyInt) [Int 10]
                        (Just (Plus (Int 10) (Int 2))))
        , ("series0", VarDecl (TySeries TyInt) [] Nothing)
        ]

code = [ AddToSeries (VarIdent "series0") (Var (VarIdent "MyArray")  
[Int 5])
        , Assign (Var (VarIdent "Value1") [])
                     (Series (VarIdent "series0") (Int 10))
        ]

The next step would be to take the above "core syntax tree" and  
transform it yet again into a C# (or other target language) AST. It's  
assumed that all target languages have a data series type.

The OCaml version of my code translated directly into the C# AST but  
I figured an intermediate syntax tree will help me translate into  
other languages such as Haskell, Erlang or OCaml.

The part I can't figure out is how to come up with a set of  
invariants for my transformations.

Should I, for example, state that every access to an array value in a  
previous invocation should introduce an extra variable to hold the  
series plus the appropriate assignment code?

Should I write the translator as a series of small transformers in  
the ST monad that can be threaded and tested separately?

	Thanks in advance, Joel

[1] http://lambda-the-ultimate.org/node/2201

--
http://wagerlabs.com/