Persistent (as in on disk) data

Alastair Reid alastair@reid-consulting-uk.ltd.uk
Tue, 11 Mar 2003 11:10:05 +0000


> It describes some attractive capabilities with respect to
> persistence, though the supporting mechanism appears rather
> complicated.  Does anyone with a deeper understanding of these
> issues have an opinion as to whether similar capabilities could (and
> should) be provided for Haskell?

Persistence is, indeed, a very attractive language feature.  To see
how useful it is (and how to solve some of the issues it raises),
look at languages like PS-Algol and Pisa.

Some of the main issues are:

- Typechecking: if I write a value with type 'data T = L | R' (say)
  out to disk and then read it back in again, what can I do with the
  value once I have read it?

  If the program that reads it in contains a type declaration 'data
  T = L | R', can I treat the value I read in as having that type?

  If the program that reads it in contains a type declaration 'data
  T = L | M | R', can I treat the value I read in as having that type?

  If the data written and then read back in involves standard library
  types like [Int] and (Int,Float), can I treat it as having those
  same types when I read it back in?

  The obvious solution to these issues is to access the types only
  through an abstract interface: don't use data constructors, don't
  use pattern matching.  This is (part of) how Pisa solves the
  problem.  But this makes your persistent data a second-class citizen.

- Higher order functions and lazy evaluation: You need to write code
  out to disk because the data may contain unevaluated thunks and
  higher order functions.  

  This is a bit tricky to add to existing compilers but quite feasible
  (e.g., I think someone at St Andrews did this with STG-Hugs at some
  point).

  More trickily, if a thunk refers to a standard library function, do
  you have to write it out too?  (The obvious answer is yes.)
  What if the thunk refers to a C function accessed through the foreign
  function interface: do you have to write it out too?  (The obvious
  answer is yes, but it's not clear how you'd implement this.)

- Coping with changes in on-disk representation as the compiler evolves.

- Coping with portability problems between machines with different
  sizes of Float, Int, etc.
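
To make the typechecking issue concrete, here is a minimal sketch using
only the Prelude.  It is an illustration, not how PS-Algol or Pisa do
it.  All names (writeNaive, readTagged, constructorsOf, ...) are
hypothetical, and T' stands in for the 'data T = L | M | R' that a
later program might declare; the primes are there only because both
declarations have to live in one module.

```haskell
-- The writer's view of the type.
data T = L | R
  deriving (Show, Eq, Enum, Bounded)

-- A later reader's view: a constructor has been added in the middle.
-- (Primed names are an artifact of keeping both types in one module.)
data T' = L' | M' | R'
  deriving (Show, Eq, Enum, Bounded)

-- Look up a constructor of the expected type by its position.
fromIndex :: (Enum a, Bounded a) => Int -> Maybe a
fromIndex i = lookup i (zip [0 ..] [minBound .. maxBound])

-- Naive format: store only the constructor's position.
writeNaive :: Enum a => a -> String
writeNaive = show . fromEnum

readNaive :: (Enum a, Bounded a) => String -> Maybe a
readNaive s = case reads s of
  [(i, "")] -> fromIndex i
  _         -> Nothing

-- A crude description of a type: its constructor names in declaration
-- order, with the primes stripped (see the note on T' above).
-- The argument is used only to pick the type.
constructorsOf :: (Show a, Enum a, Bounded a) => a -> [String]
constructorsOf witness =
  map (filter (/= '\'') . show) [minBound .. maxBound `asTypeOf` witness]

-- Tagged format: store the type description next to the position.
writeTagged :: (Show a, Enum a, Bounded a) => a -> String
writeTagged x = show (constructorsOf x, fromEnum x)

-- Refuse the value unless the stored description matches the type the
-- reading program expects.
readTagged :: (Show a, Enum a, Bounded a) => String -> Either String a
readTagged s = case reads s of
  [((names, i), "")] ->
    case fromIndex i of
      Nothing -> Left "constructor index out of range"
      Just v
        | names == constructorsOf v -> Right v
        | otherwise -> Left ("type mismatch: written as " ++ show names)
  _ -> Left "unparsable input"
```

With the naive format, 'readNaive (writeNaive R) :: Maybe T'' yields
'Just M'': the value is silently reinterpreted.  With the tagged
format, reading it back at type T succeeds and reading it at type T'
is rejected.  This only compares constructor names, so it says nothing
about types with fields, and nothing about thunks or functions.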

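For the portability point, one common approach (an assumption here,
not something the languages above are known to do) is to fix the
on-disk width and byte order instead of using whatever the machine
provides.  The sketch below always writes an Int as 8 big-endian
bytes; encodeInt and decodeInt are hypothetical names.  Float would
need a similar fixed representation and is not covered.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Int (Int64)
import Data.Word (Word64, Word8)

-- Encode an Int as exactly 8 bytes, most significant byte first,
-- regardless of the native width of Int on the writing machine.
encodeInt :: Int -> [Word8]
encodeInt n = [fromIntegral (w `shiftR` s) | s <- [56, 48 .. 0]]
  where
    -- Widen to 64 bits (sign-extending), then view as unsigned so
    -- that the shifts are logical rather than arithmetic.
    w = fromIntegral (fromIntegral n :: Int64) :: Word64

-- Decode 8 big-endian bytes.  Returns Nothing if the input has the
-- wrong length or if the value does not fit in the reading machine's
-- Int (which the Haskell report only guarantees to be at least 30 bits).
decodeInt :: [Word8] -> Maybe Int
decodeInt bytes
  | length bytes /= 8 = Nothing
  | fits              = Just (fromIntegral v)
  | otherwise         = Nothing
  where
    w = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0 bytes :: Word64
    v = fromIntegral w :: Int64
    fits = toInteger v >= toInteger (minBound :: Int)
        && toInteger v <= toInteger (maxBound :: Int)
```

For example, 'encodeInt 258' gives [0,0,0,0,0,0,1,2] on any machine,
and 'decodeInt (encodeInt n)' gives 'Just n' back.  The range check in
decodeInt is where a reader with a narrower Int would notice that the
stored value cannot be represented.
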
--
Alastair Reid                 alastair@reid-consulting-uk.ltd.uk  
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/