Records in Haskell

Fri Sep 16 03:34:34 CEST 2011

> It ''also'' demonstrated, to me, that qualified imports are horrible
> when used on a large scale. It happened all the time, that'd I'd
> import, say, 10 different data types all qualified.  Typing map
> (Foo.id . BarMu.thisField) and foo Bar.Zot{x=1,y=2} becomes tedious
> and distracting, especially having to add every type module when I
> want to use a type. And when records use other types in other modules,
> you have ''a lot'' of redundancy. With the prefixing paradigm I'd write
> fooId and barMuThisField, which is about as tedious but there is at
> least less . confusion and no need to make a load of modules and
> import lines. Perhaps local modules would solve half of this
> problem. Still have to write “Bar.mu bar” rather than “mu bar”, but
> it'd be an improvement.

I disagree about qualified imports.  In my experience they're even more
useful as the scale increases, because there are many more modules
each symbol could come from, many more name conflicts, and of course
module names themselves start to conflict.  I don't find it much
of an imposition because I have an automatic tool to add and remove
imports when needed, and once the import is already there keyword
completion works on it.  Grepping for '^data .. =', I have 199 types,
of which 97 are probably records, spread over 296 modules.  The 97
records are spread over 50 modules, but in practice there are a number
with 5-10 and then a long tail with just 1 or 2 each.

However I entirely agree a better syntax for records would make
programming clearer, more concise, and more fun.  I also agree that
one module per record is annoying, so I wind up with many records per
module, so I have record prefixes anyway.  Access is annoying, but not
so bad, I think.  It's true '(ModuleB.recField2 . ModuleA.recField1)
val' is inferior to val.field1.field2, but at least the functions
compose and if you make a habit of reducing the dots by defining a few
precomposed functions you can cut out a lot of the work of moving a
field, or more likely, grouping several fields into a nested record.
But the really annoying thing is that modification doesn't compose.
Record updates also don't mix well with other update functions.  So,
just to give another data point, here's an extreme case:

set_track_width view_id tracknum width = do
    views <- gets state_views
    view <- maybe (throw "...") return $ Map.lookup view_id views
    track_views <- modify_at "set_track_width"
        (Block.view_tracks view) tracknum $ \tview ->
            tview { Block.track_view_width = width }
    let view' = view { Block.view_tracks = track_views }
    modify $ \st ->
        st { state_views = Map.insert view_id view' (state_views st) }

I inlined the 'modify_view' function, so this looks even nastier than
the real code, but it would be nice to not have to define those helper
functions!  In an imperative language, this would look something like

    state.views[view_id].tracks[tracknum].width = width

Of course, there's also monadic vs. non-monadic getting its claws in
there (and the lack of stack traces, note the explicit passing of the
function name).  So if I hypothesize a .x syntax that is a
modification function, some handwavy two arg forward composition, and
remove the monadic part (could get it back in with some fancy
combinator footwork), theoretically Haskell could get pretty close to
the imperative version:

    let Map.modify k m f = Data.Map.adjust f k m
    (.views .> Map.modify view_id .> .tracks .> modify_at tracknum
        .> .width) state width
    -- or, let set = (...) in state `set` width

It's also nice if the field names can be in their own namespace, since
it's common to say 'x.y = y'.  I'm not saying I think the above would
be a good record syntax, just that I think composing with other modify
functions is important.  Passing to State.modify is another obvious
thing to want to do.
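
As a rough sketch of what composing modify functions can look like with
plain functions today: the record types and every over_* name below are
hypothetical stand-ins, not the real Block code, and unlike the original
this version silently does nothing when the view or track is missing
instead of throwing.

```haskell
import qualified Data.Map as Map

-- Hypothetical stand-ins for the records in set_track_width above.
data TrackView = TrackView { track_width :: Int } deriving (Eq, Show)
data View = View { view_tracks :: [TrackView] } deriving (Eq, Show)
data UiState = UiState { state_views :: Map.Map String View }
    deriving (Eq, Show)

-- A modifier lifts a change to a part into a change to the whole.
type Modifier s a = (a -> a) -> s -> s

over_views :: Modifier UiState (Map.Map String View)
over_views f st = st { state_views = f (state_views st) }

over_tracks :: Modifier View [TrackView]
over_tracks f v = v { view_tracks = f (view_tracks v) }

over_width :: Modifier TrackView Int
over_width f t = t { track_width = f (track_width t) }

-- Modify the element at an index; an out-of-range index leaves the
-- list unchanged.
over_index :: Int -> Modifier [a] a
over_index i f xs = [if n == i then f x else x | (n, x) <- zip [0 ..] xs]

-- Data.Map.adjust already has the modifier shape once the key is given.
over_key :: Ord k => k -> Modifier (Map.Map k v) v
over_key = flip Map.adjust

-- Modifiers compose with ordinary (.), outermost first.
set_track_width :: String -> Int -> Int -> UiState -> UiState
set_track_width view_id tracknum width =
    (over_views . over_key view_id . over_tracks . over_index tracknum
        . over_width) (const width)
```

Since the result is just a function from state to state, it can be
passed straight to State.modify.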

The above is an extreme case, but I've lost count of the number of
times I've typed 'modify $ \st -> st { field = f (field st) }'.  I
don't even have to think about it anymore.

I know that these are the problems that lenses / first class labels
libraries aim to solve, so if we're talking only about non-extensible
records, I'm curious about what a language extension could provide
that a library can't.  Simply not being standard and built-in is a big
one, but presumably the Haskell Platform is an answer to that.
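
For concreteness, here is a minimal hand-rolled sketch of the
first-class label idea, a getter paired with a setter.  It is not the
API of any particular library; Lens, get_l, set_l, modify_l, the (.>)
operator, and the Outer/Inner records are all names made up for
illustration.

```haskell
-- A first-class label: how to read a part and how to replace it.
data Lens s a = Lens
    { get_l :: s -> a
    , set_l :: a -> s -> s
    }

-- Apply a function to the focused part.
modify_l :: Lens s a -> (a -> a) -> s -> s
modify_l l f s = set_l l (f (get_l l s)) s

-- Compose an outer label with an inner one, reading left to right.
infixr 9 .>
(.>) :: Lens a b -> Lens b c -> Lens a c
outer .> inner = Lens
    { get_l = get_l inner . get_l outer
    , set_l = \c a -> modify_l outer (set_l inner c) a
    }

-- Hypothetical nested records to exercise the labels.
data Inner = Inner { inner_width :: Int } deriving (Eq, Show)
data Outer = Outer { outer_inner :: Inner, outer_name :: String }
    deriving (Eq, Show)

-- One label per field; this is the boilerplate a library or a language
-- extension would have to generate.
inner_l :: Lens Outer Inner
inner_l = Lens outer_inner (\i o -> o { outer_inner = i })

width_l :: Lens Inner Int
width_l = Lens inner_width (\w i -> i { inner_width = w })

-- Access and update both compose through the same value.
outer_width :: Lens Outer Int
outer_width = inner_l .> width_l
```

With these definitions, 'modify_l outer_width (+1)' is an ordinary
function on Outer.  Note that modify_l does a get followed by a set at
each level, so whether it costs more than a hand-written record update
depends on how well the compiler inlines it.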

It would be nice if record update and access came with no
performance penalty over the existing system, since updating records
forms a large part of the runtime of some programs.  I'm not sure
whether the libraries can provide that.