[Haskell-cafe] Call for discussion: OverloadedLists extension

Michael Snoyman michael at snoyman.com
Mon Sep 24 15:33:20 CEST 2012


On Mon, Sep 24, 2012 at 2:53 PM, George Giorgidze <giorgidze at gmail.com> wrote:
> Hi Michael,
>
> Here at the University of Tübingen, I am co-supervising (together with
> Jeroen Weijers) a student project implementing the OverloadedLists
> extension for GHC. Achim Krause is the student who is working on the
> project. We took into consideration earlier discussions on this topic
> [1,2] before embarking on the project.
>
> Achim has worked on two approaches.
>
> The first approach is very simple, both from the user's and the
> extension implementor's perspective (it follows the implementation of
> OverloadedStrings closely) and typechecks and desugars lists like
>
> [] ; [x,y,z] ;  ['a' .. 'z'] ;
>
> as
>
> fromList [] ;  fromList [x,y,z] ; fromList ['a' .. 'z'] ;
>
> where fromList is whatever is in scope with that name. That said, we
> do provide the FromList type class that can be used to overload
> fromList. In the following I give the definition of the class, as well
> as, example instances:
>
> class FromList l where
>   type Item l
>   fromList :: [Item l] -> l
>
> instance FromList [a] where
>   type Item [a] = a
>   fromList = id
>
> instance (Ord a) => FromList (Set a) where
>   type Item (Set a) = a
>   fromList = Set.fromList
>
> instance (Ord k) => FromList (Map k v) where
>   type Item (Map k v) = (k,v)
>   fromList = Map.fromList
>
> instance FromList (IntMap v) where
>   type Item (IntMap v) = (Int,v)
>   fromList = IntMap.fromList
>
> instance FromList Text where
>   type Item Text = Char
>   fromList = Text.pack
>
> This approach has already been implemented by Achim as patch against GHC head.
>
> This approach is very simple, but can be inefficient as it may result
> into unnecessary construction of lists at runtime. This can be a
> serious issue when constructing large structures from arithmetic
> sequences (e.g., from the [ .. ] notation) or when using non-literal
> expressions (e.g., variables) inside the square brackets.
>
> Our second approach to OverloadedLists is to avoid the construction of
> lists altogether. By typechecking and desugaring lists like
>
> [] ; [x,y,z] ;  ['a' .. 'z'] ;
>
> as
>
> mempty ; singleton x `mappend` singleton y `mappend` singleton z ;
> genericEnumFromTo 'a' 'z' ;
>
> We  provide the Singleton and GenericEnum type classes for overloading
> singleton and genericEnum(..) functions. In the following, I give the
> definitions of the classes, as well as, example instances:
>
> -- Singleton class
>
> class Singleton l where
>   type SingletonItem l
>   singleton :: SingletonItem l -> l
>
> -- Singleton instances
>
> instance Singleton [a] where
>   type SingletonItem [a] = a
>   singleton a = [a]
>
> instance (Ord a) => Singleton (Set a) where
>   type SingletonItem (Set a) = a
>   singleton = Set.singleton
>
> instance (Ord k) => Singleton (Map k v) where
>   type SingletonItem (Map k v) = (k,v)
>   singleton (k,v) = Map.singleton k v
>
> instance Singleton (IntMap v) where
>   type SingletonItem (IntMap v) = (Int,v)
>   singleton (k,v) = IntMap.singleton k v
>
> instance Singleton Text where
>   type SingletonItem Text = Char
>   singleton = Text.singleton
>
> -- GenericEnum class
>
> class GenericEnum l where
>   type EnumItem l
>   genericEnumFrom        :: EnumItem l -> l
>   genericEnumFromThen    :: EnumItem l -> EnumItem l -> l
>   genericEnumFromTo      :: EnumItem l -> EnumItem l -> l
>   genericEnumFromThenTo  :: EnumItem l -> EnumItem l -> EnumItem l -> l
>
> -- GenericEnum instances
>
> instance (Enum a) => GenericEnum [a] where
>   type EnumItem [a] = a
>   genericEnumFrom        = enumFrom
>   genericEnumFromThen    = enumFromThen
>   genericEnumFromTo      = enumFromTo
>   genericEnumFromThenTo  = enumFromThenTo
>
> instance (Ord a,Enum a) => GenericEnum (Set a) where
>   type EnumItem (Set a) = a
>   genericEnumFrom       a     = Set.fromList (enumFrom a)
>   genericEnumFromThen   a b   = Set.fromList (enumFromThen a b)
>   genericEnumFromTo     a b   = Set.fromList (enumFromTo a b)
>   genericEnumFromThenTo a b c = Set.fromList (enumFromThenTo a b c)
>
> instance (Ord k,Enum (k,v)) => GenericEnum (Map k v) where
>   type EnumItem (Map k v) = (k,v)
>   genericEnumFrom       a     = Map.fromList (enumFrom a)
>   genericEnumFromThen   a b   = Map.fromList (enumFromThen a b)
>   genericEnumFromTo     a b   = Map.fromList (enumFromTo a b)
>   genericEnumFromThenTo a b c = Map.fromList (enumFromThenTo a b c)
>
> instance (Enum (Int,v)) => GenericEnum (IntMap v) where
>   type EnumItem (IntMap v) = (Int,v)
>   genericEnumFrom       a     = IntMap.fromList (enumFrom a)
>   genericEnumFromThen   a b   = IntMap.fromList (enumFromThen a b)
>   genericEnumFromTo     a b   = IntMap.fromList (enumFromTo a b)
>   genericEnumFromThenTo a b c = IntMap.fromList (enumFromThenTo a b c)
>
> instance GenericEnum Text where
>   type EnumItem Text = Char
>   genericEnumFrom       a     = Text.pack (enumFrom a)
>   genericEnumFromThen   a b   = Text.pack (enumFromThen a b)
>   genericEnumFromTo     a b   = Text.pack (enumFromTo a b)
>   genericEnumFromThenTo a b c = Text.pack (enumFromThenTo a b c)
>
> Note that the GenericEnum instances can be implemented more
> efficiently, but for now I give simple definitions that go through
> lists.
>
> Our second approach avoids the construction of intermediate lists at
> runtime and directly constructs the target data structure for which
> the list notation is used.
>
> We will release GHC patches for both approaches, meanwhile the
> feedback from the community on the approaches that we took would be
> very much appreciated. Which one those would you prefer? or would you
> suggest a different one.
>
> Note that we intend to make fromList in the first approach and
> singleton, genericEnum(..), mempty and mapped rebindable. This means
> that the definitions of the type classes that overload this functions
> can be easily changed. Having said that, altering the changes that
> Achim already made to the GHC source code (including typechecking and
> desugaring rules) will be more work and we hope that one of the
> approaches that we took will be acceptable for the community.
>
> Cheers, George
>
> [1] http://www.mail-archive.com/glasgow-haskell-users@haskell.org/msg20447.html
> [2] http://www.mail-archive.com/glasgow-haskell-users@haskell.org/msg20518.html

Hi George,

It's very exciting to hear that this work has already been starting,
thank you for letting me know. Your first approach is more inline with
my initial proposal, though that doesn't mean I necessarily prefer it.

I'm certainly interested having a more efficient implementation
available, but I wonder if this second approach isn't a premature
optimization. GHC rewrite rules provide a lot of power in this
department: both ByteString and Text are able to avoid the
intermediate String and build their representations from the buffers
GHC creates. In my own work in conduit, I was able (with some help
from Joachim Breitner[1]) to remove the intermediate list structure,
and I'd be surprised if it's not possible to do the same thing with
calls to `fromList`.

That said, I haven't actually tried implementing any of this in real
code, and it could be that there are serious performance advantages to
the second approach. I'm quite happy with either implementation making
it into GHC.

Michael

[1] http://www.haskell.org/pipermail/haskell-cafe/2012-April/100793.html



More information about the Haskell-Cafe mailing list