[Haskell-cafe] Fwd: Efficient records with arbitrarily many fields [was: Extensible states]

Sat Jul 4 13:54:30 UTC 2015

(forgot to reply to cafe list
Hi.

You can take a look at vector-fixed-hetero [1],
that can act as anonymous structure with arbitrary number of fields,
convertion to and from

datatypes with the same structure and many more features. It's missing
field names though
and syntactic sugar that 'records' package have. If you like to
reinvent a wheel, then

you can use following appoach:

introduce some typelevel info

data a :? (b :: Symbol)

data FieldName (t :: Symbol) = FieldName

introduce generic structure for records with any number of fiels

newtype ARec (m :: [*]) = ARec { unRec :: Array Any }
type instance TProxy (ARec m) = m
type instance VProxy (ARec m) = ARec

rerec :: ARec a -> ARec b
rerec = ARec . unRec

indexRec :: (KnownNat (FieldIndex name m), FieldType name m ~ a)
         => proxy name -> ARec m -> a
indexRec name v = unsafeCoerce $
  indexArray (unRec v) (fromIntegral (natVal (indexProxy name v)))

updateRec :: (KnownNat (Length m), KnownNat (FieldIndex name m),
FieldType name m ~ a)
          => proxy name -> a -> ARec m -> ARec m
updateRec name b v = runST $ do
    out <- newArray len undefined
    copyArray out 0 (unRec v) 0 len
    writeArray out idx (unsafeCoerce b)
    ARec <$> unsafeFreezeArray out
  where
    idx = fromIntegral (natVal (indexProxy name v))
    len = fromIntegral (natVal (lengthProxy v))

you'll need some typelevel magic for that:
type family FieldType (n :: Symbol) m where
  FieldType n ((a :? n) ': z) = a
  FieldType n (b ': z)        = FieldType n z

type family FieldIndex (n::Symbol) m :: Nat where
  FieldIndex n ((a :? n) ': z) = 0
  FieldIndex n (   b     ': z) = 1 + FieldIndex n z

indexProxy :: (KnownNat c, FieldIndex n m ~ c) => proxy1 n -> proxy2 m
-> Proxy c
indexProxy _ _ = Proxy

type family Length m where
  Length '[] = 0
  Length (x ': xs) = 1 + Length xs

lengthProxy :: (KnownNat c, Length n ~ c) => proxy n -> Proxy c
lengthProxy _ = Proxy

then you can implement lenses:

instance (KnownNat (FieldIndex n m), KnownNat (Length m), FieldType n m ~ a)
         => HasField (n :: Symbol) (ARec m)  a where
  getField = indexRec
  updateField p = flip (updateRec p)

fieldLens' :: (HasField name z a, FieldType name (TProxy z) ~ a)
           => FieldName name -> Lens z z a a
fieldLens' name = \f m -> fmap (updateField name m)
                               (f $ getField name m)

type family UpdateType (n :: Symbol) z a b where
  UpdateType n ((a :? n) ': z ) a b = (b :? n) ': z
  UpdateType n (   z     ': zs) a b =    z     ': UpdateType n zs a b

fieldLens :: ( ARec m ~ z, ARec m' ~ z', m' ~ UpdateType name m a b
             , FieldType name m  ~ a, FieldType name m' ~ b
             , KnownNat (FieldIndex name m), KnownNat (FieldIndex name m')
             , KnownNat (Length m), KnownNat (Length m')
             ) => FieldName name -> Lens z z' a b
fieldLens name = \f m -> fmap (updateField name (rerec m))
                              (f $ getField name m)

this approach is more or less the same as records package with only one
datastructure and almost the same syntactic sugar can be applied, the

only missing thing is that pattern matching will be more difficult that with

records.

At this point it's not possible to write strict fields, but it can be
easily extended.

If someone is interested in this sort of wheel, I can prepare a package and some

docs about and with coercion with other solutions like
fixed-vector-hetero and records.

[1] https://hackage.haskell.org/package/fixed-vector-hetero

On Sat, Jul 4, 2015, 16:08 Ben Franksen <ben.franksen at online.de> wrote:
>
> Marcin Mrotek wrote:
> > Okay, perhaps I'm too newbie to understand the big picture, but it
> > seems to me you can get either:
> >
> > a) O(1) access to any, arbitrarily selected (at runtime) field
> > b) O(1) append
> >
> > I guess option a) is better performance-wise, as appending is usually
> > done less often than selecting (an O(1) slice is already possible with
> > independently typed regular Haskell records) but
> > dependently-typed-list-based implementation, or at the very least
> > Vinyl (I haven't ever used HList) has the advantage of being dead
> > simple in both implementation and usage. I mean, with Vinyl, you can
> > write manual recursion over Rec's like:
> >
> > foo :: Rec ... -> Rec ...
> > foo RNil = ...
> > foo (r :& rs) = ...
> >
> > whenever GHC's typechecker gives up and goes on a strike; and I dare
> > to say, with commonly used record sizes (have you ever used a record
> > with more than, let's say, 10 fields?) the speed tradeoff is not
> > noticeable.
>
> While more than 10 fields in a record is uncommon for typical library APIs
> and simple programs, real world projects can grow much larger records. One
> example is configuration data for complex programs (like Darcs or even GHC)
> with many options. It would be so nice if we could use record types for the
> configuration! Another application could in control system toolkits like
> EPICS [1], which currently has (actually: generates) C records with
> potentially hundreds of fields.
>
> If lookup is / remains linear we can never efficiently support these kinds
> of applications and that would be very sad.
>
> I think the most reasonable default is O(1) for lookup and O(n) for
> extension, like in Nikita Volkov's record package. It is quite unfortunate
> that this package limits the number of fields! If GHC would offer generic
> support for tuples of arbitrary size (with the same efficiency as today)
> this limitation could be avoided and all would be well.
>
> Cheers
> Ben
>
> [1] http://www.aps.anl.gov/epics/
> --
> "Make it so they have to reboot after every typo." ― Scott Adams
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

-- 
Alexander