[Haskell-cafe] library for set of heterogeneous values?

David Feuer david.feuer at gmail.com
Thu Aug 11 18:58:27 UTC 2016


I don't know if they're at all suitable for your purposes, but you
should look (at least for inspiration) at both vinyl and
dependent-map.

On Thu, Aug 11, 2016 at 12:37 AM, Anthony Clayden
<anthony_clayden at clear.net.nz> wrote:
> I have a (let's call it) database of heterogeneous records.
>
> They're not Haskell records, but anonymous/extensible type-labelled rows.
> (Could be tuples, could be HLists, could be Lens-like, could be something
> fancier.)
>
> There's a small number (dozens) of distinct row types, each with a large
> number (thousands) of rows. The variety of row-types is not predictable in
> advance. And indeed a row might 'morph' over time with fields added/removed.
>
> So the obvious answer of putting the lot into a giant HList (each element of
> the list being a row) isn't going to scale. I could have a type-indexed
> HList in which each element is a Set of homogeneous rows. But performance
> still suffers from scanning along the list to find the right type index.
>
> Is there something better? On Hackage there are two packages called HSet,
> neither giving very much help about their suitability:
>
> * `hset` (lower case) [AlekseyUymanov] seems isomorphic to a type-indexed
> HList.
>       ie Must be unique type in each element (could be a Set type, I guess)
>
> * `HSet` (upper case) [athanclark] "Faux heterogeneous sets" seems a lot
> meatier
>      why the "Faux"?
>      built over hashtables in the ST monad.
>
> Has anybody used these? Can anyone give guidance on what they can and can't do?
>
> Bonus questions:
>
> Given a filter specifying a restriction on (some) fields of rows, I want to
> get a heterogeneous subset:
> * all rows with at least those fields, matching those restrictions.
> * the restriction might be merely "has field labelled L".
>
> Given a candidate row for insertion, I want first to scan for
> quasi-duplicates:
> * any existing row with a subset of the given fields, and the same value at
> those fields.
> * any existing row with a superset of the given fields, and the same value
> at those fields in common.
> * ignore records with only a partial overlap of fields.
>
> One possible data structure: a "vertical store".
> Give each row a Globally Unique Id.
> Have a separate set for each possible field,
> where the set elements are field value (key) to set of GUId -- records with
> that value.
>
> Then I have a different bonus question:
> * how to retrieve all field values for a given GUId?
>
>
> Thanks
> AntC
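
The "vertical store" described in the quoted message can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses plain Data.Map and Data.Set from containers (not hset, HSet, vinyl or dependent-map), it flattens field values into a single made-up sum type Value instead of keeping them properly heterogeneous, and every name in it (RowId, Label, Value, Row, Store, insertRow, fieldsOf, hasField, matching, quasiDuplicate) is hypothetical.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Map.Strict (Map)
import Data.Set (Set)

-- All names below are hypothetical, chosen for this sketch only.
type RowId = Int     -- stands in for the Globally Unique Id
type Label = String  -- field label

-- Assumption: field values are squashed into one sum type.
data Value = VInt Int | VText String
  deriving (Eq, Ord, Show)

-- A row viewed as a finite map from labels to values.
type Row = Map Label Value

-- One index per field label: field value -> ids of rows with that value.
type Store = Map Label (Map Value (Set RowId))

-- Register every field of a row under the given id.
insertRow :: RowId -> Row -> Store -> Store
insertRow rid row store = Map.foldrWithKey addField store row
  where
    addField l v =
      Map.insertWith (Map.unionWith Set.union) l
        (Map.singleton v (Set.singleton rid))

-- All field values for a given id. This scans every field index,
-- which is the weak spot of a purely vertical layout.
fieldsOf :: RowId -> Store -> Row
fieldsOf rid = Map.mapMaybe findValue
  where
    findValue idx =
      case [v | (v, ids) <- Map.toList idx, Set.member rid ids] of
        (v : _) -> Just v
        []      -> Nothing

-- Ids of rows that merely have a field labelled l.
hasField :: Label -> Store -> Set RowId
hasField l store = Set.unions (Map.elems (Map.findWithDefault Map.empty l store))

-- Ids of rows having at least the given fields with the given values.
-- An empty restriction matches every row in the store.
matching :: Row -> Store -> Set RowId
matching restriction store =
  case [ Map.findWithDefault Set.empty v (Map.findWithDefault Map.empty l store)
       | (l, v) <- Map.toList restriction ] of
    []       -> Set.unions (concatMap Map.elems (Map.elems store))
    (s : ss) -> foldr Set.intersection s ss

-- Quasi-duplicate test between a candidate and an existing row:
-- one row's fields are a subset of the other's, with equal values there.
-- Rows with only a partial overlap of fields are not reported.
quasiDuplicate :: Row -> Row -> Bool
quasiDuplicate candidate existing =
  existing `Map.isSubmapOf` candidate || candidate `Map.isSubmapOf` existing
```

In this sketch, retrieving all field values for a given id means scanning every field index. Keeping a second map from RowId to Row next to the Store would presumably avoid that scan, at the cost of storing each row twice.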
