Proposal: Move primitive-Data.Primitive.Addr API into base

Thu Nov 1 06:37:35 UTC 2018

On Wed, Oct 31, 2018 at 9:08 PM Anthony Cowley <acowley at seas.upenn.edu>
wrote:

>
> Edward Kmett writes:
>
> > One messy sketch of how this could proceed would be to make additional
> > Addr-based analogues of the report specified type-unrelated "Ptr a"
> > functions and install them side by side with the existing functions. This
> > would behave much like how we have both readsPrec and readPrec for standard
> > vs. GHC specific Read internals. With MINIMAL pragmas it should basically
> > become transparent if they are mutually defined. Since generally Storable
> > dictionaries aren't built out of compositions of other Storable
> > dictionaries, growing the class shouldn't do any measurable harm to
> > performance.
>
> This doesn't sound right to me. I'm probably an outlier, but I rely
> heavily on composite Storable instances in vinyl and Frames and general FFI
> situations. I don't know if a more common usage like an ad hoc record
> relying on the Storable instances of its fields would be greatly impacted,
> but this needs some investigation.

Fair point. It does feel at least worth benchmarking if we do decide to go
down this path.

> More generally, I'm in agreement with the points Sven has been making in
> this discussion, and wouldn't like to see such an unresolved debate be
> resolved by imposing a performance penalty on everyone as a compromise.
>

That said, adding copies of ~2 definitions to a class that already has 8
members seems unlikely to move the needle on this to me. If, say, 1-3% is a
cost that could never be borne, then we'd never be able to get around to
fixing things like sizeOf or alignment taking actively harmful
arguments, rather than just type arguments or proxies. So we might need to
consider what is an acceptable trade-off between leaning into the
current local optimum and allowing for growth. Fixing those warts some day
in an eventually standardizable way would incur exactly the same amount of
overhead.
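
To illustrate the wart: sizeOf and alignment take a value of the type purely
to select the instance, and the argument is not meant to be evaluated, so
callers commonly pass undefined. A minimal sketch of proxy-taking wrappers
follows. The names sizeOfProxy, alignmentProxy and word32Size are
hypothetical and are not part of base.

```haskell
import Data.Proxy (Proxy (..))
import Data.Word (Word32)
import Foreign.Storable (Storable (..))

-- Hypothetical wrapper: picks the Storable instance from a proxy
-- instead of from a dummy value supplied by the caller.
sizeOfProxy :: Storable a => proxy a -> Int
sizeOfProxy p = sizeOf (witness p)
  where
    -- Never evaluated; it only carries the type to sizeOf.
    witness :: proxy a -> a
    witness _ = undefined

-- Hypothetical wrapper, same idea for alignment.
alignmentProxy :: Storable a => proxy a -> Int
alignmentProxy p = alignment (witness p)
  where
    witness :: proxy a -> a
    witness _ = undefined

-- Usage: no undefined at the call site.
word32Size :: Int
word32Size = sizeOfProxy (Proxy :: Proxy Word32)
```

The undefined is still there, but it is confined to one helper rather than
scattered over every call site.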

If, on the other hand, the cost winds up being appreciably higher than that,
then by all means it's worth considering other approaches, like leaving the
methods in the class alone and adding the Addr versions as top-level
functions, if we decide we do want to move Addr into base. I like that option
somewhat less, as it leaves no real sane roadmap for getting
from there to a simpler story in the future, even if we don't choose to
start down the road to standardizing that behavior prematurely today.

> > We haven't been shy about adding new members to
> > report-specified classes like Bits. This doesn't strike me as much
> > different.
>
> It is similar, but I think composite Storable instances are much more
> common than composite Bits instances.
>

Fair, I wound up using a similar scheme for OpenGL Uniform and SSBO
serialization.

It is kind of a mess, though: you need to track not just the max
alignment, but also run a pass computing the alignment of each term
in order to correctly pack and compute sizeOf. Otherwise, if you stuck a
pair of Word16s after a Word32, you're stuck with Word32 alignment all the
way through, and the result doesn't resemble even the worst struct
packing schemes of the C world.
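
The per-field pass described above can be sketched roughly as follows,
assuming the usual C-style rule (pad each field up to its own alignment,
then round the total size up to the largest alignment). This is not the
OpenGL std140/std430 rule set, and alignUp and layout are hypothetical
names chosen for illustration.

```haskell
import Data.List (foldl')

-- | Round n up to the next multiple of a (assumes a > 0).
alignUp :: Int -> Int -> Int
alignUp a n = ((n + a - 1) `div` a) * a

-- | Given (size, alignment) for each field in order, compute
--   (field offsets, total size, alignment of the whole record).
layout :: [(Int, Int)] -> ([Int], Int, Int)
layout fields = (reverse offs, alignUp maxAl end, maxAl)
  where
    (offs, end, maxAl) = foldl' step ([], 0, 1) fields
    -- Pad the running offset to this field's own alignment, place the
    -- field there, and remember the largest alignment seen so far.
    step (os, cur, al) (sz, a) =
      let off = alignUp a cur
      in (off : os, off + sz, max al a)

-- A Word32 followed by two Word16s: offsets 0, 4 and 6, total size 8.
example :: ([Int], Int, Int)
example = layout [(4, 4), (2, 2), (2, 2)]
```

Padding every field to the max alignment instead would put the two Word16s
at offsets 4 and 8 and grow the record to 12 bytes.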

It strikes me that relative to _that_ setup or evaluation overhead, the
overhead of gluing an extra couple of methods for {peek|poke}ByteOffAddr
onto a class that already has 8 fields in it is unlikely to be appreciable.
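
For concreteness, here is a toy sketch of the "mutually defined with a
MINIMAL pragma" idea. Everything in it is an assumption made for
illustration: Marshal is a stand-in class rather than Storable itself,
peekOffPtr and peekOffAddr are hypothetical method names, and Addr is
modelled as a newtype over Ptr () instead of the real type from primitive.

```haskell
import Data.Word (Word32, Word8)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr, castPtr)
import qualified Foreign.Storable as S

-- Stand-in for primitive's Addr (assumption: the real type differs).
newtype Addr = Addr (Ptr ())

-- Toy class: each method defaults to the other, so an instance may
-- define either one and get the other for free.
class Marshal a where
  peekOffPtr :: Ptr b -> Int -> IO a
  peekOffPtr p = peekOffAddr (Addr (castPtr p))

  peekOffAddr :: Addr -> Int -> IO a
  peekOffAddr (Addr p) = peekOffPtr p

  {-# MINIMAL peekOffPtr | peekOffAddr #-}

-- "Legacy" style instance: only the Ptr-based method is written.
instance Marshal Word8 where
  peekOffPtr = S.peekByteOff

-- "New" style instance: only the Addr-based method is written.
instance Marshal Word32 where
  peekOffAddr (Addr p) = S.peekByteOff p

-- Each value is read through the method its instance did NOT define.
demo :: IO (Word8, Word32)
demo =
  with (7 :: Word8) $ \p8 ->
    with (42 :: Word32) $ \p32 -> do
      a <- peekOffAddr (Addr (castPtr p8)) 0
      b <- peekOffPtr p32 0
      pure (a, b)
```

Whether the extra dictionary fields cost anything measurable for composite
instances is exactly what would need benchmarking.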

I could well be wrong.

My comment was not an attempt to shut down debate but to try to find some
way in which the proposal could plausibly proceed.

-Edward

> Anthony
>
> >
> > At some point in the future we can work out if it is worth it to get the
> > report fixed up by incorporating the Addr-based API as the default and make
> > the _other_ the legacy.
> >
> > -Edward
> >
> >
> > On Tue, Oct 30, 2018 at 10:11 AM Sven Panne <svenpanne at gmail.com> wrote:
> >
> >> I am not sure if everybody fully comprehends what Storable is all about:
> >> It is meant as the lowest-level building block in an Addr-free world
> >> (remember: Addr is a GHCism and is *not* mentioned anywhere in the report)
> >> to put a few well-defined simple Haskell types into memory or read them
> >> from there. Its explicit non-goals are:
> >>
> >>    * Achieve 100% type safety. In the presence of raw memory access,
> >> castPtr, C calls etc. this would be a total illusion. Forcing API users to
> >> sprinkle tons of castPtr at every possible place over their code wouldn't
> >> improve safety at all, it would only hurt readability.
> >>
> >>    * Handle more complicated sum/product types. How would you do this?
> >> Respect your native ABI (i.e. automatically handle padding/alignment)?
> >> Tightly packed? Or even handle a foreign ABI? Your own ABI? Some funny
> >> encoding like OpenGL's packed data types? Etc. etc. You can build all of
> >> those things in a layer above Storable, probably introducing other type
> >> classes or some marshaling DSLs.
> >>
> >>    * Portability of the written values. This is more in the realm of
> >> serialization libraries.
> >>
> >> More concretely:
> >>
> >> Am Di., 30. Okt. 2018 um 14:34 Uhr schrieb Daniel Cartwright <
> >> chessai1996 at gmail.com>:
> >>
> >>> [19:26:50] <chessai_> hPutBuf :: Handle -> Ptr a -> Int -> IO () [...]
> >>>
> >>
> >> The signature for this is actually perfect: hPutBuf doesn't care about
> >> what stuff has been written into the given buffer, it just cares about its
> >> start and its size. Forcing castPtr Kung Fu here wouldn't buy you anything:
> >> The buffer will probably contain a wild mix of Haskell values or even no
> >> Haskell values at all, but that doesn't matter. Whatever you pass as "a" or
> >> whatever you cast from/to is probably a lie from the typing perspective. At
> >> this level this is no problem at all.
> >>
> >>
> >>> [19:30:02] <chessai_> peekByteOff :: Ptr b -> Int -> IO a
> >>> [19:30:09] <chessai_> peekByteOff :: Addr -> Int -> IO a
> >>> [19:30:26] <chessai_> what is 'b' doing there? it's not used in any
> >>> meaningful way by peekByteOff [...]
> >>>
> >>
> >> If you have a pointer pointing to something and shift that pointer by some
> >> bytes, you are probably pointing to something completely different, so of
> >> course "b" and "a" have nothing to do with each other. So peekByteOff
> >> intentionally ignores "b".
> >>
> >>
> >>> [19:32:22] <carter> pokeElemOff :: Ptr a -> Int -> a -> IO () --- way
> >>> better than peak  [...]
> >>>
> >>
> >> Yes, because this is intended to be used for *arrays* of values of the
> >> same type. Note "Elem" vs. "Byte".
> >>
> >>
> >>> [19:33:12] <carter> hvr: lets add safePeekByteOff :: Ptr a -> Int -> IO a
> >>> ? [...]
> >>>
> >>
> >> This signature doesn't make sense, see above: Shifting a pointer by an
> >> arbitrary amount of bytes will probably change the type of what you're
> >> pointing to. If you shift by units of the underlying type, well, that's
> >> peekElemOff.
> >>
> >>
> >>> [19:35:31] <chessai_> carter: i am glad we agree on the smell
> >>>
> >>
> >> I don't have the full chat log, but I think I don't even agree on the
> >> smell, at least not at the places I've seen... :-)
> >>
> >>
> >> _______________________________________________
> >> Libraries mailing list
> >> Libraries at haskell.org
> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
> >>