[Haskell-cafe] Binary Data Access via PIC…??
Ian Denhardt
ian at zenhack.net
Sun Jan 13 20:02:59 UTC 2019
Shameless plug for one of my own libraries, which seems at least
relevant to the problem space:
https://hackage.haskell.org/package/capnp
Though as a disclaimer I haven't done any benchmarking myself; my
personal interest is more in RPC than in super-fast serialization.
There will be a release with RPC support sometime later this month.
That said, I have heard from one user who is using it to communicate
with a part of their application written in C++; they switched over
from protobufs for performance, and because they needed to handle very
large (> 2 GiB) data.
-Ian
Quoting Nick Rudnick (2019-01-13 07:43:40)
> On NL FP day, it struck me again when I saw an almost 1 MB *.hs file
> whose apparent sole purpose was getting a quantity of raw data
> incorporated into the binary, applying some funny text-encoding
> constructs. I remembered that, to the best of my knowledge, this
> appears to be the best solution, with the major downside that it
> happens at compile time.
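> For concreteness, a minimal sketch of that compile-time approach,
> assuming the file-embed package (the module and file names are
> invented):
>
>     {-# LANGUAGE TemplateHaskell #-}
>     module Blob (blob) where
>
>     import Data.ByteString (ByteString)
>     import Data.FileEmbed (embedFile)
>
>     -- The raw file is spliced into the object code at compile time,
>     -- so any change to the data forces a recompile.
>     blob :: ByteString
>     blob = $(embedFile "data/table.bin")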
> Another approach I have noticed several times is the use of very
> fast parsing to read in binary data at run time.
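> For example, with the binary package one could decode a file of
> fixed-width records like this (the record format is invented):
>
>     import Control.Monad (replicateM)
>     import Data.Binary.Get (getWord64le, runGet)
>     import qualified Data.ByteString.Lazy as BL
>     import Data.Word (Word64)
>
>     -- Read a file of little-endian 64-bit words and parse them all.
>     readTable :: FilePath -> IO [Word64]
>     readTable path = do
>       bs <- BL.readFile path
>       let n = fromIntegral (BL.length bs `div` 8)
>       pure (runGet (replicateM n getWord64le) bs)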
> Did I miss something?
> Or, more specifically: I am speaking about that kind of binary data
> which is
> (1) huge! (the 1 MB mentioned above rather being at the lower
> limit),
> (2) completely independent of the version of the Haskell compiler,
> (3) guaranteed (externally!) to match the structural requirements of
> the application referred to,
> (4) well managed in some way, concerning ABI issues too (e.g.
> versioning, metadata headers, etc.),
> and the question is to what extent (as, I believe, other languages
> manage to) we can exploit PIC (position-independent code) to read in
> really large quantities of binary data at run time, or immediately
> before run time, without any need for parsing.
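> The closest run-time equivalent I know of is memory mapping, which
> also avoids parsing and copying; a minimal sketch, assuming the mmap
> package:
>
>     import Data.ByteString (ByteString)
>     import System.IO.MMap (mmapFileByteString)
>
>     -- Map the whole file; pages are faulted in on demand, so no
>     -- bytes are copied or parsed up front.
>     loadBlob :: FilePath -> IO ByteString
>     loadBlob path = mmapFileByteString path Nothing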
> E.g., a Haskell file containing a textual data representation
> already generates an object file, and linking should make only a
> limited number of assumptions about its inner structure. Imagine I
> have a huge but simple DB table, and a kind of converter which, by
> some simplification of a Haskell compiler, generates an object file
> that equally matches these (limited, as I believe) assumptions, and
> in the end builds a 'fake' the linker accepts in place of a dummy
> file skeleton. Couldn't that be a way leading towards directly
> getting in vast amounts of binary data in one piece?
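> Outside of Haskell this is often done with the assembler's .incbin
> directive; a minimal sketch of how the resulting symbols could be
> consumed via the FFI (all names invented):
>
>     -- table.s, assembled and linked alongside the Haskell code:
>     --         .section .rodata
>     --         .global table_start
>     -- table_start:
>     --         .incbin "table.bin"
>     --         .global table_end
>     -- table_end:
>
>     {-# LANGUAGE ForeignFunctionInterface #-}
>
>     import Data.Word (Word8)
>     import Foreign.Ptr (Ptr, minusPtr)
>
>     foreign import ccall "&table_start" tableStart :: Ptr Word8
>     foreign import ccall "&table_end"   tableEnd   :: Ptr Word8
>
>     -- Size of the embedded blob, computed from the two symbols.
>     tableSize :: Int
>     tableSize = tableEnd `minusPtr` tableStart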
> In case there are stronger integrity needs, extra metadata should
> be usable to verify that the data originates from a valid code
> generator.
> Of course, while not strictly necessary, true run-time loading
> would be even greater, while direct interfacing to foreign (albeit
> simple) memory spaces seems much more intricate to me.
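> On POSIX systems, such run-time loading might look like resolving a
> data symbol from a shared object via the unix package's dynamic
> linker bindings (library and symbol names invented):
>
>     import Data.Word (Word8)
>     import Foreign.Ptr (Ptr, castFunPtrToPtr)
>     import System.Posix.DynamicLinker (RTLDFlags (RTLD_NOW), dlopen, dlsym)
>
>     -- Open a shared object built around the raw data and resolve
>     -- the data symbol at run time.
>     loadTable :: IO (Ptr Word8)
>     loadTable = do
>       dl <- dlopen "./libtable.so" [RTLD_NOW]
>       sym <- dlsym dl "table_start"
>       pure (castFunPtrToPtr sym)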
> I have regularly stumbled over such cases, so I do believe this
> would be useful. I would be happy to learn more about it. Any
> thoughts?
> Cheers, and all the best, Nick