[Yhc] Regarding Yhc bytecode versioning

Sun Oct 29 15:59:20 EST 2006

Hi Robert,

> * Compatibility-breaking changes to the bytecode file format.
>
> The second problem is the one I wish to discuss.  Since I began work
> on the Yhc bytecode library in May, there have been at least two
> instances of compatibility-breaking changes.  One relates to Hat
> integration, I believe, and the other has to do with the switch to
> libFFI.

> Both of these changes were made without bumping the version
> number that appears in the file header. I would like to suggest that
> such changes be avoided in the future.

While work is still ongoing, I think it's unfortunate but a definite
reality that the binary file formats will have to be broken. We want
to put the .hi information in .hbc files, and that will require a
break. We want to do a linking pass to merge multiple .hbc files into
one, and that will require a break too.

However, you're entirely right: any change that breaks anything from
now on needs a version bump.

> It may also be a good time to think about ways to future-proof the
> file format so that future additions can be made without breaking
> compatibility.  Right now the format is quite fragile.  Perhaps we
> could take inspiration from the Java classfile format.  The basic
> idea is that there are named blocks of data with a minimal header
> which gives the name of the block and the size of the data payload.
> The name of the block defines the meaning of the data.  Eg, the
> 'CODE' block contains bytecode instructions, etc.  If any block is
> encountered with an unrecognized name, it is ignored.  That way, one
> can have optional blocks, or one can add blocks without breaking
> compatibility.  One can also have optional information (like
> debugging symbols) and things of that nature.

That was always the intention. I'm hoping that once we move to having
Yhc.ByteCode handle everything, we can treat that as an abstraction
over the file format, and then we can work on defining a new .hbc file
format designed to last a very long time without changes. Tom and I
brainstormed the "perfect" .hbc file format a while ago, but
unfortunately we've never had time to implement or document it...
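
To make the named-block idea from the quoted text concrete, here is a
minimal sketch in Python. It is not the current .hbc layout and not
the format Tom and I discussed; the 4-byte block name, the 4-byte
big-endian size field, and the names write_block and read_blocks are
all assumptions made up for illustration.

```python
import struct

# Hypothetical block header: 4-byte ASCII name, then 4-byte big-endian
# payload size. The payload follows immediately after the header.
HEADER = struct.Struct(">4sI")


def write_block(name: bytes, payload: bytes) -> bytes:
    """Serialise one named block (header followed by payload)."""
    if len(name) != 4:
        raise ValueError("block name must be exactly 4 bytes")
    return HEADER.pack(name, len(payload)) + payload


def read_blocks(data: bytes, known: set) -> dict:
    """Return {name: payload} for recognised blocks, skipping the rest."""
    blocks = {}
    offset = 0
    while offset < len(data):
        if offset + HEADER.size > len(data):
            raise ValueError("truncated block header")
        name, size = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        if offset + size > len(data):
            raise ValueError("truncated block payload")
        if name in known:
            blocks[name] = data[offset:offset + size]
        # Unknown blocks are stepped over using the size field alone,
        # so a reader never needs to understand their contents.
        offset += size
    return blocks
```

A reader that only knows about 'CODE' can still load a file that also
carries, say, a debugging-symbols block, because the size field tells
it how far to skip.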

As a side note, all these issues apply equally to .ycr files, which
3 projects are now actively using. For those I have defined a Haskell
interface, which is the only supported way of getting at the data
(that can't be done for the .hbc files, as the C code needs access to
them). I am also bumping the version number very aggressively: the
tiniest change gets a new version. And I am ignoring backwards
compatibility: at every version bump I just ignore all old files.
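
The strict policy above amounts to an exact-match check on the version
in the file header. A minimal sketch in Python follows; the magic
bytes, the 2-byte version field, the value of CURRENT_VERSION, and the
function names are all hypothetical and do not describe the real .ycr
header.

```python
import struct

MAGIC = b"YCR0"        # hypothetical magic bytes, for illustration only
CURRENT_VERSION = 7    # hypothetical; bumped on every format change
HEADER = struct.Struct(">4sH")  # magic, then 2-byte big-endian version


class VersionMismatch(Exception):
    """Raised when a file was written by any other format version."""


def write_file(body: bytes) -> bytes:
    """Prefix the body with the magic bytes and the current version."""
    return HEADER.pack(MAGIC, CURRENT_VERSION) + body


def read_file(data: bytes) -> bytes:
    """Return the body, accepting only files of exactly this version."""
    if len(data) < HEADER.size:
        raise ValueError("file too short to contain a header")
    magic, version = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("bad magic bytes")
    if version != CURRENT_VERSION:
        # No backwards compatibility: old files must be regenerated.
        raise VersionMismatch(
            f"file is version {version}, expected {CURRENT_VERSION}"
        )
    return data[HEADER.size:]
```

Rejecting every other version keeps the reader simple, at the cost of
having to regenerate files after each bump.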

Thanks

Neil