Deprecating Safe Haskell, or heavily investing in it?

Tue Dec 27 20:48:47 UTC 2022

Viktor Dukhovni <ietf-dane at dukhovni.org> writes:

> On Tue, Dec 27, 2022 at 06:09:59PM +0100, Hécate wrote:
>
>> Now, there are two options (convenient!) that are left to us:
>> 
>> 1. Deprecate Safe Haskell: We remove the Safe mechanism as it exists 
>> today, and keep the IO restriction under another name. This will 
>> certainly cause much joy amongst maintainers and GHC developers alike. 
>> The downside is that we don't have a mechanism to enforce "Strict 
>> type-safety" anymore.
>> 
>> 2. We heavily invest in Safe Haskell: This is the option where we amend 
>> the PVP to take changes of Safety annotations into account, invest in 
>> workforce to fix the bugs on the GHC side. Which means we also invest in 
>> the tools that check for PVP compatibility to check for Safety. This is 
>> not the matter of a GSoC, or a 2-days hackathon, and I would certainly 
>> have remorse sending students to the salt mines like that.
>> 
>> I do not list the Status Quo as an option because it is terrible and has 
>> led us to regularly have complaints from both GHC & Ecosystem libraries 
>> maintainers. There can be no half-measures that they usually tend to 
>> make us slide back into the status quo.
>> 
>> So, what do you think?
>
> I think that "Restricted IO" would in principle be the more sensible
> approach.  HOWEVER, for robust "sandboxing" of untrusted code what's
> required is more than just hiding the raw IO Monad from the sandboxed
> code.  Doing that securely is much too difficult to do correctly, as
> evidenced by the ultimate failure (long history of bypass issues) of
> similar efforts for enabling restricted execution of untrusted code in
> Java (anyone still using Java "applets", or running Flash in their
> browser???).
>
> The only way to do this correctly is to provide strong memory separation
> between the untrusted code and the TCB.  The only mainstream working
> examples of this that I know of are:
>
>     * Kernel vs. user space memory separation.
>
>     * Tcl's multiple interpreters, where untrusted code runs in
>       slave interpreters stripped of most verbs, with aliases
>       added to wrappers that call back into the parent interpreter
>       for argument validation and restricted execution.
>
> Both systems provide strong memory isolation of untrusted code, only
> data passes between the untrusted code and the TCB through a limited
> set of callbacks (system calls if you like).
>
> For "Safe Haskell" to really be *safe*, memory access from untrusted
> code would need to be "virtualised", with a separate heap and foreign
> memory allocator for evaluation of untrusted code, and the RTS rewriting
> and restricting all direct memory access.  This means that "peek" and
> "poke" et. al. would not directly read memory, but rather be restricted
> to specific address ranges allocated to the untrusted task.
>
> Essentially the RTS would have to become a user-space microkernel.
>
> This is in principle possible, but it is not clear whether this is worth
> doing, given limited resources.
[...]

The other possibility is to break up your application into functional
components, that run in separate processes and communicate through
channels.

That doesn't require any special RTS support, but it requires some
infratructure, e.g., to automatically derive message types from function
signatures, and to manage the subprocesses. (Ideally, one would also
restrict the available kernel interface of each component using, e.g.,
seccomp on Linux or pledge on OpenBSD)