[Haskell-cafe] Compiling arbitrary Haskell code

Fri Oct 11 21:41:05 UTC 2013

On Fri, Oct 11, 2013 at 1:30 PM, Christopher Done <chrisdone at gmail.com> wrote:

> Is there a definitive list of things in GHC that are unsafe to
> _compile_ if I were to take an arbitrary module and compile it?
>
> E.g. off the top of my head, things that might be dangerous:
>
> * TemplateHaskell/QuasiQuotes -- obviously
> * Are rules safe?
> * #includes ? I presume there's some security risk with including any old
> file?
> * FFI -- speaks for itself
>

It really depends on the security properties you want to maintain. That
should inform your policy. For example, denial of service vs. leaking
information (like password db) vs. allowing yourself to become part of a
botnet. There are lots of things to consider here.

For example, lambdabot has always disallowed IO and thus needs to disallow
unsafeCoerce/unsafePerformIO/unsafeInterleaveIO and anything else that
introduces a "backdoor" in the type system. I think the list you have above
is a good start, but wouldn't be complete for lambdabot.
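
To make the "backdoor" concrete, here is a minimal sketch using only base. The name `sneakyLength` is made up for illustration and is not taken from lambdabot. Its type says it is pure, yet evaluating it performs IO, which is why a bot that only accepts non-IO expressions cannot let these functions into scope.

```haskell
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical example: the type promises a pure function,
-- but unsafePerformIO hides an IO action inside it.
sneakyLength :: String -> Int
sneakyLength s = unsafePerformIO $ do
  -- Any IO could go here (reading files, opening sockets, ...).
  hPutStrLn stderr "side effect from a 'pure' function"
  return (length s)

main :: IO ()
main = print (sneakyLength "abc")  -- prints 3 on stdout, after the message on stderr
```

A check that only looks at the type of a submitted expression would accept `sneakyLength`, so the restriction has to be on which names and modules are visible.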

>
> I'm interested in the idea of compiling Haskell code on lpaste.org,
> for core, rule firings, maybe even TH expansion, etc. When sandboxing
> code that I'm running, it's really easy if I whitelist what code is
> available (parsing with HSE, whitelisting imports, extensions). The
> problem of infinite loops or too much allocation is fairly
> straight-forwardly solved by similar techniques applied in mueval.
>
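
As a rough illustration of the infinite-loop side of this, a wall-clock timeout around evaluation can be sketched with base alone. This is an assumption about the general technique, not a copy of what mueval does, and `evalWithTimeout` is a made-up name.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- Hypothetical helper: force a value to weak head normal form,
-- giving up after the given number of microseconds.
-- Caveats: this only bounds time, not memory (a heap limit such as the
-- RTS -M option would be needed for that), and a loop that never
-- allocates may not be interrupted promptly.
evalWithTimeout :: Int -> a -> IO (Maybe a)
evalWithTimeout micros x = timeout micros (evaluate x)

main :: IO ()
main = do
  r <- evalWithTimeout 1000000 (sum [1 .. 100 :: Int])
  print r  -- Just 5050
```

Note that this limits running submitted code, not compiling it; it says nothing about what TH or the C preprocessor can do at compile time.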

What type of sandboxing do you plan to use and what limitations does it
have? For example, chroot jails can be defeated.

>
> SafeHaskell helps a lot here, but suppose that I want to also allow
> TemplateHaskell, GeneralizedNewtypeDeriving and stuff like that,
> because a lot of real code uses those. They only seem to be restricted
> to prevent cheeky messing with APIs in ways the authors of the APIs
> didn't want -- but that shouldn't necessarily be a security (in terms
> of my system) problem, should it? Ideally I'd very strictly whitelist
> which modules are allowed to be used (e.g. a version of TH that
> doesn't have runIO), and extensions, and then compile any code that
> uses them.
>

GND can be used to cause a segfault. I don't know if it can be used to
cause a more serious exploit, but I would be concerned that it can. Then
again, if you're already allowing TH or arbitrary IO then those are
probably much easier places to attack so it may not matter.

>
> I'd rather not have to set up a VM just to compile Haskell code safely.
> I'm willing to put some time in to investigate it, but if there's
> already previous work done for this, I'd appreciate any links.
>

I don't know how well it's documented, but lambdabot has a long history of
restricting the Haskell it accepts to make it safe. Other things to look
at: Google Native Client (to see how they approach sandboxing) and geordi,
the C++ IRC bot.

In the Native Client case they do fancy tricks with segment registers (to
control where the sandboxed process can write to memory) and intercept
system calls in the outer part of the process. They handle the case where
everything runs in one process, in one address space. You could imagine
porting the GHC RTS to run in Native Client (didn't someone start on that?)
and then using that to sandbox all your Haskell evaluation.

>
> At the end of the day, there's always just supporting a subset of
> Haskell using SafeHaskell. I'm just curious about the more general
> case, for use-cases similar to my own.
>

I think SafeHaskell is a reasonable starting place, but I don't think it
gives you a really strong guarantee yet. Everything that is inferred safe
probably is (I don't know of any exploits with that part of SafeHaskell).
In practice, you'll probably also want to use some trusted packages, but
that requires that none of the stuff you trust is exploitable.
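
For reference, a small sketch of what a submission compiled under SafeHaskell looks like. The function `normalize` is a made-up example; the point is only that the `Safe` pragma makes GHC check every import.

```haskell
{-# LANGUAGE Safe #-}

import Data.Char (toLower)
import Data.List (sort)
-- import System.IO.Unsafe (unsafePerformIO)
-- Uncommenting the import above makes GHC reject this module,
-- because System.IO.Unsafe cannot be safely imported.

-- Hypothetical user-submitted function.
normalize :: String -> String
normalize = sort . map toLower

main :: IO ()
main = putStrLn (normalize "Cab")  -- prints "abc"
```

The imports that are accepted here are ones base marks as safe or trustworthy, which is exactly where the question of which packages you trust comes in.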

I hope that helps,
Jason