[Haskell-cafe] Safe Haskell?

Joachim Durchholz jo at durchholz.org
Thu Apr 22 19:23:56 UTC 2021


On 21.04.21 at 18:39, Sven Panne wrote:
> On Wed, 21 Apr 2021 at 15:57, Joachim Durchholz
> <jo at durchholz.org> wrote:
> 
>     Actually, sandboxes have just as complicated an attack surface as
>     languages.
> 
> 
> Perhaps we are talking about different kinds of sandboxes: I'm talking 
> about sandboxes at the syscall level. Compared to higher level stuff 
> they are relatively simple. They are not small (tons of syscalls exist 
> nowadays), but each syscall is relatively simple compared to stuff built 
> upon it. Size != complexity.

True, but the semantics behind each syscall can be horrendously complex.
The syscall for opening a file is straightforward, but its semantics
depends on a ton of factors: whether the path is in /proc, on a ramdisk,
or on a remote file system, and on the semantics of whatever filesystem
is mounted behind it.
That's why you can have a sandbox and still not be protected from
symlink timing attacks on /tmp, for example: wait for a system process
to create a temporary file, delete the file, and insert a symlink to a
file of your choice. This works with a low probability whenever there's
a window between name generation and file opening; that's why mkstemp
and friends were built to make creation and opening atomic, and we
hope(!) that all security-relevant programs have been updated to use
them instead of generating a temp file name first and opening it later.
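In Haskell terms, the safe pattern looks roughly like this (an untested
sketch; openTempFile from System.IO creates and opens the file in one
step, with O_EXCL-style semantics underneath):

    import System.IO (openTempFile, hPutStrLn, hClose)

    main :: IO ()
    main = do
      -- Create and open atomically: no attacker can slip a symlink in
      -- between name generation and open, because there is no gap.
      (path, h) <- openTempFile "/tmp" "example.txt"
      hPutStrLn h "not leaked via a symlink race"
      hClose h
      putStrLn ("wrote " ++ path)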

> And the SW engineering point is: You 
> program your sandbox SW once and can use it for every language/program, 
> without any need to follow the latest and greatest hype in languages, 
> compilers, type systems etc.

Instead you have to make sure that all software uses mkstemp instead of
doing nonatomic file creation and opening.
That isn't secure by design; you still have to double-check all software.

> This means a *much* higher ROI compared to 
> a language-specific solution. Of course you won't get a PhD for a 
> syscall sandbox...

Except that there is no such thing as an inherently safe syscall
interface; there are always unsafe ways to use it.

And that's where language-based safety can help.
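To make that concrete with the topic of this thread (a minimal sketch,
nothing more): GHC's Safe Haskell rejects imports of unsafe facilities
at compile time, so the guarantee is checked statically rather than
enforced at runtime:

    {-# LANGUAGE Safe #-}
    module Untrusted (pureLookup) where

    -- The following import is rejected at compile time under Safe:
    -- import System.IO.Unsafe (unsafePerformIO)

    -- Code in this module is statically known to have no hidden effects.
    pureLookup :: [(String, Int)] -> String -> Maybe Int
    pureLookup table key = lookup key table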

>     I also believe they are different domains.
>     Secure languages deal with making guarantees about what a program does
>     and, more importantly, what it does not do. So you can control things
>     like IO effects, Capabilities, and the language can even make these
>     guarantees statically.
> 
> Only if you trust your language itself, your implementation and its RTS. 

Again, trusting an operating system doesn't give you a smaller trusted
surface.
After all, you need to trust the OS as a whole, namely that all the 
syscalls do what they're supposed to do.

Not a win, actually. Operating systems are vastly more complex than 
language runtimes.

The case is even stronger than that.
The compiler code that emits useful error messages is complex, because it
needs to infer the programmer's intentions.
The compiler code for type inference is pretty complex as well - it
ranges from pretty easy for pure Hindley-Milner typing to arbitrary
complexity for dependent types and other things.
However, the code that checks whether the program type-checks after all
type inference has been applied - that's actually trivial. It's the moral
equivalent of a proof checker for first-order predicate logic, and
that's something you can do at university (I actually built such a
thing for my Diploma); *finding* a proof is hard but *checking* one is
trivial, and that translates to type systems and things like Safe Haskell.
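For illustration (a hypothetical toy, not GHC's actual algorithm): once
every binder carries its type annotation, the checker for a simply typed
lambda calculus is a short recursion - inference is where all the
difficulty lives:

    data Ty = TInt | TFun Ty Ty deriving Eq

    data Term
      = Var String
      | Lam String Ty Term   -- the parameter carries its annotation
      | App Term Term
      | Lit Int

    -- Checking a fully annotated term: one pass, no search involved.
    check :: [(String, Ty)] -> Term -> Maybe Ty
    check env (Var x)     = lookup x env
    check _   (Lit _)     = Just TInt
    check env (Lam x t b) = TFun t <$> check ((x, t) : env) b
    check env (App f a)   = do
      TFun t1 t2 <- check env f   -- pattern failure yields Nothing
      t1' <- check env a
      if t1 == t1' then Just t2 else Nothing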

There's still another challenge: Set up a system of axioms, i.e. 
security properties and how they apply when going through the various 
primitives, and find a set that's (a) interesting and (b) understandable 
for programmers.
This is pretty hard to do for syscalls, because their semantics was
never designed to be easily provable. It's easier for languages if you
restrict syscall use to a safe subset (for whatever definition of "safe",
i.e. for whatever set of security-relevant properties you're interested
in). Such a restriction is good for an easy first prototype, but it
falls flat as soon as you allow arbitrary syscalls or linking of
arbitrary libraries.
So for system security regardless of what programming language people 
are using, you probably need something at the syscall level - sandboxes, 
jails, cgroups, whatever else. None of these are perfect - kernel bugs 
still exist, and some syscalls have well-known security issues - so you 
*still* need defense in depth.

> And given the complexity of each of these aspects, basically any 
> language has security holes (see e.g. the recent discussion here about 
> various GHC bugs). A sandbox has a much easier job, it can e.g. always 
> err on the safe side.

No sandbox does that.
Well, the new ones do. Then people start demanding additional features, 
and off you go with new features, new bugs, and new unforeseen interactions.
I'm speaking from experience with the JVM sandbox, which is an
in-language mechanism, but the problems I saw there could happen
regardless of whether the sandboxing is done inside a runtime or at the
syscall level.

>     A sandbox deals more with API objects. This is a much more complicated
>     surface because today's APIs tend to be large, complex, and interact in
>     unexpected fashions; on the other hand, it is much nearer to the actual
>     objects being protected.
>     I.e. I believe the two approaches, while they have some overlap,
>     serve different purposes and need to complement each other.
> 
> 
> A syscall sandbox needs no help whatsoever from a language.

Well, I hope I just argued why a syscall sandbox is never going to be 
complete.

An additional argument: syscall sandboxes can't prevent data 
exfiltration. There are just too many channels.
An in-language security check could prevent exfiltration of user
passwords (with the right axioms and inference rules).
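A crude sketch of the idea (hypothetical names, and far weaker than a
real information-flow control system): if untrusted code can never get
the raw string back out of the abstract type, it can compute with the
secret but has no channel to ship it anywhere:

    module Secret (Secret, protect, checkAgainst) where

    -- The constructor is not exported, so code outside this module
    -- cannot extract the raw string to write it to a socket or a file.
    newtype Secret = Secret String

    protect :: String -> Secret
    protect = Secret

    -- Untrusted code only ever observes a Bool, e.g. "does this guess
    -- match the password", never the password itself.
    checkAgainst :: Secret -> String -> Bool
    checkAgainst (Secret s) guess = s == guess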

>     I believe the opposite is true. APIs change over time.
> 
> The syscall interface is extremely stable, at least compared to the rest 
> of all APIs in existence.

But first-order predicate logic is immutable, and that's all you need 
for a checker.
(The axioms and inference rules can depend on the API, but that's just 
the same as with syscalls.)

BTW, syscalls aren't immutable; they're backwards-compatible.
That's irrelevant to a programmer, but it makes a world of difference
for security - a programmer is interested in what can be done, security
is interested in what can *not* be done, and this changes the picture
completely.

>     Languages do that, too, but to a much lesser
>     extent, and type system guarantees tend to hold for decades.
> 
> There is no such thing as a type system in the machine code actually 
> running, so you can't trust any higher-level guarantees.

You can't trust the compiler, you can't trust the runtime, and you can't 
trust the kernel.
There's really no big difference there. You just happen to know the GHC
loopholes better than the syscall loopholes (for me it's the other way
round).

>     Even filesystem APIs are less stable than that (think NFS, or
>     filesystem-dependent ACLs).
> 
> 
> Do you really see the filesystem implementation on the syscall level? 
> Unless you're doing serious fcntl()/ioctl()-Kung-Fu or stuff like that, 
> I don't think so, but that's just a guess.

No, it's much worse.
Some syscalls are atomic on a local filesystem and nonatomic on NFS. So
if you're writing security-relevant code, you have to know whether
you're on NFS or not, and I'm not even sure you can reliably detect
whether the file you're trying to create and open is local or on NFS.
Then there are tons of interesting games you can play with symlinks and
hardlinks, further complicated by questions like what happens when a
symlink crosses filesystems (it may well be that all atomicity
guarantees go out of the window). Do you even write your code in a way
that detects whether a physical file was turned into a symlink behind
your back? Even if your code is running inside a VM and might be subject
to suspend/resume? (Cloud management software like OpenShift does things
like suspend your VM, move it to different hardware, and resume;
filesystem boundaries can indeed change, and it's easy to confuse
OpenShift enough to trigger that "oh, we need to relocate to other
hardware" reaction.)

Quite a while ago, attacks started shifting from exploiting operating
system bugs to exploiting application programmers' expectations and
making these fail.
A syscall sandbox will do nothing to stop that. A language-based
security framework cannot fully stop it either, but it can catch another
class of errors.

So I still stand by my assessment that the two approaches don't replace
each other; they complement each other, and you should pursue both.

That's my 5 cents - YMMV.

Regards,
Jo

