[Haskell] Eliminating Array Bound Checking through Non-dependent
oleg at pobox.com
oleg at pobox.com
Sun Aug 8 19:50:33 EDT 2004
Bjorn Lisper wrote:
> What is the relation to the sized types by Lars Pareto and John Hughes?
It is orthogonal and complementary, as the message in response to
Conor T. McBride indicated.
> What is the relation to classical range analyses for (e.g.) array index
> expressions, which have been known for a long time for imperative languages?
It is just like the classical range analysis, but _reified_ in the
programming language itself. Given a piece of code:
finda x arr = loop lo
where (lo,hi) = bounds arr
loop i = if i <= hi then
if x == arr ! i then Just i
else loop (i + 1)
the analysis sees that 'i' starts at the lower bound of 'arr' and is
incremented afterwards. When the analysis sees the test "i <= hi" it
infers that in the `then' branch of that test `i' does not exceed the
upper bound of the array. Therefore, the indexing operation `arr ! i'
is safe and no range check is needed.
In the `branding' framework, the programmer makes the result of the
test "i <= hi" and the corresponding implication that `i' is in range
known to the type system, by branding the index `i'. To be more
precise, the programmer would replace the first `if' statement with
if_in_range:: (Ix i) => i -> BArray s i e -> (BIndex s i->r)
-> r ->r
if_in_range i arr on_within_range on_outside_rage ...
If `i' turns out to be in range, that fact would be recorded by
passing to the continuation on_within_range a branded index. Thus the
logical implication that was implicit in the range checker is made
explicit to the typechecker here.
> A program analysis like range analysis is not exact, of course: it must make
> safe approximations sometimes and will sometimes say that an array index
> might be out of bounds when it actually won't. In your framework, this seems
> to correspond to the fact that you must verify your propositions about index
True, just as the range analysis must verify the rules of the
analysis. The difference is that the conventional range analyzer is a
part of the _compiler_, typically hidden from view (of a regular
programmer). Here, the analyzer is a part of a _library_.
It is also true that our analysis can't be exact: if the code includes
let i = very_complex_function j
and we know that j is in range, it may be very difficult to ascertain
that 'i' will always be in range. In that case, we do the following
let j_unbranded = unbrand j
i = very_complex_function j_unbranded
in if_in_range i arr on_within_range on_outside_rage
That is, we intentionally forget the branding information, do a
complex index transformation, followed by a run-time witnessing to
recover the branding. If we somehow know that very_complex_function
keeps its result in range, we can replace `on_outside_rage' with the
function that raises an exception, crashes the computer, etc. If we
are not sure if `i' is in range, then our program
must do the range check anyway; if `i' turns out of range, the
program should do what the algorithm prescribes in that case. The
upshot is that `if_in_range' makes the programmer explicitly consider the
consequences of the range check. We turn the range check from a
routine safety check into an algorithmically significant decision.
Incidentally, if we can prove that `very_complex_function' leaves the
index in range, then we can make the function return a branded index,
and thus eliminate the if_in_range check above. Because the creation
of a branded index can only be done in a trusted kernel, we must put
such a function into the kernel, after the appropriate rigorous
verification -- perhaps formal verification.
More information about the Haskell