[Haskell-beginners] Using intersecting ranges to narrow a range yields an "Out of Memory" error

Felipe Almeida Lessa felipe.lessa at gmail.com
Mon Jul 25 03:56:53 CEST 2011


On Sun, Jul 24, 2011 at 6:26 PM, Costello, Roger L. <costello at mitre.org> wrote:
> Hi Folks,
>
> I am creating an Int range by starting with the full range of Int:
>
> let r1  =  [minBound::Int .. maxBound::Int]
>
> Then I restrict the upper bound of the range, e.g.,
>
> let r2  =  r1 `intersect` [minBound::Int .. 10]
>
> Then I restrict the lower bound of the range, e.g.,
>
> let r3  =  r2 `intersect` [0 .. maxBound::Int]
>
> No error thus far.
>
> However, when I try to display r3 I get an "Out of Memory" error.
>
> Why is that?

While it is not obvious from the documentation alone, 'intersect' is
not lazy enough.  It is defined as

  intersect xs ys = [x | x <- xs, any (== x) ys]

So, supose you have x:xs and ys such that x ∉ ys.  Then 'intersect
(x:xs) ys' needs to traverse all of ys before deciding that x really
isn't inside ys.

Now, you're doing 'r2 `intersect` [0 .. maxBound::Int]'.  In this
case, 'x:xs = r2' and 'ys = [0 .. maxBound::Int]'.  The first element
of r2 is minBound, which of course is not in ys.  But to find that
out, 'intersect' would need to traverse the whole 'ys' list.  This
should require at least 32 GiB on GHC with a 32 bit platform, and much
much much more on a 64 bit platform.

> I created a function that restricts a range using the above approach:
>
> restrict                                  ::  [Int] -> Restriction-> [Int]
> restrict r (MinInclusive v)    =   r `intersect` [v .. maxBound::Int]
> restrict r (MaxInclusive v)   =   r `intersect` [minBound::Int .. v]
>
> The arguments to "restrict" is a range, r, and an indication of whether the upper bound is to be restricted (MaxInclusive v) or the lower bound is to be restricted (MinInclusive v).
>
> Is there a better way to do this?

This will give you the same problems.

> What can I do to my restrict function to avoid getting an "Out of Memory" error?

By explicitly representing intervals, instead of using the list of
elements they contain.  Something like:

  data Interval a = Interval {mini :: a, maxi :: a} deriving (Eq, Ord, Show)
  type IntervalUnion a = [Interval a]

Instead of '[x..y]', use '[Interval x y]'.  Instead of '[a..b] ++
[c..d]', use '[Interval a b, Interval c d]'.

HTH,

-- 
Felipe.



More information about the Beginners mailing list