[Haskell-cafe] Unnecessarily strict implementations
Jan Christiansen
jac at informatik.uni-kiel.de
Thu Sep 2 18:22:14 EDT 2010
Hi,
On 02.09.2010, at 13:41, Daniel Fischer wrote:
> takes a little to run and keeps the entire file until the first
> occurrence
> of pat in memory.
first of all thanks very much for the detailed instructions.
I have rewritten the example slightly using Strings instead of
Bytestrings. Replacing all occurrences of 'ä' by "ä" in the
collected works of Shakespeare ; ) has a maximum memory usage of
around 65MB with the current implementation of intersperse while it
has a maximum memory usage of only around 5KB with the less strict
implementation.
replaceBy :: Eq alpha => alpha -> [alpha] -> [alpha] -> [alpha]
replaceBy x sep = concat . intersperse sep . splitBy (==x)
splitBy :: (alpha -> Bool) -> [alpha] -> [[alpha]]
splitBy _ [] = []
splitBy p xs =
case break p xs of
(l,ys) -> l : case ys of
[] -> []
(_:zs) -> splitBy p zs
This function only runs in constant space if I use the strict pattern
matching on the result of break. If I use the following implementation
I observe a linear memory usage instead.
splitBy' :: (alpha -> Bool) -> [alpha] -> [[alpha]]
splitBy' _ [] = []
splitBy' p xs =
l : case ys of
[] -> []
(_:zs) -> splitBy' p zs
where
(l,ys) = break p xs
I think this is due to the Wadler tuple space leak. The same would
apply to the current implementation of lines. I wonder whether an
implementation of lines analogous to splitBy has any disadvantages.
>> That is, we could have
>>
>> intersect [] _|_ = [] and intersect (_|_:_|_) [] = []
>>
>> or
>>
>> intersect [] (_|_:_|_) = [] and intersect _|_ [] = []
>>
>> and the current implementation satisfies neither.
> Right. So the question is, has the current implementation advantages
> over
> either of these? (I don't see any.) If not, which of these two
> behaviours
> is preferable?
I'd prefer the first one as it is in line with the left to right
pattern matching of Haskell.
> I have mixed feelings about those. Part of me dislikes breaking the
> symmetry between (<=), (==) and compare.
I think you should not blame (<=) for the existence of a function that
yields a superset of the information that (<=) yields ; )
Cheers, Jan
More information about the Haskell-Cafe
mailing list