[Haskell-cafe] One-Off detection. Question about Space/Time complexity

Olaf Klinke olf at aatal-apotheke.de
Thu Mar 23 20:26:39 UTC 2017


I believe this is the classic publication on the subject:

http://users.monash.edu/~lloyd/tildeStrings/Alignment/92.IPL.html

The Haskell implementation is here:

http://users.monash.edu/~lloyd/tildeFP/Haskell/1998/Edit01/

In case of a pair of one-off strings, it is O(m) time. Not sure whether the entire strings are kept in memory. My gut feeling is that they must be kept in memory in the worst case (edit distance max(m,n)) but stretches of matched prefixes can be garbage collected. So in your particular case it should indeed be O(1) space. Profiling suggests the algorithm is not O(1) space, but I might have done the input strings wrong. 

input :: Int -> String
input = flip replicate 'a'

length . input should be O(1) space, shouldn't it? With ghc 7.8.3 and -O2 it is not. Running the edit distance on the pair (input m, 'b' : input m) and profiling the program suggests the algorithm is O(m) space. Andrew Butterfield's oneOff seems to be linear space, too.
-- Olaf

$cat test.hs
import System.Environment
import Data.List (foldl')
import EditDistance (d)
input :: Int -> String
input n = replicate n 'a'

main = do
  args <- getArgs
  let n = read (head args)
  let x = input $! (n+1)
  let y = 'b' : input n
  print (d x y)

$ghc -O2 --make -prof -fprof-auto test.hs
[2 of 2] Compiling Main             ( test.hs, test.o )
Linking test ...

$for n in 1000 10000 100000; do ./test $n +RTS -p && grep 'total alloc' test.prof; done
1
	total alloc =     358,128 bytes  (excludes profiling overheads)
1
	total alloc =   3,022,696 bytes  (excludes profiling overheads)
1
	total alloc =  29,663,256 bytes  (excludes profiling overheads)



More information about the Haskell-Cafe mailing list