[Haskell-cafe] Re: Longest increasing subsequence
ChrisK
haskell at list.mightyreason.com
Fri Apr 11 15:33:33 EDT 2008
My late night suggestions were nearly correct. I have actually written the code
now. Once keeping track of indices, and a second time without them:
> {-# LANGUAGE BangPatterns #-}
> -- By Chris Kuklewicz, copyright 2008, BSD3 license
> -- Longest increasing subsequence
> -- (see http://en.wikipedia.org/wiki/Longest_increasing_subsequence)
> import Data.List (foldl')
> import Data.Map (Map)
> import qualified Data.Map as M (empty,null,insert,findMin,findMax
> ,splitLookup,deleteMin,delete)
>
> type DList a = [a] -> [a]
>
> lnds :: Ord a => [a] -> [a]
> lnds = lnds_decode . lnds_fold
>
> lnds_fold :: Ord a => [a] -> Map a (DList a)
> lnds_fold = foldl' process M.empty where
> -- The Map keys, in sorted order, are the input values which
> -- terminate the longest increasing chains of length 1,2,3,…
> process mu x =
> case M.splitLookup x mu of
> (_,Just {},_) -> mu -- ignore x when it is already an end of a chain
>
> (map1,Nothing,map2) | M.null map2 ->
> -- insert new maximum element x
> if M.null mu
> then M.insert x (x:) mu -- x is very first element
> else let !xs = snd (M.findMax mu)
> in M.insert x (xs . (x:)) mu
>
> | M.null map1 ->
> -- replace minimum element with smaller x
> M.insert x (x:) (M.deleteMin mu)
>
> | otherwise ->
> -- replace previous element oldX with slightly smaller x
> let !xs = snd (M.findMax map1)
> !oldX = fst (M.findMin map2) -- slightly bigger key
> !withoutOldX = M.delete oldX mu
> in M.insert x (xs . (x:)) withoutOldX
>
> lnds_decode :: Ord a => Map a (DList a) -> [a]
> lnds_decode mu | M.null mu = []
> | otherwise = snd (M.findMax mu) []
>
> tests = [ ['b'..'m'] == (lnds $ ['m'..'s'] ++ ['b'..'g'] ++ ['a'..'c'] ++ ['h'..'k'] ++ ['h'..'m'] ++ ['d','c'..'a'])
> , "" == lnds ""
> , "a" == lnds "a"
> , "a" == lnds "ba"
> , "ab" == lnds "ab"
> ]
Comparing to wikipedia:
The X[M[1]],X[M[2]],… sequence is strictly increasing. These are the ends of
the current increasing chains of length 1,2,… and they are the keys to the Map
in my code.
The values of the map are the subsequences themselves, in DList form.
Instead of pointing to the index of the previous element I just lookup '!xs' and
append '(x:)' to that.
Complexity:
The strictness annotations ensure that the garbage collector can destroy any
unreachable DList entries. The space usage is thus O(N) and may be O(1) for
certain inputs (such as the best case of never-increasing input list). A
strictly increasing input list is the worst case for space usage.
The naive time complexity of 'process' for the i'th input value is O(log i).
This can be double checked by looking at the time complexity of everything I
import from Data.Map.
Peak performance could be had by
(1) adding the first element before the foldl' to avoid checking for this
case in process
(2a) accessing the internal map structure to optimize the
splitLookup->delete->insert case into a single operation
(2b) Using something like a zipper to access the to-be-deleted-and-replaced
element of the map
The (2a) and (2b) work because we know the changed key will go into the same
'slot' of the map as the old one.
--
Chris
More information about the Haskell-Cafe
mailing list