[Haskell-cafe] A "less" complex median
michelhaber1994 at gmail.com
Sun May 26 17:44:26 UTC 2019
I wanted to try and write a median filter with good complexity and
performance. This is just an exercise, so there is no use looking for
better performance algorithms (median of medians).
Below is the code I came up with:
type MyState s = State (Int, Int, [(Int, s)])
-- Inserts a value in its sorted place,
-- Removes all expired values encountered,
-- Returns the median value of the list.
insertSorted :: (Ord a) => a -> MyState a a
insertSorted v = do
(size, index, list) <- get
let flt = valid size index -- Filter: If too old remove value
cmp (_,x) = v < x -- Compare: If v spot reached, add v
ret = min index size `div` 2 -- The return value's index
let (newList, med) = goThrough flt cmp ret (index, v) list
put (size, index + 1, newList)
return . snd $ med
-- Internal implementation
goThrough :: (a -> Bool) -> (a -> Bool) -> Int -> a -> [a] -> ([a], a)
goThrough _ _ _ v  = ([v], v)
goThrough flt cmp idx v ls@(x:xs)
| flt x && cmp x = if idx == 0 then (v:ls, v)
else (v:ls, getTo idx (v:ls))
| flt x = let (ys, y) = goThrough flt cmp (max 0 (idx-1)) v xs
in if idx == 0 then (x:ys, x)
else (x:ys, y)
| otherwise = goThrough flt cmp idx v xs
-- Continues going through the list until we get to the element we want, or
-- run out of elements, in which case we take the last one
getTo :: Int -> [a] -> a
getTo _  = error "No value to return!"
getTo _ [x] = x
getTo 0 (x:_) = x
getTo i (_:xs) = getTo (i-1) xs
-- A value is valid if it still has a place in the list (hasn't expired)
valid :: Int -> Int -> (Int, a) -> Bool
valid size index (i,_) = index - i < size
-- Clean up all the old values in the list. This is costly, and should be
-- performed sparingly
clean :: MyState Double ()
clean = do
(size, index, list) <- get
put (size, index, filter (valid size index) list)
-- Insert n values with insertSorted
insertList :: [Double] -> MyState Double [Double]
insertList ls = sequence $ insertSorted <$> ls
main :: IO ()
main = do
values <- sequence $ replicate 10000 $ randomRIO (1,100::Double)
print $ runState (insertList values <* clean) (10, 0, )
The code works, at least in the simple cases I've tried.
It seems, however, way too complex and not very composable.
So I was wondering if anyone could tell me where I went wrong, and how I
can make the code simpler and smaller.
Of course, it is completely possible to divide my insertion function, but
the goal is to do all of the steps in one iteration over the list. This
allows me to later compare what I have with divided functions to discover
if the compiler would optimize them into one traversal.
But even with its current logic, I believe the code can be better written.
I just don't know how :p
So I really appreciate any advice on how to make it better (or fix a bug I
missed, or maybe even the logic of the whole algorithm).
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Haskell-Cafe