[Haskell-cafe] needsaname :: ([a] -> Maybe (b, [a])) -> (b -> [a]) -> [a] -> [a]

Fri Jul 6 14:00:12 EDT 2007

Hi,

Yet another "Function looking for a name" post. Here's the type:

 > morph :: ([a] -> Maybe (b,[a])) -> (b -> [a]) -> [a] -> [a]

Here, I am calling ([a] -> Maybe (b,[a])) the 'selector'. It is
actually the same type as a simple parser. I am calling (b -> [a]) the
'transformer'.

Once you've chosen your selector and transformer, it's a stream
processing function [a] -> [a].

The idea is that the 'selector' looks at the front of the stream and
decides if it matches some predicate. (Not necessarily just based on
only the first element, though). If it does not, it will return
Nothing and that element is passed through unchanged. If it matches
then it produces (parses) some representation or calculation 'b'. The
transformer is then responsible to transforming this 'b' back into a
list of as to be spliced back into the stream.

So morph is a general stream manipulation combinator.

Let me give some examples:

* Replace all strings of odd numbers by (just one occurence of) the
   number 0

*Main> morph (sAll odd) (const [0]) [1,3,2,3,4,4,3,1,5]
[0,2,0,4,4,0]

* Add one to all even numbers

*Main> morph (sAll even) (map (+1)) [1,3,2,3,4,4,3,1,5]
[1,3,3,3,5,5,3,1,5]

Here "sAll" is the very simply defined:

 > sAll p l@(x:xs) | p x       = Just (span p l)
 >                 | otherwise = Nothing

* simple string replacement

*Main> morph (sMatch "fox") (const "ferret") "the quick brown fox jumped 
over the lazy dog"
"the quick brown ferret jumped over the lazy dog"

* string match + modify

*Main Data.Char> morph (sMatch "fox") (("'"++).(++"'").reverse . map 
toUpper) "the quick brown fox jumped over the lazy dog"
"the quick brown 'XOF' jumped over the lazy dog"

...in the above two examples, sMatch is defined as:

 > sMatch a l | a `isPrefixOf` l = Just (a,drop (length a) l)
 >            | otherwise        = Nothing

I'm fairly sure that the active hive-mind of haskell-cafe will be able
to think of many more uses of this function. I actually wrote it to do
HTML fix-up, working with the TagSoup library. A few quick definitions
and it becomes easy to express things like 'remove all FONT, BR and U
tags; replace all instances of B with SPAN CLASS="important"...'; the
task of repairing broken HTML and replacing simplistic markup with
semantic markup. It's very powerful to have the separation between
selection and transformation; it's quite easy to build up powerful
libraries of selectors and use them with simple transformers.

Here is the definition of morph:

 > morph sel trans [] = []
 > morph sel trans l@(x:xs) = case sel l of Nothing      -> x : morph 
sel trans xs
                                            Just (b,xs') -> trans b ++ 
morph sel trans xs'

I've written it to run over lists, but it would not be difficult to
make it run over ByteStrings instead, and exploit the 'no-copying'
effect on the bits of the stream which were not modified, which would
be very handy for programs processing large bytestrings.

It could be used for a streaming filter, for example; although care
must be taken. A composition of 'morphs' will 'read-ahead' as far as
its most greedy selector, so if did want it to stream, you'd have to
make sure that your selectors don't read ahead indefinitely (like an
XML selector waiting for a closing tag would).

There is a close cousin which is fully recursive and not guaranteed to
terminate, where you call 'morph' again on the "trans b" part, which
I've been calling 'rmorph'. You need it if you want to, for example,
remove all correctly nested <FONT>...</FONT> tags. (Although you
wouldn't need it if you forgot about nesting, which is arguably more
sane for that particular case).

Has anyone spotted this function or a close analogue elsewhere in the
haskell canon? Does anyone have a better name for it that 'morph'?

Jules