lines/unlines and "inverse"
D. Tweed
tweed@compsci.bristol.ac.uk
Sun, 21 Jul 2002 13:35:42 +0100 (BST)
On 21 Jul 2002, Lars Henrik Mathiesen wrote:
> > From: Ian Lynagh <igloo@earth.li>
> > Date: Sat, 20 Jul 2002 15:03:22 +0100
> >
> > [The Revised Haskell 98 report] says
> >
> > -- lines breaks a string up into a list of strings at newline
> > -- characters. The resulting strings do not contain newlines.
> > -- Similary, words breaks a string up into a list of words, which
> > -- were delimited by white space. unlines and unwords are the
> > -- inverse operations. unlines joins lines with terminating
> > -- newlines, and unwords joins words with separating spaces.
> >
> > I think the use of "inverse" is potentially confusing given,
> > well, they aren't inverses (or even left or right inverses).
> >
> > Ian, who thinks (unlines . lines == id) would have been useful. Oh well.
>
> Well, you do have
>
> lines . unlines = id
> unlines . lines . unlines == unlines
> words . unwords . words = words
>
> (unwords . words . unwords) cannot be simplified to unwords, though;
> the results will differ on input that contains 'words' with leading,
> trailing, or multiple consecutive spaces.
>
> However, if you observe a few reasonable constraints on the input to
> any of the functions, you can get it back by feeding the output to the
> 'inverse' function:
>
> words: input must have no leading or trailing blanks
> lines: input must end in a newline
> unwords: input list elements must be non-empty and must not
> contain blanks
> unlines: input list elements must not contain newlines
To put what Lars is saying in a more `puffed up' way, there's an idea of a
`canonical form' for a string containing words and lines, and you do
have
unlines.lines == toCanonicalRep
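
As a rough sketch of what that means in practice (toCanonicalRep is the
name used above; isCanonical and the two prop* helpers are hypothetical
names introduced only for illustration), the canonical form can be probed
like this:

```haskell
-- Sketch only: treats (unlines . lines) as the canonicalisation function.
toCanonicalRep :: String -> String
toCanonicalRep = unlines . lines

-- A string is in canonical form when canonicalising leaves it unchanged;
-- this holds for the empty string and for strings ending in a newline.
isCanonical :: String -> Bool
isCanonical s = toCanonicalRep s == s

-- Canonicalising twice gives the same result as canonicalising once.
propIdempotent :: String -> Bool
propIdempotent s = toCanonicalRep (toCanonicalRep s) == toCanonicalRep s

-- lines . unlines behaves as the identity, under Lars's constraint that
-- no list element contains a newline (other inputs are skipped).
propLinesUnlines :: [String] -> Bool
propLinesUnlines xs = any (elem '\n') xs || lines (unlines xs) == xs
```

For instance, toCanonicalRep "a\nb" gives "a\nb\n": the only thing the
round trip changes is the missing final newline.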
It's no different from having a representation of polynomials as sums of
powers, using lists of (coefficient,power) pairs ordered by descending
power, so that, e.g., 2 x^2 + 1 would be represented by either
[(2,2),(1,0)] or [(2,2),(0,1),(1,0)], since zero coefficients don't
affect things. Given functions `factorise', which converts to a product
of factors representation, and `expand', which expands a product of
factors, these are inverses in the sense that
f.expand.factorise == f
__providing f does not give different values depending upon coefficients
which are zero__, i.e., f is completely defined by its action on the
canonical forms of the polynomials. There'd be no `cognitive dissonance'
calling them inverses in this case because zero coefficients don't have
any interesting effects on polynomials (e.g., changing the length of the
coefficient list isn't interesting; if they somehow added new zeroes they
would become interesting). But arguably for strings precisely what the
whitespace originally was __is__ significant and we don't naturally have a
mental model with the canonical form given by Lars, so maybe replacing
`inverse' with the more vague `converse' might be good.
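
The polynomial analogy can be made concrete with a small sketch. It does
not implement factorise or expand; it assumes, purely for illustration,
that the round trip expand.factorise acts like dropping zero-coefficient
terms, and the names Poly, canonical, evalPoly and termCount are all
hypothetical:

```haskell
-- (coefficient, power) pairs, ordered by descending power.
type Poly = [(Integer, Integer)]

-- Assumed stand-in for expand . factorise: drop zero-coefficient terms.
canonical :: Poly -> Poly
canonical = filter ((/= 0) . fst)

-- Evaluation ignores zero coefficients, so it is completely defined by
-- its action on canonical forms: evalPoly . canonical == evalPoly.
evalPoly :: Poly -> Integer -> Integer
evalPoly p x = sum [c * x ^ n | (c, n) <- p]

-- Counting stored terms does depend on zero coefficients, so it is the
-- kind of f for which the round trip is observable.
termCount :: Poly -> Int
termCount = length
```

Here evalPoly cannot tell [(2,2),(0,1),(1,0)] from [(2,2),(1,0)], whereas
termCount can, which is the polynomial counterpart of caring exactly what
the whitespace in a string was.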
___cheers,_dave_________________________________________________________
www.cs.bris.ac.uk/~tweed/ | `It's no good going home to practise
email:tweed@cs.bris.ac.uk | a Special Outdoor Song which Has To Be
work tel:(0117) 954-5250 | Sung In The Snow' -- Winnie the Pooh