[Haskell-cafe] Software Tools in Haskell

Wed Dec 12 13:51:58 EST 2007

Gwern Branwen wrote:
> Some of those really look like they could be simpler, like 'copy' -
> couldn't that simply be 'main = interact (id)'?
> 
> Have you seen <http://haskell.org/haskellwiki/Simple_Unix_tools>?
> 
> For example, 'charcount' could be a lot simpler - 'charcount = showln
> . length' would work, wouldn't it, for the core logic, and the whole
> thing might look like:
> 
>> main = (putStr . showln . length) =<< getContents
> 
> Similarly wordcount could be a lot shorter, like 'wc_l = showln .
> length . lines'
> 
> (showln is a convenience function: showln a = show a ++ "\n")

Yes, that's absolutely true, and I am adding a section showing 
implementations based on interact as soon as I send this message.  The 
reason I didn't do so before is that I was trying, to an extent, to 
preserve the structure of the original implementations, which means 
using an imperative style.
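
Roughly, the interact versions would look something like the sketch 
below.  It follows the definitions suggested above (showln, charcount, 
wc_l); the names linecount and wordcount are my own placeholders, and 
this is only an outline, not the final text of that section.

```haskell
module Main where

-- Convenience function from the quoted message.
showln :: Show a => a -> String
showln a = show a ++ "\n"

-- Each tool is a pure String -> String function; interact does the IO.
copy :: String -> String
copy = id

charcount :: String -> String
charcount = showln . length

-- Called wc_l in the quoted message.
linecount :: String -> String
linecount = showln . length . lines

wordcount :: String -> String
wordcount = showln . length . words

-- Swap in any of the functions above to get the corresponding tool.
main :: IO ()
main = interact charcount
```

The point of writing them this way is that the core of each tool can be 
tried out in GHCi on an ordinary string, with no IO involved.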

Strangely, I have considerably more confidence in the imperative-ish 
Haskell code than I do in the imperative Pascal code, in spite of the 
fact that they are essentially the same.  Probably this is due to the 
referential transparency that monadic IO preserves and that does not 
even enter into the picture in traditional Pascal.  For example, the 
pseudo-nroff implementation has a giant, horrible block of a record 
(containing the state taken directly from K&P) that is threaded through 
the program, but I am tolerably happy with it because I know that is the 
*only* state going through the program.
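
To make that concrete, here is a drastically cut-down sketch of the 
idea.  The record fields, the two-setting "formatter", and the handling 
of the three commands are all invented for illustration and are not the 
actual code; the only thing it is meant to show is that every piece of 
state is in one record that is passed in and handed back explicitly.

```haskell
module Main where

import Data.Char (isDigit)
import Data.List (mapAccumL)

-- Hypothetical, much-reduced stand-in for the formatter state record.
data Fmt = Fmt
  { fillMode    :: Bool  -- collapse whitespace and apply the margin?
  , rightMargin :: Int   -- column at which filled lines are cut off
  } deriving (Show, Eq)

initial :: Fmt
initial = Fmt { fillMode = True, rightMargin = 60 }

-- One input line in; the new state and any output lines out.
-- Nothing else can change between calls, because nothing else exists.
step :: Fmt -> String -> (Fmt, [String])
step st line = case words line of
  [".nf"]                  -> (st { fillMode = False }, [])
  [".fi"]                  -> (st { fillMode = True }, [])
  [".rm", n] | all isDigit n -> (st { rightMargin = read n }, [])
  _ | fillMode st          -> (st, [take (rightMargin st) (unwords (words line))])
    | otherwise            -> (st, [line])

-- Thread the record through every line of the input.
format :: String -> String
format = unlines . concat . snd . mapAccumL step initial . lines

main :: IO ()
main = interact format
```

Truncating at the margin is a placeholder for real line filling; the 
state threading is the part that carries over.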

Further, while interact could probably handle all of the filter-style 
programs (and if I understand correctly, could also work for the main 
loop of the interactive editor) and a similar function could handle the 
later file-reading programs, I do not see how to generalize that to the 
out-of-core sort program.

(Plus, interact is scary. :-D )

> I... I want to provide a one-liner for 'detab', but it looks
> impressively monstrous and I'm not sure I understand it.

If you think that's bad.... :-)

detab is one of the programs I do not like.  I kept the "direct 
translation" approach up through that, but I think it really hides the 
simplicity there; detab copies its input to its output, replacing each 
tab with 1-8 spaces depending on the column where the tab occurs in a 
line.  The only interesting state dealt with is the count of characters 
on each line, but that gets hidden, not emphasized.
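
Something along these lines would put the column count front and 
center.  This is a sketch under the assumption of fixed tab stops every 
8 columns; tabWidth and the helper go are names I made up for it.

```haskell
module Main where

-- Assumption: fixed tab stops every 8 columns.
tabWidth :: Int
tabWidth = 8

-- Expand tabs to spaces.  The current column is the only state, and it
-- is the explicit first argument of the worker function.
detab :: String -> String
detab = go 0
  where
    go _   []          = []
    go _   ('\n' : cs) = '\n' : go 0 cs          -- new line, column resets
    go col ('\t' : cs) =
      let n = tabWidth - col `mod` tabWidth      -- 1 to 8 spaces
      in  replicate n ' ' ++ go (col + n) cs
    go col (c : cs)    = c : go (col + 1) cs

main :: IO ()
main = interact detab
```

It is not a one-liner, but each case says what happens to the column, 
which is the part the direct translation buries.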

On the other hand, I'm not looking for one-liners; I really want clarity 
as opposed to cleverness.

> One final comment: as regards run-length encoding, there's a really
> neat way to do it. I wrote a little article on how to do it a while
> ago, so I guess I'll just paste it in here. :)

That *is* neat.
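
For anyone following along without the article: a common way to get 
run-length encoding in Haskell is to lean on Data.List.group.  The 
sketch below is that generic idiom, not a copy of the article's code, 
and the names rle and unrle are mine.

```haskell
module Main where

import Data.List (group)

-- Encode: group splits the input into runs of equal characters, and
-- each run becomes a (length, character) pair.
rle :: String -> [(Int, Char)]
rle s = [ (length run, c) | run@(c : _) <- group s ]

-- Decode: expand each pair back into its run.
unrle :: [(Int, Char)] -> String
unrle = concatMap (uncurry replicate)

main :: IO ()
main = do
  let encoded = rle "aaabccdddd"
  print encoded           -- [(3,'a'),(1,'b'),(2,'c'),(4,'d')]
  putStrLn (unrle encoded) -- aaabccdddd
```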

-- 
Tommy M. McGuire
mcguire at crsr.net