[Haskell-beginners] How to make this more functional?
Frerich Raabe
raabe at froglogic.com
Tue Jul 18 06:32:15 UTC 2017
On 2017-07-18 06:10, Jeffrey Brown wrote:
> I wrote a 10-line program[1] for converting from org-mode format to
> smsn-mode format. Both formats use indentation to indicate hierarchy. In
> org, a line at level k (levels are positive integers) starts with k
> asterisks, followed by a space, followed by the text of the line. In
> smsn-mode, a line at level k starts with 4*(k-1) spaces, followed by an
> asterisk, followed by a space.
>
> I feel like there ought to be an intermediate step where it converts the
> data to something other than string -- for instance,
>
> data IndentedLine = IndentedLine Int String | BadLine
>
> and then generates the output from that.
I think generating an intermediate data structure makes a lot of sense.
To make your program more 'functional', I'd start by factoring out the IO
part as early as possible. I.e. consider your program to be a function of
type 'String -> String': it consumes a string, and yields a string:
    reformat :: String -> String
Now, reformatting the input means splitting it into lines, converting each
line, and then merging the lines into a single string again, i.e. we can
define 'reformat' as:
    reformat input = unlines (map convertLine (lines input))
To make this type-check, clearly you need some functions with the types
    lines :: String -> [String]
    convertLine :: String -> String
    unlines :: [String] -> String
As it happens, the first and the last function are part of the standard
library, so we only need to worry about 'convertLine'. Converting a line
means parsing the input line and then serialising the parsed data to the
output format, i.e.
    convertLine line = serialiseToOutput (parseLine line)
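Before filling in the parsing part, it may help to see the outer
split/convert/join shape on its own. The following is only a small sketch:
'shout' and 'shoutAll' are hypothetical names, with 'shout' standing in for
'convertLine' so that the example can be loaded into ghci as it is.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for 'convertLine': upper-cases a single line.
shout :: String -> String
shout = map toUpper

-- Same shape as 'reformat': split into lines, convert each line,
-- then join the lines into one string again.
shoutAll :: String -> String
shoutAll input = unlines (map shout (lines input))
```

Note that 'unlines' terminates every line with a newline, so input that
lacks a trailing newline comes out with one.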
At this point, some sort of data structure to pass from parseLine to
serialiseToOutput would be useful. You could certainly go for the
'IndentedLine' type you sketched, i.e. the parseLine function can be declared
to be of type
    parseLine :: String -> IndentedLine
I'll skip defining this function, but the 'span' function (exported by the
Prelude as well as by Data.List) may be useful here. With that at hand, you
only need to define the serialiseToOutput function which (in order to make
this program type-check) needs to be of type
    serialiseToOutput :: IndentedLine -> String
Again, I'll omit the definition here (but the 'replicate' function would
probably be useful).
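For reference, here is one way the two omitted definitions could look. This
is a sketch under assumptions rather than the definitive solution: it uses
the 'IndentedLine' type from the question, treats any line that does not
start with at least one asterisk followed by a space as 'BadLine', and
renders a 'BadLine' as an empty line (the thread does not say what should
happen to such lines).

```haskell
-- The intermediate type sketched in the original question.
data IndentedLine = IndentedLine Int String | BadLine
  deriving (Eq, Show)

-- Parse an org-mode line: k >= 1 asterisks, one space, then the text.
-- Assumption of this sketch: anything else counts as a BadLine.
parseLine :: String -> IndentedLine
parseLine line =
  case span (== '*') line of
    (stars@(_:_), ' ':text) -> IndentedLine (length stars) text
    _                       -> BadLine

-- Serialise to smsn-mode: 4*(k-1) spaces, an asterisk, a space, the text.
-- Assumption of this sketch: a BadLine becomes an empty line.
serialiseToOutput :: IndentedLine -> String
serialiseToOutput (IndentedLine k text) =
  replicate (4 * (k - 1)) ' ' ++ "* " ++ text
serialiseToOutput BadLine = ""

convertLine :: String -> String
convertLine line = serialiseToOutput (parseLine line)

reformat :: String -> String
reformat input = unlines (map convertLine (lines input))
```

In ghci, 'putStr (reformat "* a\n** b\n")' should then print the first line
unindented and the second line indented by four spaces.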
At this point, you should have your 'reformat' function fully defined and
usable from within 'ghci', i.e. you can nicely test it with some manual
input. What's missing is to use it in a real program - you could of course
plug it into your existing program calling 'readFile', but as a last idea I'd
like to mention the standard 'interact' function which, given a function of
type 'String -> String', yields an IO action which reads some input from
stdin, applies the given function to it, and then prints the output to
stdout. A useful helper for defining UNIX-style filter programs.
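A minimal sketch of such a filter program follows. To keep the example
self-contained, 'reformat' is only a placeholder here that passes its input
through unchanged; in the real program it would be the function described
earlier.

```haskell
module Main (main) where

-- Placeholder for the real 'reformat'; it returns its input unchanged
-- so that this file compiles on its own.
reformat :: String -> String
reformat = id

-- 'interact' reads all of stdin, applies the function to it, and
-- writes the result to stdout.
main :: IO ()
main = interact reformat
```

Assuming the file is saved as Filter.hs (a hypothetical name), it could be
run as 'runghc Filter.hs < notes.org > notes.smsn', where both file names
are likewise just examples.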
I believe one lesson to take from this is to not think about how the program
does something ('count the number of * characters', etc.) but rather think
about _what_ the program does - in this case, in a top-down fashion. Also, in
Haskell, this type-driven development works quite nicely to yield programs
which you can tinker with very early on.
--
Frerich Raabe - raabe at froglogic.com
www.froglogic.com - Multi-Platform GUI Testing