[Haskell-beginners] How to make this more functional?

Frerich Raabe raabe at froglogic.com
Tue Jul 18 06:32:15 UTC 2017


On 2017-07-18 06:10, Jeffrey Brown wrote:
> I wrote a 10-line program[1] for converting from org-mode format to 
> smsn-mode format. Both formats use indentation to indicate hierarchy. In
> org, a line at level k (levels are positive integers) starts with k 
> asterisks, followed by a space, followed by the text of the line. In
> smsn-mode, a line at level k starts with 4*(k-1) spaces, followed by an 
> asterisk, followed by a space.
> 
> I feel like there ought to be an intermediate step where it converts the 
> data to something other than string -- for instance,
> 
> data IndentedLine = IndentedLine Int String | BadLine
> 
> and then generates the output from that.

I think generating an intermediate data structure makes a lot of sense.

To make your program more 'functional', I'd start by factoring out the IO 
part as early as possible. I.e. consider your program to be a function of 
type 'String -> String': it consumes a string, and yields a string:

   reformat :: String -> String

Now, reformatting the input means splitting it into lines, converting each 
line, and then merging the lines into a single string again, i.e. we can 
define 'reformat' as:

   reformat input = unlines (map convertLine (lines input))

To make this type-check, clearly you need some functions with the types

   lines :: String -> [String]
   convertLine :: String -> String
   unlines :: [String] -> String
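
To see how these three types fit together before 'convertLine' exists, here 
is a minimal sketch. The names 'reformatWith' and 'shout' are hypothetical 
stand-ins and not part of the original program; since the conversion handles 
a single line, it is applied to every element of the list with 'map'.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: like 'reformat', but the per-line conversion is
-- passed in, so the pipeline can be tried before 'convertLine' is written.
reformatWith :: (String -> String) -> String -> String
reformatWith convert input = unlines (map convert (lines input))

-- Stand-in conversion used only for illustration: upper-case one line.
shout :: String -> String
shout = map toUpper
```

In ghci, 'reformatWith shout "* a\n** b\n"' evaluates to "* A\n** B\n". Note 
that 'unlines' terminates every line with a newline, so input without a 
trailing newline gains one.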

As it happens, the first and the last function are part of the standard 
library, so we only need to worry about 'convertLine'. Converting a line 
means parsing the input line and then serialising the parsed data to the 
output format, i.e.

   convertLine line = serialiseToOutput (parseLine line)

At this point, some sort of data structure to pass from parseLine to 
serialiseToOutput would be useful. You could certainly go for the 
'IndentedLine' type you sketched, i.e. the parseLine function can be declared 
to be of type

   parseLine :: String -> IndentedLine

I'll skip defining this function, but the 'span' function (available from 
the Prelude and from Data.List) could be useful here. With that in hand, you 
only need to define the serialiseToOutput function which (in order to make 
this program type-check) needs to be of type

   serialiseToOutput :: IndentedLine -> String

Again, I'll omit the definition here (but the 'replicate' function would 
probably be useful).
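
For readers who would like to compare notes after trying it themselves, one 
possible way to fill in the two omitted functions is sketched below. It is 
only a sketch under assumptions: the original question does not say what 
should happen to a 'BadLine', so emitting an empty line for it is a choice 
made here purely for illustration.

```haskell
-- The intermediate type sketched in the original question.
data IndentedLine = IndentedLine Int String | BadLine
  deriving (Eq, Show)

-- 'span' splits off the leading asterisks. A valid org line has at least
-- one asterisk, then a space, then the text; anything else is a BadLine.
parseLine :: String -> IndentedLine
parseLine line =
  case span (== '*') line of
    (stars@(_:_), ' ' : text) -> IndentedLine (length stars) text
    _                         -> BadLine

-- A line at level k becomes 4*(k-1) spaces, an asterisk, a space, and
-- then the text.
serialiseToOutput :: IndentedLine -> String
serialiseToOutput (IndentedLine k text) =
  replicate (4 * (k - 1)) ' ' ++ "* " ++ text
-- Assumption: unparseable lines become empty lines in the output.
serialiseToOutput BadLine = ""

convertLine :: String -> String
convertLine line = serialiseToOutput (parseLine line)
```

With these definitions, 'convertLine "** b"' gives "    * b", i.e. four 
spaces of indentation for level 2.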

At this point, you should have your 'reformat' function fully defined and 
usable from within 'ghci', i.e. you can nicely test it with some manual 
input. What's missing is to use it in a real program - you could of course 
plug it into your existing program calling 'readFile', but as a last idea I'd 
like to mention the standard 'interact' function which, given a function of 
type 'String -> String', yields an IO action which reads some input from 
stdin, applies the given function to it, and then prints the output to 
stdout. A useful helper for defining UNIX-style filter programs.
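
A minimal sketch of such a filter follows. To keep it self-contained, 
'convertLine' is a placeholder that passes each line through unchanged; the 
real conversion would take its place.

```haskell
-- 'interact' reads all of stdin, applies the function, and writes the
-- result to stdout.
main :: IO ()
main = interact reformat

reformat :: String -> String
reformat input = unlines (map convertLine (lines input))

-- Placeholder for illustration only: leaves every line as it is.
convertLine :: String -> String
convertLine = id
```

Saved under a file name of your choice (say Filter.hs, a hypothetical name), 
it could be run as 'runghc Filter.hs < input.org > output.txt'.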

I believe one lesson to take from this is to not think about how the program 
does something ('count the number of * characters', etc.) but rather think 
about _what_ the program does - in this case, in a top-down fashion. Also, in 
Haskell, this type-driven development works quite nicely to yield programs 
which you can tinker with very early on.
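
As an illustration of tinkering early, the sketch below stubs the two 
unwritten functions with 'undefined'. This is one common way to get a 
type-checking skeleton, not necessarily how the program in question was 
developed. The module loads in ghci, so ':type reformat' can already be 
inspected, even though the stubs would fail if they were ever evaluated.

```haskell
data IndentedLine = IndentedLine Int String | BadLine

reformat :: String -> String
reformat input = unlines (map convertLine (lines input))

convertLine :: String -> String
convertLine line = serialiseToOutput (parseLine line)

-- Stub: type-checks, but raises an exception if evaluated.
parseLine :: String -> IndentedLine
parseLine = undefined

-- Stub: type-checks, but raises an exception if evaluated.
serialiseToOutput :: IndentedLine -> String
serialiseToOutput = undefined
```

Thanks to lazy evaluation, 'reformat ""' already returns "" here, because 
with no input lines the stubs are never called.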

-- 
Frerich Raabe - raabe at froglogic.com
www.froglogic.com - Multi-Platform GUI Testing

