[Haskell-cafe] really difficult for a beginner like me...

Sun May 4 08:24:04 EDT 2008

Hi everyone,

this is my first posting on here and this was what drove me here and the quest to know more as i anticipate a lot of help and direction in this quite new and different environment haskell.I have this paper that i'm working on and need to solve these scenarios/ cases based on some sample codes:

Scenarios/cases:

1)                  Allow words to be hyphenated and treat the hyphenated word as a single word (including the hyphen).                          

2)                  As for no. 2 but if the hyphen is the last character on a line treat the hyphenated word as a single word without the hyphen.                                                                                     

3)                  Treat a capitalised word (one or more capital letters) the same as lower case, i.e. only the lower case word appears in the index.                                                                                  
4)                  Treat a word ending in an ‘s’ as a plural and thus the same as the singular, i.e. only the singular appears in the index.                       
5)                  As for no. 5 but (a) treat suffix ‘ss’ as not a plural; and (b) treat the plural suffixes ‘sses’, ‘zzes’, ‘oes’, ‘xes’, ‘shes’, ‘ches’ the same as the singular, i.e. without the ‘es’, e.g. “branches” (except for 4- and 5-letter plurals with suffices ‘oes’  and ‘ches’, e.g. “floes”, and 4-letter plural suffix ‘xes’); and (c) treat the plural suffix  ‘ies’ (except for 4-letter plurals, e.g. “pies”) as the singular suffix ‘y’.                                                               
 6)                  Make the output more readable in the form of an index table.
 7)                  Use an output file                                                                         
8)                  Include a user-friendly menu by which the user can choose input and output file names.               

This is the code i'm supposed tomodify and in some cases create new functions to support.I also need some explanation as the various approaches in solving them..............................................................................................

The function makeIndex given a document produces a list of entries.  

Each entry is a word and a list of line numbers (for words > 4 letters)

Type definitions:

import Prelude -- hiding (Word)
--  predefined Word hidden, so we can define ours
--    type String     =  [Char]        defined in Prelude
      type Doc         =  String
      type Line        =  String
      type Word      =  String         --  our version
makeIndex  ::  Doc  ->  [ ([Int], Word) ]
A data-directed design considers a sequence of functions (i.e using composition operator ‘.’) to transform the document of type, Doc, into an index of type, [ ([Int], Word) ].

splitUp                   the document, doc, into a list of lines, [Line].
numLines             pairs each line with a line number, [(Int, Line)].
allNumWords       splits lines into words and line no., [(Int, Word)].
sortLs                    sorts words into alphabetical order, [(Int, Word)].
makeLists              makes a list for each line number, [([Int], Word)].
amalgamate          nos. into a list of nos. for each word, [([Int], Word)].
shorten                  into a list for words > 4 letters, [([Int], Word)].

makeIndex
 =   shorten .                     --          [([Int], Word)]            -> [([Int], Word)]
      amalgamate .              --          [([Int], Word)]            -> [([Int], Word)]
      makeLists .                 --          [(Int, Word)]  -> [([Int], Word)]
      sortLs .                       --          [(Int, Word)]  -> [(Int, Word)]
      allNumWords .          --          [(Int, Line)]    -> [(Int, Word)]
      numLines .                 --          [Line]              -> [(Int, Line)]
      splitUp                        --          Doc                 -> [Line]
      Last                             --         [a]                    -> [a ]

splitUp function

splitUp :: Doc -> [Line]
splitUp [] = []
splitUp  text
 = takeWhile (/='\n') text :                       --          first line
   (splitUp .                                                --          splitup other lines
    dropWhile (==’\n’) .                            --          delete 1st newline(s)
    dropWhile (/='\n')) text                       --          other lines 

Example:
      splitUp “hello world\n\nnext world”
=>  [“hello world”, “next world”]

numLines function:

numLines :: [Line] -> [(Int, Line)]
numLines lines                                         --          list of pairs of 
 = zip [1 .. length lines] lines                    --          line no. & line

Example:
      numLines  [“hello world”, “next world”]
=>  [(1, “hello world”), (2, “next world”)]

splitWords function:

--    for each line
--    a)         split into words
--    b)         attach line no. to each word

splitWords :: Line -> [Word]                 --          a)
splitWords [ ] = [ ]
splitWords  line 
 = takeWhile isLetter line :                     --          first word in line
      (splitWords .                                      --          split other words
        dropWhile (not.isLetter && Last ==’-’) .             --          delete separators
        dropWhile isLetter) line                  --          other words
   where
   isLetter ch
      =           (‘a’<=ch) && (ch<=’z’)
        || (‘A’<=ch) && (ch<=’Z’)
Example:
      splitWords  “hello world”  =>  [“hello”, “world”]

allNumWords function:

numWords :: (Int, Line) -> [(Int, Word)]                      --          b)
numWords (number, line)
 = map addLineNum ( splitWords line)             --          all line pairs
   where
   addLineNum word = (number, word)                       --          a pair

allNumWords :: [(Int, Line)] -> [(Int, Word)]
allNumWords = concat . map numWords                    --          doc pairs 

Examples:
      addLineNum  “hello”                         =>  (1, “hello”)
      numWords  (1, “hello world”)  =>  [(1, “hello”), (1, “world”)]
      allNumWords [(1, “hello world”), (2, “next world”)]
      =>        [(1, “hello”), (1, “world”), (2, “next”), (2, “world”)]

SortLs function:

sortLs :: [(Int, Word)] -> [(Int, Word)]
sortLs  [ ]  =  [ ]
sortLs (a:x)
 = sortLs [b | b <- x, compare b a]                                  --          sort 1st half
   ++  [a]  ++                                                                      --          1st in middle
   sortLs [b | b <- x, compare a b]                                   --          sort 2nd half
    where
    compare (n1, w1) (n2, w2)
     = (w1 < w2)                                                                 --          1st word less
         || (w1 == w2 && n1 < n2)                                      --          check no. 

Example:
      sortLs [(1, “hello”), (1, “world”), (2, “next”), (2, “world”)]
=>  [(1, “hello”), (2, “next”), (1, “world”), (2, “world”)]

makeLists function:

makeLists :: [(Int, Word)] -> [([Int], Word)]
makeLists 
 = map mk                                                                        --  all pairs
   where mk (num, word) = ([num], word)
                                                                                          --  list of single no. 
Examples:
      mk  (1, “hello”)           =>  ([1], “hello”)

      makeLists        [(1, “hello”), (2, “next”), (1, “world”), (2, “world”)]
=>  [([1], “hello”), ([2], “next”), ([1], “world”), ([2], “world”)]

Amalgamate function:

amalgamate :: [([Int], Word)] -> [([Int], Word)]
amalgamate [ ] = [ ]
amalgamate [a] = [a]
amalgamate ((n1, w1) : (n2, w2) : rest)                                      --  pairs of pairs
 | w1 /= w2            = (n1, w1) : amalgamate ((n2, w2) : rest)
 | otherwise           = amalgamate ((n1 ++ n2, w1) : rest)
                                                      --          if words are same grow list of numbers

Example:

      amalgamate  [([1], “hello”), ([2], “next”), ([1], “world”), ([2], “world”)]
=>  [([1], “hello”), ([2], “next”), ([1, 2], “world”)]

Shorten function:

shorten :: [([Int], Word)] -> [([Int], Word)]
shorten
 = filter long                                                         --          keep pairs >4
   where
   long (num, word) = length word > 4               --          check word >4

Example:

      shorten  [([1], “hello”), ([2], “next”), ([1, 2], “world”)]
=> [([1], “hello”), ([1, 2], “world”)]

_________________________________________________________________
Connect to the next generation of MSN Messenger 
http://imagine-msn.com/messenger/launch80/default.aspx?locale=en-us&source=wlmailtagline
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20080504/6d99c75c/attachment.htm