[Haskell-cafe] Difficulties in accessing inner elements of data types
David Miani
davidmiani at gmail.com
Tue Mar 3 05:57:57 EST 2009
Hi,
I'm working on a Haskell library for interacting with emacs org files. For
those that do not know, an org file is a structured outline style file that
has nested headings, text, tables and other elements. For example:
* Heading 1
Some text, more text. This is a subelement of Heading 1
1. You can also have list
1. and nested lists
2. more...
** Nested Heading (subelement of Heading 1)
text... (subelement of Nested Heading)
** Another level 2 heading (subelement of Heading 1)
| Desc | Value |
|-----------+--------------------------------------|
| Table | You can also have tables in the file |
| another | row |
| separator | you can have seps as well, eg |
|-----------+--------------------------------------|
* Another top level heading
There are many more features, see orgmode.org
My library enables read and write access to a subset of this format (e.g. lists
aren't parsed at the moment).
The data structures used for writing are:
data OrgFile = OrgFile [OrgFileElement]
data OrgFileElement = Table OrgTable
                    | Paragraph String
                    | Heading OrgHeading
-- heading level, title, subelements
data OrgHeading = OrgHeading Int String [OrgFileElement]
data OrgTable = OrgTable [OrgTableRow]
data OrgTableRow = OrgTableRow [String] | OrgTableRowSep
To write a file you construct an OrgFile out of those elements and pass it to
the writeOrg function. E.g.:
writeOrg $ OrgFile [Heading (OrgHeading 1 "h1" [Paragraph "str"])]
would produce:
* h1
str
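To make the mapping from the data structure to the output concrete, here is a
minimal sketch of a pure renderer. It is not the library's actual writeOrg:
the names renderOrg, renderElement and renderRow are hypothetical, table cells
are not padded to a common width, and separator rows are written in the short
"|-" form.

```haskell
import Data.List (intercalate)

data OrgFile = OrgFile [OrgFileElement]
data OrgFileElement = Table OrgTable
                    | Paragraph String
                    | Heading OrgHeading
-- heading level, title, subelements
data OrgHeading = OrgHeading Int String [OrgFileElement]
data OrgTable = OrgTable [OrgTableRow]
data OrgTableRow = OrgTableRow [String] | OrgTableRowSep

-- Hypothetical pure renderer: one output line per list entry.
renderOrg :: OrgFile -> String
renderOrg (OrgFile es) = unlines (concatMap renderElement es)

renderElement :: OrgFileElement -> [String]
renderElement (Paragraph s) = lines s
renderElement (Heading (OrgHeading lvl title kids)) =
  -- a level-n heading starts with n stars, followed by its subelements
  (replicate lvl '*' ++ " " ++ title) : concatMap renderElement kids
renderElement (Table (OrgTable rows)) = map renderRow rows

renderRow :: OrgTableRow -> String
renderRow OrgTableRowSep      = "|-"  -- unpadded separator row
renderRow (OrgTableRow cells) = "| " ++ intercalate " | " cells ++ " |"
```

With these definitions, renderOrg applied to the OrgFile from the example
above yields the two lines "* h1" and "str".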
I was going to use the same data structures for reading an org file, but it
quickly became apparent that this would not be suitable, as you need the
position of an element in the file to be able to report errors. E.g. if you
need to report that a number was expected, the message "'cat' is
not a number" is not very useful, but "Line 2031: 'cat' is not a number" is.
So the data structures I used were:
data FilePosition = FilePosition Line Column
data WithPos a = WithPos {
    filePos :: FilePosition,
    innerValue :: a
}
data OrgTableP = OrgTableP [WithPos OrgTableRow]
data OrgFileElementP = TableP OrgTableP
                     | ParagraphP String
                     | HeadingP OrgHeadingP
data OrgHeadingP = OrgHeadingP Int String [WithPos OrgFileElementP]
data OrgFileP = OrgFileP [WithPos OrgFileElementP]
Finally there is a function readOrg, which takes a string, and returns an
OrgFileP.
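Since the two families of types mirror each other, a parsed file can be turned
into the writable form by dropping the positions. The following is a sketch
under some assumptions: Line and Column are taken to be Int, the deriving
clauses are added only so results can be compared and printed, and the names
stripFile and stripElement are hypothetical.

```haskell
type Line = Int    -- assumption: the post does not define Line
type Column = Int  -- assumption: the post does not define Column

-- Types used for writing
data OrgFile = OrgFile [OrgFileElement] deriving (Eq, Show)
data OrgFileElement = Table OrgTable
                    | Paragraph String
                    | Heading OrgHeading
                    deriving (Eq, Show)
data OrgHeading = OrgHeading Int String [OrgFileElement] deriving (Eq, Show)
data OrgTable = OrgTable [OrgTableRow] deriving (Eq, Show)
data OrgTableRow = OrgTableRow [String] | OrgTableRowSep deriving (Eq, Show)

-- Types used for reading, carrying file positions
data FilePosition = FilePosition Line Column
data WithPos a = WithPos {
    filePos :: FilePosition,
    innerValue :: a
}
data OrgTableP = OrgTableP [WithPos OrgTableRow]
data OrgFileElementP = TableP OrgTableP
                     | ParagraphP String
                     | HeadingP OrgHeadingP
data OrgHeadingP = OrgHeadingP Int String [WithPos OrgFileElementP]
data OrgFileP = OrgFileP [WithPos OrgFileElementP]

-- Forget every position, keeping only the document structure.
stripFile :: OrgFileP -> OrgFile
stripFile (OrgFileP es) = OrgFile (map (stripElement . innerValue) es)

stripElement :: OrgFileElementP -> OrgFileElement
stripElement (ParagraphP s) = Paragraph s
stripElement (TableP (OrgTableP rows)) = Table (OrgTable (map innerValue rows))
stripElement (HeadingP (OrgHeadingP lvl title kids)) =
  Heading (OrgHeading lvl title (map (stripElement . innerValue) kids))
```

The conversion only goes one way: going from OrgFile back to OrgFileP would
require inventing positions.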
Now, this all works as expected (files are correctly being parsed and
written), however I am having a lot of trouble trying to come up with a decent
API to work with this. While writing an OrgFile is fairly easy, reading (and
accessing inner parts) of an org file is very tedious, and modifying them is
horrendous.
For example, to read the description line for the project named "Project14" in
the file:
* 2007 Projects
** Project 1
Description: 1
Tags: None
** Project 2
Tags: asdf,fdsa
Description: hello
* 2008 Projects
* 2009 Projects
** Project14
Tags: RightProject
Description: we want this
requires the code:
import Text.Regex.Posix ((=~))  -- assumed source of (=~)

type ErrorS = String

-- First element of the list, or the given error if the list is empty.
listToEither :: ErrorS -> [a] -> Either ErrorS a
listToEither str []    = Left str
listToEither _   (x:_) = Right x

get14 :: OrgFileP -> Either ErrorS String
get14 (OrgFileP elements) =
    getDesc =<< (getRightProject . concatProjects) elements
  where
    -- Collect the headings nested directly under each top-level heading
    -- (the projects sit one level below the "200x Projects" headings).
    concatProjects :: [WithPos OrgFileElementP] -> [OrgHeadingP]
    concatProjects [] = []
    concatProjects (WithPos _ (HeadingP (OrgHeadingP _ _ kids)) : rest) =
        [h | WithPos _ (HeadingP h) <- kids] ++ concatProjects rest
    concatProjects (_ : rest) = concatProjects rest

    getRightProject :: [OrgHeadingP] -> Either ErrorS OrgHeadingP
    getRightProject = listToEither "Couldn't find project14" .
        filter (\(OrgHeadingP _ name _) -> name == "Project14")

    getDesc :: OrgHeadingP -> Either ErrorS String
    getDesc (OrgHeadingP _ _ children) =
        case filter paragraphWithDesc (map innerValue children) of
            -- prefix the position of the first child, if there is one
            [] -> Left $ concat [show (filePos c) ++ ": " | c <- take 1 children]
                         ++ "Couldn't find desc in project"
            (ParagraphP str : _) -> Right str
            _ -> error "should not be possible"

    paragraphWithDesc :: OrgFileElementP -> Bool
    paragraphWithDesc (ParagraphP str) = str =~ "Description"
    paragraphWithDesc _ = False
If you think that is bad, try writing a function that adds the tag "Hard" to
Project 2 :(
What I really need is a DSL that would allow SQL-like queries on an OrgFileP.
For example:
select (anyHeading `next`
        headingWithName "Project14" `withFailMsg` "couldn't find p14" `next`
        paragraphMatchingRegex "Description" `withFailMsg` "no desc")
    org `output` paragraphText
would return a String
OR
select (anyHeading `next` headingWithName "Project 2" `next`
        paragraphMatchingRegex "Tags:") org `modify` paragraphText (++ ",Hard")
would return an OrgFile, with the new Hard tag added.
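As a rough illustration of the semantics intended for `next` and
`withFailMsg`, the read-only half could be sketched along these lines. This is
only a sketch under simplifying assumptions: it works on the position-free
OrgFile types rather than OrgFileP, it uses a substring test
(paragraphContaining, a hypothetical stand-in for paragraphMatchingRegex) to
stay within base, `output` returns an Either rather than a bare String, and it
does not attempt `modify` at all. The Path type and all function bodies are
made up for the example.

```haskell
import Data.List (isInfixOf)
import Data.Maybe (mapMaybe)

data OrgFile = OrgFile [OrgFileElement]
data OrgFileElement = Table OrgTable
                    | Paragraph String
                    | Heading OrgHeading
data OrgHeading = OrgHeading Int String [OrgFileElement]
data OrgTable = OrgTable [OrgTableRow]
data OrgTableRow = OrgTableRow [String] | OrgTableRowSep

type ErrorS = String

-- A path step filters a list of candidate elements, or fails with a message.
type Path = [OrgFileElement] -> Either ErrorS [OrgFileElement]

children :: OrgFileElement -> [OrgFileElement]
children (Heading (OrgHeading _ _ cs)) = cs
children _                             = []

anyHeading :: Path
anyHeading es = Right [e | e@(Heading _) <- es]

headingWithName :: String -> Path
headingWithName n es = Right [e | e@(Heading (OrgHeading _ t _)) <- es, t == n]

paragraphContaining :: String -> Path
paragraphContaining s es = Right [e | e@(Paragraph str) <- es, s `isInfixOf` str]

-- Run p, then run q on the children of everything p matched.
next :: Path -> Path -> Path
next p q es = p es >>= q . concatMap children

-- Turn an empty match into an error; earlier errors pass through unchanged.
withFailMsg :: Path -> String -> Path
withFailMsg p msg es = case p es of
  Right [] -> Left msg
  r        -> r

select :: Path -> OrgFile -> Either ErrorS [OrgFileElement]
select p (OrgFile es) = p es

-- Project the first selected element that the accessor accepts.
output :: Either ErrorS [OrgFileElement] -> (OrgFileElement -> Maybe a) -> Either ErrorS a
output r f = r >>= \es -> case mapMaybe f es of
  (x:_) -> Right x
  []    -> Left "selection is empty"

paragraphText :: OrgFileElement -> Maybe String
paragraphText (Paragraph s) = Just s
paragraphText _             = Nothing
```

Because backquoted operators default to infixl 9, each `withFailMsg` in a
chain applies to the whole path built so far, which is the reading the first
example relies on.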
However, I don't know if this is even possible, how to do it, or if there is a
better alternative to this. I would really appreciate any hints with regard to
this. It would be useful to know if there are other libraries that also face
this problem, and how they solved it.
Finally, I would be grateful for any other advice regarding my code. One thing
that has bugged me is my solution for having file position info - my solution
never seemed very elegant.
Thanks,
David