From simonmar@microsoft.com Tue Jan 30 11:37:04 2001 Date: Tue, 30 Jan 2001 03:37:04 -0800 From: Simon Marlow simonmar@microsoft.com Subject: A Haskell Documentation Standard
[ resending... mail problems at MS, sorry if you see this twice ]

Hi All,

Henrik - thanks for your comprehensive message!  I've set up the mailing
list on haskell.org (haskelldoc@haskell.org; go to
http://www.haskell.org/mailman/listinfo/haskelldoc).  Let's communicate
on there from now on.

My thoughts on the matter appear to differ slightly from yours and
Jan's, so I'll try to express my own priorities for a documentation
system.

   - we're all agreed that what is needed is an automatic
     documentation-generation tool that takes annotated Haskell
     and delivers documentation.  It's important that we don't
     tie ourselves to one particular documentation format, so
     I believe we need a separately-specified format for the
     in-source documentation.  This is one target for attack.

   - a big win is that even without any annotations, you get
     documentation for free (i.e. hyperlinked type signatures
     for exported functions, class definitions, etc.).  Just having
     this working would make me very happy :)

   - here's where we differ: instead of defining a new interface
     format and writing code to generate/parse the new format, I
     propose we define the format to be the same as Haskell source.
     Why?
        - we already have a pretty printer/parser for Haskell
          source.
        - we can start generating documentation from existing
          source straight away, without waiting for support from
          the compiler writers (who are generally a lazy bunch :-)

     It is true that by eliminating the compiler from the
     documentation path we lose the benefit of automatically derived
     types.  However, the compiler often infers ugly types anyway
     (type synonyms may be expanded), and it's always good practice
     to give type signatures to every exported function.  I bet there
     are very few functions in GHC's libraries that are exported
     without a type signature.

     Of course, I'm thinking more about "external" than "internal"
     documentation.  But even for internal documentation, I don't
     think it would be a big loss not to get types if they weren't
     explicitly mentioned in the source.  

     And there's an upgrade path: if we define the "interface-file"
     format to be exactly the same as the syntax of a Haskell module
     (perhaps with code optional), then the compiler can always just
     read a source file and output the same file with types filled in.

I see the priorities as (a) getting the format for the documentation
annotations specified, and (b) extending HDoc to do the business (and to
add GHC extensions :).

Cheers,
	Simon


From jans@numeric-quest.com Tue Jan 30 18:03:47 2001 Date: Tue, 30 Jan 2001 13:03:47 -0500 (EST) From: Jan Skibinski jans@numeric-quest.com Subject: A Haskell Documentation Standard
On Tue, 30 Jan 2001, Simon Marlow wrote:

>    - a big win is that even without any annotations, you get
>      documentation for free (i.e. hyperlinked type signatures
>      for exported functions, class definitions, etc.).  Just having
>      this working would make me very happy :)

	Right, but without the comments.
	The question of compiler support aside ...

>      And there's an upgrade path: if we define the "interface-file"
>      format to be exactly the same as the syntax of a Haskell module
>      (perhaps with code optional), then the compiler can always just
>      read a source file and output the same file with types filled in.

	Am I missing something here? I thought that the Haskell Report
	does not define the semantics of comments.
	They are the free floating entities, and can be put anywhere
	as it pleases a module developer. And we can have many possible
	types of comments describing: module, datatype, class, function,
	etc.
	Add to it some important decorations, delimiting some
	categories of functions, and we are in a complete mess.

	Given the variety of styles used, how on earth can even the
	cleverest parser figure it out? Typical parsers (I am not
	talking here about HDoc) do not care because they skip the
	comments anyway, but that is not the case for documentation
	extractors.

	And the scheme: [(E1, C1), (E2, C2) ...] where (Ei,Ci)
	are pairs of top level entities and their associated
	comments, may work sometimes under the assumption that
	Ci always precedes Ei, but this is only an assumption,
	which I can easily break anytime I wish: I can insert
	a category decoration somewhere in between or even reverse
	the order from Ci Ei to Ei Ci - which I often do because it
	pleases me so.

	This is a sad affair, because we could treat the problem
	the same way we treat the function signatures. The association
	between the function signatures (FSi) and their bodies (FBi)
	is not in any way positionally described. Sure, I have not
	seen code where someone writes:
	FS1 FS2 FS3 FB3 FB2 FB1, or FB1 FS1,
	but this is still OK - at least in Hugs, which I just checked
	to see if I am correct.
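
To make the point concrete, here is a small sketch with hypothetical
function names (double, triple, negateIt and square are invented for
illustration). The Haskell Report only requires a type signature to sit
in the same declaration list as its binding, so a module laid out like
this should load as written:

```haskell
-- Signatures first, bodies later and in reverse order (FS1 FS2 FS3 FB3 FB2 FB1).
double   :: Int -> Int
triple   :: Int -> Int
negateIt :: Int -> Int

negateIt x = negate x
triple x   = 3 * x
double x   = 2 * x

-- A body may also precede its own signature (FB1 FS1).
square x = x * x
square :: Int -> Int
```

The association between each signature and its body is made by name
alone, which is the property a documentation annotation could borrow.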

	There are a few possible solutions:
	1. Describe a precise positional standard for placement
	   of comments (see Eiffel (*) example)
	2. Invent markups to identify comment types and their
	   associations (see Java. But I am not talking here
	   about the variable markups, etc. for a moment.)
	3. Deliver development tools that force one particular
	   layout and use their own internal markup system, but
	   which is never visible to humans, unless they insist.
	   (See Smalltalk browsers enforcing the style. Eiffel
	   has it too, but admits old-fashioned editing, as long
	   as the result conforms to the standard.) 
	4. Bite the bullet and admit that comments are as important
	   as the code; extend the Haskell Report to make those
	   specialized comments part of the language
	...... etc. 

	All of those have their problems however.

	Jan

	P.S.
	Because our first few messages are not stored in the
	haskelldoc archive I am again giving the pointer to the Eiffel
	online book. Chapters 1 and 5 talk about documentation
	issues.

	http://www-staff.socs.uts.edu.au/~rist/eiffel/

	There are plenty of Eiffel sites. The oldest is the
	ISE site of Bertrand Meyer, the language inventor:

 	http://www.eiffel.com 

	 
	 
	






From simonmar@microsoft.com Wed Jan 31 11:46:27 2001 Date: Wed, 31 Jan 2001 03:46:27 -0800 From: Simon Marlow simonmar@microsoft.com Subject: A Haskell Documentation Standard
Jan Skibinski writes:

> 	Am I missing something here? I thought that the Haskell Report
> 	does not define the semantics of comments.
> 	They are the free floating entities, and can be put anywhere
> 	as it pleases a module developer. And we can have many possible
> 	types of comments describing: module, datatype, class, function,
> 	etc.
> 	Add to it some important decorations, delimiting some
> 	categories of functions, and we are in a complete mess.
> 
> 	Given the variety of styles used, how on earth can even the
> 	cleverest parser figure it out? Typical parsers (I am not
> 	talking here about HDoc) do not care because they skip the
> 	comments anyway, but that is not the case for documentation
> 	extractors.

Sorry for not being clear about this.  You're right in that trying to
understand arbitrary comments in Haskell source isn't workable.  The
documentation annotations must be in a special format that the
documentation tool can understand.

eg.  HDoc's {--- .. -} style comments.  I had in mind using Haskell's
pragma convention, like this:

{-# DOC f
    <desc><id>f</id> turns people into frogs</desc>
    <arg><id>x</id> A <type>Person</type></arg>
    <ret>A <type>Frog</type></ret>
#-}
f :: Person -> Frog
f x = ...

with similar annotations for classes, instances, datatypes, newtypes
etc.  There's no requirement that the documentation appears directly
before the source code for the function; since it contains the
identifier of the entity being documented, it can be placed anywhere
(even in a different file).
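
A minimal sketch of why naming the entity makes placement irrelevant.
The types DocAnn and Decl and the function attachDocs are hypothetical,
invented here for illustration; they are not part of HDoc or any
existing tool. The idea is only that a tool can collect annotations from
anywhere and join them to declarations by identifier:

```haskell
-- Hypothetical model: a DOC annotation names the entity it documents.
data DocAnn = DocAnn { docTarget :: String, docText :: String }

-- A declaration found in the source, reduced to its name and signature.
data Decl = Decl { declName :: String, declSig :: String }

-- Pair every declaration with its documentation, if any was supplied.
-- The annotations may come from anywhere (even another file), because
-- the join is on the identifier and not on position in the source.
attachDocs :: [DocAnn] -> [Decl] -> [(String, Maybe String)]
attachDocs anns decls =
  [ (declName d, lookup (declName d) table) | d <- decls ]
  where
    table = [ (docTarget a, docText a) | a <- anns ]
```

Declarations without an annotation still appear in the result (with
Nothing), which matches the goal of getting signature-only documentation
for free.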

The markup format for the documentation is of course up for discussion.
XML seems plausible but verbose.

If I understand correctly, I think you were proposing a two stage
process to get the documentation (similar to the Eiffel approach?):

	Haskell source --> interface ---> on-line documentation
                                   `--> printed documentation
                                        .....

Why not do it in one?

	Haskell source ---> on-line documentation
			   `--> printed documentation
                          ....

Cheers,
	Simon


From groessli@fmi.uni-passau.de Wed Jan 31 13:32:56 2001 Date: Wed, 31 Jan 2001 14:32:56 +0100 (MET) From: Armin Groesslinger groessli@fmi.uni-passau.de Subject: A Haskell Documentation Standard
On Wed, 31 Jan 2001, Simon Marlow wrote:

> eg.  HDoc's {--- .. -} style comments.  I had in mind using Haskell's
> pragma convention, like this:
>
> {-# DOC f
>     <desc><id>f</id> turns people into frogs</desc>
>     <arg><id>x</id> A <type>Person</type></arg>
>     <ret>A <type>Frog</type></ret>
> #-}
> f :: Person -> Frog
> f x = ...
>
> with similar annotations for classes, instances, datatypes, newtypes
> etc.  There's no requirement that the documentation appears directly
> before the source code for the function; since it contains the
> identifier of the entity being documented, it can be placed anywhere
> (even in a different file).
>

That means that classes and instances (and their functions) have to be
distinguished by giving the complete type of the class / instance
declaration, right?

E.g.

  class X a b where
     f :: a -> b

  instance X Person Frog where
     f x = ...

How do we keep the tool from confusing the two versions of "f"?
An obvious way would be

     {-# DOC f INSTANCE X Person Frog ... #-}

     {-# DOC f CLASS    X             ... #-}


(HDoc required similar "help" in early versions, but I changed that in
 favour of considering positional information to reduce the redundancy
 required in the annotations.)
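
As a rough sketch of what the extra "help" amounts to, the header of a
DOC pragma could be read into a small target type. Everything here
(Target, parseTarget, and the exact CLASS/INSTANCE header syntax) is an
assumption made for illustration, not an agreed format:

```haskell
-- Hypothetical description of what a DOC annotation refers to.
data Target
  = Plain String                      -- DOC f
  | InClass String String             -- DOC f CLASS X
  | InInstance String String [String] -- DOC f INSTANCE X Person Frog
  deriving (Eq, Show)

-- Read the already-tokenised header of a DOC pragma.
-- An INSTANCE header must list at least one instance type.
parseTarget :: [String] -> Maybe Target
parseTarget [name]                = Just (Plain name)
parseTarget [name, "CLASS", cls]  = Just (InClass name cls)
parseTarget (name : "INSTANCE" : cls : tys@(_ : _)) =
  Just (InInstance name cls tys)
parseTarget _                     = Nothing
```

For example, parseTarget (words "f INSTANCE X Person Frog") yields an
InInstance target carrying the full instance head, which is the
redundancy that positional information would avoid.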


I guess additional annotations are an unavoidable drawback when not
relying on positional information. On the other hand, being able to put
the documentation in different files may be a big advantage.

So, should we allow both variants? I.e. use positional information when
the pragma happens to be next to a class/instance declaration (or a
function therein) and rely on extra information (like "CLASS X") in the
other case?


> The markup format for the documentation is of course up for discussion.
> XML seems plausible but verbose.
>

As flexibility should be a priority here (we want to produce many
different output formats, right?), I think XML is verbose, but not too
verbose. I don't see an alternative format which is significantly less
verbose. And, there's HaXml which could do a good job at processing the
documentation (I haven't had a very close look at HaXml, yet).


Regards,

Armin



From jans@numeric-quest.com Wed Jan 31 10:00:59 2001 Date: Wed, 31 Jan 2001 05:00:59 -0500 (EST) From: Jan Skibinski jans@numeric-quest.com Subject: A Haskell Documentation Standard
On Wed, 31 Jan 2001, Simon Marlow wrote:

> Sorry for not being clear about this.  You're right in that trying to
> understand arbitrary comments in Haskell source isn't workable.  The
> documentation annotations must be in a special format that the
> documentation tool can understand.
> 
> eg.  HDoc's {--- .. -} style comments.  I had in mind using Haskell's
> pragma convention, like this:
> 
> {-# DOC f
>     <desc><id>f</id> turns people into frogs</desc>
>     <arg><id>x</id> A <type>Person</type></arg>
>     <ret>A <type>Frog</type></ret>
> #-}
> f :: Person -> Frog
> f x = ...

	Machine-wise, this is an excellent format, because it
	not only signifies that this comment is important
	(as HDoc does by using triple dashes) but also associates
	the comment with the specific entity (as I was discussing
	it in the previous post). Easy to parse, no ambiguities.

	Human-wise, this is a terrible thing. I would never
	like to read sources written this way, nor to produce
	them by hand this way.

> The markup format for the documentation is of course up for discussion.
> XML seems plausible but verbose.

	Now take a look at this disciplined ascii version:

	f :: Person -> Frog
	f x
	    -- A frog made from a person 'x'
	    = .....
	   (Note that I always specify the result by placing
	    it at the front of the sentence. No need for <ret>,
	    no need for types to be told twice.) 

	or look at a more complex example that makes an even stronger point:

	f :: Person -> Frog -> Bool
	f person y 
	    -- True if a 'person' can be turned
	    -- to a frog
	    -- where
	    --    y is a frog
	    --
	    = ....

	I am sure there is nothing in your XML version that I
	did not explain in the plain English of my versions - unless you
	want to cross-reference all the frogs, persons and
	booleans (which would not make any sense to me).
	
	But the major difference between the two is
	that I can still read the source files with ease.
	
	This is similar to the Eiffel style of documentation:
	you extract the signature, the left hand side
	of function definition and the comments. You place
	them in your interface (which can be pretty printed
	in any format) in exactly this order - from most
	general info to the most detailed explanation. Comments
	can refer to arguments by their names for clarity.
	And a function comment clearly becomes a part of a
	function definition.

	This positional method has a few drawbacks, however. First,
	this specific order cannot be applied to functions
	with multiple equations (the order: "signature, comment,
	equations" looks better in such cases), although it is
	still fine with guards.
	Secondly, this could open a can of protests about the order
	requirements. Thirdly, it requires a lot of self-discipline.
	However, the final result definitely exceeds other styles
	-- readability-wise.

	But I am not trying to promote this style, I am just
	pointing out some more readable alternatives to your original
	version. For example, this would do equally well:

	f :: Person -> Frog
	f x
	{-# Doc f
	    -- Frog from person 'x'
	#-}
	    = .....

	and could be pretty printed, with pragmas removed. But
	-- as a developer -- I would still hate reading and writing
	those pragma tokens, especially the complex thingies
	-- as pointed out by Armin in another post .... unless I
	would never see them anytime during the development
	cycle.
	
	In theory it is possible to have the best of both worlds with
	the help of tools: you write a single function in plain English,
	annotate it (or not - depending on the tool), and it
	comes back again, but prettified. So if we want both
	worlds then we had better provide good tools and then
	convince the community that this is the way to work
	with Haskell.
	 
> If I understand correctly, I think you were proposing a two stage
> process to get the documentation (similar to the Eiffel approach?):

	No, the numbered list in my previous post did not represent
	any order of steps, but some options I was musing about.

	Jan




From simonmar@microsoft.com Wed Jan 31 14:08:58 2001 Date: Wed, 31 Jan 2001 06:08:58 -0800 From: Simon Marlow simonmar@microsoft.com Subject: A Haskell Documentation Standard
> That means that classes and instances (and their functions) have to be
> distinguished by giving the complete type of the class / instance
> declaration, right?
> 
> E.g.
> 
>   class X a b where
>      f :: a -> b
> 
>   instance X Person Frog where
>      f x = ...
> 
> How do we avoid that the tool confuses the two version of "f" ?
> An obvious way would be
> 
>      {-# DOC f INSTANCE X Person Frog ... #-}
> 
>      {-# DOC f CLASS    X             ... #-}

For classes, the 'class' keyword isn't required: classes and type
constructors share the same namespace, so an identifier beginning with an
upper case letter is unambiguous.

I hadn't considered the documentation of individual instances, but
perhaps you need that for internal documentation (most of the time, the
only documentation you need for an instance is that it exists).  If
documentation for instances is required, then yes, you have to give the
full instance header in order to identify the exact instance.

> (HDoc required similar "help" in early versions, but I changed that in
>  favour of considering positional information to reduce the redundancy
>  required in the annotations.)
> 
> 
> I guess additional annotations are an unavoidable drawback when not
> relying on positional information. On the other hand, being able to put
> the documentation in different files may be a big advantage.
> 
> So, should we allow both variants? I.e. use positional information when
> the pragma happens to be next to a class/instance declaration (or a
> function therein) and rely on extra information (like "CLASS X") in the
> other case?

I wouldn't object to allowing both variants - perhaps the convention
could be that if the identifier is missing, then it applies to the
following declaration.
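
A minimal sketch of that convention, assuming a source file has already
been reduced to a flat list of items. The Item type and resolve are
hypothetical names used only for this illustration, and dropping an
unnamed annotation that has no following declaration is one possible
choice among several:

```haskell
-- Hypothetical view of a source file as a sequence of items.
data Item
  = Doc (Maybe String) String  -- DOC pragma: optional identifier, text
  | Decl String                -- a declaration, reduced to its name
  deriving (Eq, Show)

-- Resolve every annotation to (identifier, text).
-- A named annotation applies to that identifier wherever it is placed;
-- an unnamed one applies to the next declaration that follows it.
resolve :: [Item] -> [(String, String)]
resolve [] = []
resolve (Doc (Just name) text : rest) = (name, text) : resolve rest
resolve (Doc Nothing text : rest) =
  case [ n | Decl n <- rest ] of
    (n : _) -> (n, text) : resolve rest
    []      -> resolve rest    -- no declaration follows: dropped
resolve (Decl _ : rest) = resolve rest
```

With this reading, an unnamed annotation placed just before f and a
named annotation for g placed anywhere both resolve without the tool
having to guess.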

PS. Henrik - before this discussion gets too detailed, do you want to
announce the existence of the mailing list on haskell@haskell.org?

Cheers,
	Simon


From jans@numeric-quest.com Wed Jan 31 23:29:51 2001 Date: Wed, 31 Jan 2001 18:29:51 -0500 (EST) From: Jan Skibinski jans@numeric-quest.com Subject: ready to announce?
	 
On Wed, 31 Jan 2001, Henrik Nilsson wrote:

> Simon Marlow wrote:
> 
> > PS. Henrik - before this discussion gets too detailed, do you want to
> > announce the existence of the mailing list on haskell@haskell.org?
> 
> I think it would be very good if we could agree on some general
> design goals and principles before the discussion gets too detailed,
> and incorporate these into an announcement. Obviously, they would
> not be cast in stone, but I think that having a little bit of structure
> from the outset would help in getting a focused and constructive
> design process going, and I also believe that it is somewhat easier
> to decide on that structure in a relatively small group. (Even
> among us, there seem to be some quite different opinions on
> what we are trying to achieve! ;-)

	I second it. We need a bit of time to have some feel
	for each other's position and to define the basic
	terms and goals.

	I also suggest that our discussions should be summarized
	periodically for our own benefit and for those who will
	join us later. Henrik, could you do it, say under the heading
	"Summary - date", every so often?

	Jan




From jans@numeric-quest.com Wed Jan 31 23:54:55 2001 Date: Wed, 31 Jan 2001 18:54:55 -0500 (EST) From: Jan Skibinski jans@numeric-quest.com Subject: raw machine standard
	I sympathize with Henrik's idea of developing
	a "raw", rich, machine interface standard. I appreciate it
	because I have already experienced the impact of incompatibilities
	on the development of the Haskell Module Browser.

	I use NHC and Hugs interfaces as helpers and guidelines
	even though I do extract other information directly
	from sources. From this perspective I consider it one of
	the priorities, especially because Henrik made me realize
	that the incompatibilities could multiply if an implementor
	decided one day to switch to a new format. And this looks
	quite probable - vide the announcement of the new version of Hugs.
	It appears that Johan Nordlander is taking over the Hugs
	maintenance.

	Examples of incompatibilities between NHC and Hugs
	interfaces are numerous. One good example is the different
	representation of function signatures:
	f :: a -> a -> Int		   - Hugs
	f :: (a -> (a -> Prelude.Int))     - NHC
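
A sketch of how a tool might bring the two representations to a common
form once both have been parsed into the same tree. The Type data type,
unqualify and render are hypothetical and cover only what these two
signatures need; nothing here describes how Hugs or NHC work
internally:

```haskell
-- Hypothetical miniature type syntax, enough for the signatures above.
data Type
  = TVar String     -- type variable, e.g. a
  | TCon String     -- type constructor, possibly module-qualified
  | TFun Type Type  -- function arrow
  deriving (Eq, Show)

-- Drop a module qualifier such as "Prelude." from a constructor name.
-- (Assumes alphanumeric names; symbolic names containing '.' are not handled.)
unqualify :: String -> String
unqualify name = case break (== '.') name of
  (_, '.' : rest) -> unqualify rest
  _               -> name

-- Print with only the parentheses that right-associativity requires.
render :: Type -> String
render (TVar v)   = v
render (TCon c)   = unqualify c
render (TFun a b) = left a ++ " -> " ++ render b
  where
    left t@(TFun _ _) = "(" ++ render t ++ ")"
    left t            = render t
```

Both signatures above parse to
TFun (TVar "a") (TFun (TVar "a") (TCon ...)), so rendering from the tree
gives one spelling regardless of which interface file it came from.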

	Jan




