[GHC] #3549: unlit does not follow H98 spec

GHC ghc-devs at haskell.org
Fri Jul 24 03:04:59 UTC 2015


#3549: unlit does not follow H98 spec
-------------------------------------+-------------------------------------
        Reporter:  duncan            |                   Owner:
            Type:  bug               |                  Status:  new
        Priority:  normal            |               Milestone:  ⊥
       Component:  Compiler          |                 Version:  6.10.4
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |               Test Case:
      Blocked By:                    |                Blocking:
 Related Tickets:                    |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by lenish):

 I recently ran into this bug when trying to learn Haskell.

 I always indent my LaTeX blocks to improve readability, but unlit does not
 handle that properly. The readline function copes with leading whitespace;
 things break at the check for `\end{code}` after `\begin{code}`, which
 compares from the very start of the line.

 The file I'm working with:

 {{{
 \documentclass[12pt]{article}
 \begin{document}
   \maketitle

   \section{Type Declaration}
   \begin{code}
     data Thingy = Thing
   \end{code}

 \end{document}
 }}}

 Unfortunately, from the original report, it sounds as though the
 specification requires these lines not to be indented? Being new to
 Haskell and GHC, I have no idea how to go about changing something like
 that, or whether such a change is even possible. My general impression is
 that the 'correct' way to do this would be to parse the file, locate the
 `\begin{code}` and `\end{code}` pairs that TeX itself would treat as such
 (not in comments, etc.), and then strip out everything but those blocks.
 That unlit does not currently work this way makes the feature much less
 enticing to me, as I can't keep my LaTeX files maintainable and readable
 without proper indentation.
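
 A rough approximation of that idea, well short of real TeX parsing, is a
 line-based scan that ignores leading blanks when looking for the markers.
 The sketch below is only an illustration: `ltrim` and `extract_code` are
 hypothetical names and are not part of unlit.c. Because only the first
 non-blank text on a line is compared, a marker behind a `%` comment
 character is not matched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: skip leading spaces and tabs on a line. */
const char *ltrim(const char *s)
{
    while (*s == ' ' || *s == '\t')
        ++s;
    return s;
}

/* Hypothetical sketch: copy the lines found between \begin{code} and
 * \end{code} from src into dst (capacity cap, including the NUL).
 * Markers may be indented. Everything outside the blocks is dropped,
 * and a line that would overflow dst is silently skipped.
 * Returns the number of code lines copied. */
size_t extract_code(const char *src, char *dst, size_t cap)
{
    static const char BEGIN[] = "\\begin{code}";
    static const char END[]   = "\\end{code}";
    size_t used = 0, copied = 0;
    int in_code = 0;

    assert(src != NULL && dst != NULL && cap > 0);
    while (*src != '\0') {
        const char *eol  = strchr(src, '\n');
        size_t      len  = eol ? (size_t)(eol - src) + 1 : strlen(src);
        const char *text = ltrim(src);   /* first non-blank character */

        if (!in_code) {
            if (strncmp(text, BEGIN, sizeof BEGIN - 1) == 0)
                in_code = 1;
        } else if (strncmp(text, END, sizeof END - 1) == 0) {
            in_code = 0;
        } else if (used + len < cap) {
            memcpy(dst + used, src, len); /* keep the line verbatim */
            used += len;
            copied++;
        }
        src += len;
    }
    dst[used] = '\0';
    return copied;
}
```

 Run over the example file above, this would keep only the indented
 `data Thingy = Thing` line.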

 If there is a desire for this to be reimplemented in Haskell with proper
 (or a reasonable approximation of proper) TeX parsing, then I wouldn't mind
 trying to figure out how to do that as part of my project of learning to
 program in Haskell. Again, I'm not sure what the situation is with the
 specification, or whether such a patch would even be accepted.

 I have a relatively simple, hacky patch that stops leading whitespace
 from preventing `\end{code}` from matching. I also modified the PSEUDOCODE
 section, as it seems to have the same issue. There may be problems with
 this patch if someone has a 1000-character line, but that seems...
 excessive.

 {{{#!diff
 diff --git a/utils/unlit/unlit.c b/utils/unlit/unlit.c
 index a367a0a..dff43bc 100644
 --- a/utils/unlit/unlit.c
 +++ b/utils/unlit/unlit.c
 @@ -273,7 +273,15 @@ void unlit(char *file, FILE *istream, FILE *ostream)
                      exit(1);
                  }
                  linesread++;
 -                if (strncmp(lineb,ENDCODE,LENENDCODE) == 0) {
 +
 +                size_t offset = 0;
 +                for(offset = 0; offset < sizeof lineb; ++offset) {
 +                        if (!isWhitespace(lineb[offset])) {
 +                                break;
 +                        }
 +                }
 +
 +                if (strncmp(&lineb[offset],ENDCODE,LENENDCODE) == 0) {
                      myputc('\n', ostream);
                      break;
                  }
 @@ -289,9 +297,17 @@ void unlit(char *file, FILE *istream, FILE *ostream)
                      complain(file, linesread, MISSINGENDPSEUDOCODE);
                      exit(1);
                  }
 +
 +                size_t offset = 0;
 +                for(offset = 0; offset < sizeof lineb; ++offset) {
 +                        if (!isWhitespace(lineb[offset])) {
 +                                break;
 +                        }
 +                }
 +
                  linesread++;
                  myputc('\n', ostream);
 -                if (strncmp(lineb,ENDPSEUDOCODE,LENENDPSEUDOCODE) == 0) {
 +                if (strncmp(&lineb[offset],ENDPSEUDOCODE,LENENDPSEUDOCODE) == 0) {
                      break;
                  }
              }

 }}}
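
 Both hunks repeat the same whitespace-skipping loop, so it could be
 factored into a small helper. The following is a minimal standalone sketch
 of that idea, not the patch itself: `is_blank`, `skip_blanks` and
 `starts_with_marker` are hypothetical names, and the set of characters
 treated as blank is an assumption (unlit.c's own `isWhitespace` may cover
 a different set). Since the scan stops at the NUL terminator, it does not
 depend on the size of the line buffer, provided the line is
 NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed blank set: space and tab only. */
int is_blank(char c)
{
    return c == ' ' || c == '\t';
}

/* Index of the first non-blank character in a NUL-terminated line.
 * The loop ends at the terminator at the latest, because '\0' is
 * not a blank. */
size_t skip_blanks(const char *line)
{
    size_t i = 0;

    assert(line != NULL);
    while (is_blank(line[i]))
        ++i;
    return i;
}

/* Non-zero when line, ignoring leading blanks, starts with marker. */
int starts_with_marker(const char *line, const char *marker)
{
    return strncmp(line + skip_blanks(line), marker, strlen(marker)) == 0;
}
```

 With such a helper, each comparison in the patch would reduce to a call
 like `starts_with_marker(lineb, ENDCODE)`.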

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/3549#comment:12>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

