[Haskell-cafe] Hugs vs GHC (again) was: Re: Some random newbiequestions

Dimitry Golubovsky dimitry at golubovsky.org
Mon Jan 10 20:48:41 EST 2005



-------- Original Message --------
Subject: Re: [Haskell-cafe] Hugs vs GHC (again) was: Re: Some random	newbiequestions
Date: Mon, 10 Jan 2005 20:47:26 -0500
From: Dimitry Golubovsky <dimitry at golubovsky.org>
To: Marcin 'Qrczak' Kowalczyk <qrczak at knm.org.pl>
References: <3429668D0E777A499EE74A7952C382D102F30BF6 at EUR-MSG-01.europe.corp.microsoft.com>	<877jmmv2zz.fsf at qrnik.zagroda>	<59BACBDC-6273-11D9-8389-000A95E8B0DA at etu.upmc.fr> 
<87acrgssem.fsf at qrnik.zagroda>

Hi,

Let me add a column for Hugs (summarized from a recent CVS checkout,
where the definitions are spread across several C and Haskell files):

           |Sebastien's| Marcin's           | Hugs
    -------+-----------+--------------------+------
     alnum | L* N*     | L* N*              | L*, M*, N* <1>
     alpha | L*        | L*                 | L* <1>
     cntrl | Cc        | Cc Zl Zp           | c < ' ' || c >= '\DEL' && c <= '\x9f'
     digit | N*        | Nd                 | c >= '0'   &&  c <= '9'
     lower | Ll        | Ll                 | Ll <1>
     punct | P*        | P*                 | P*
     upper | Lu        | Lt Lu              | Lu Lt <1>
     blank | Z* \t\n\r | Z* (except U+00A0, | ' ' \t\n\r\f\v U+00A0
           |           | U+2007, U+202F)    |
           |           | \t\n\v\f\r U+0085  |

<1>: for characters outside Latin1 range. For Latin1 characters (0 to
255), there is a lookup table defined as
"unsigned char   charTable[NUM_LAT1_CHARS];"
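
For illustration only, the three Hugs entries that the table gives as
explicit character tests (cntrl, digit, blank) can be transcribed into
Haskell roughly as below. The names hugsIsCntrl, hugsIsDigit and
hugsIsBlank are made up for this sketch and are not the names used in
the Hugs sources; the bodies just restate the table, assuming the usual
precedence where && binds tighter than ||.

```haskell
-- Sketch only: hypothetical names, bodies copied from the Hugs column.

-- cntrl: c < ' ' || c >= '\DEL' && c <= '\x9f'
-- (&& binds tighter than ||, so the parentheses are only for clarity)
hugsIsCntrl :: Char -> Bool
hugsIsCntrl c = c < ' ' || (c >= '\DEL' && c <= '\x9f')

-- digit: ASCII decimal digits only
hugsIsDigit :: Char -> Bool
hugsIsDigit c = c >= '0' && c <= '9'

-- blank: ' ' \t \n \r \f \v and U+00A0 (no-break space)
hugsIsBlank :: Char -> Bool
hugsIsBlank c = c `elem` " \t\n\r\f\v\xA0"
```

Loaded into GHCi or Hugs, hugsIsBlank '\xA0' holds while
hugsIsDigit '\x0663' (ARABIC-INDIC DIGIT THREE, category Nd) does not,
which is exactly where the Hugs column differs from the other two.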

I also like Ketil's idea of defining predicates like isUpper or isSpace
in multiple modules, quoting this:

  >> It's not obvious what the predicates should really mean, e.g. should
  >> isDigit and isHexDigit include non-ASCII digits or should isSpace
  >> include non-breaking space characters.

  > I think perhaps the answer is all of the above.  The functions could
  > be defined in multiple modules, so that 'ASCII.isSpace' would match
  > the "normal" space character only, while 'Unicode.isSpace' could match
  > all the weird and wonderful stuff in the standard.

So there might be a bunch of (perhaps autogenerated, from localedef
files) modules for each locale/encoding, like ISO8859_1 or KOI_8. These
modules might be imported into applications as needed. Also there would
be one module autogenerated from the Unicode data files.
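
A rough sketch of how the two flavours could look side by side. To keep
it in one file the would-be module names are folded into prefixes
(asciiIsSpace standing in for ASCII.isSpace, unicodeIsSpace for
Unicode.isSpace); both names and both definitions are assumptions made
for this example, not an existing library. The Unicode version follows
Sebastien's column (Z* plus a few control characters) and relies on
generalCategory from Data.Char.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- Stand-in for ASCII.isSpace: the "normal" space character only.
asciiIsSpace :: Char -> Bool
asciiIsSpace c = c == ' '

-- Stand-in for Unicode.isSpace: any Z* category (Zs, Zl, Zp)
-- plus tab, newline and carriage return.
unicodeIsSpace :: Char -> Bool
unicodeIsSpace c =
  generalCategory c `elem` [Space, LineSeparator, ParagraphSeparator]
    || c `elem` "\t\n\r"
```

With real modules an application would pick the behaviour it wants by
import, e.g. "import qualified ASCII" versus "import qualified Unicode"
(again hypothetical module names), and call ASCII.isSpace or
Unicode.isSpace accordingly.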

Dimitry Golubovsky
Middletown, CT





