[Haskell-i18n] Some starters for the new list

Alastair Reid alastair@reid-consulting-uk.ltd.uk
15 Aug 2002 13:11:46 +0100


> I just want to repeat something somebody suggested, and which I
> thought was a really neat idea: Have string constants in programs be
> replaced by (Prelude.fromString "..") or similar, like numerical
> constants are handled already.

> This was suggested in order to simplify the use of PackedString, but
> I think it might come in handy for translation issues, too.

I find it a little hard to picture this so let's fill in some details
so that we can agree that we're talking about the same thing and also
to make the idea more concrete.

Using type classes in this way would require us to make the encoding
explicit in the type system.  So we'd define a bunch of types
corresponding to characters and to strings:

  data Char   = ... -- Unicode
  data Latin1 = ... -- Latin1
  ...

and we'd define two classes and the basic operations on them.

  class Enum a => Charset a where fromChar   :: Char   -> a
  class Ord a  => String  a where fromString :: String -> a

Why did I define two classes instead of just one?  The more obvious
design would be to have

  class Enum a => Charset a where
    fromChar   :: Char   -> a
    fromString :: String -> [a]

but this wouldn't let us make PackedString an instance of it, because
fromString would have to return a list ([a]) and a PackedString is not
a list.  This could be fixed using multiparameter type classes but
splitting the class is easier.  (We might revisit this decision if we
want operations to convert Charsets to Strings and the like.)
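
To make the two-class design concrete, here is a minimal sketch that
compiles on its own.  Several names are assumptions of mine: the String
class is renamed StringLike to avoid clashing with the Prelude's String
type, Latin1 is modelled as a wrapped byte, and Packed is only a
hypothetical stand-in for PackedString.  The error call is a placeholder
for a failure policy that is still undecided.

```haskell
import Data.Char (ord)
import Data.Word (Word8)

-- Latin1 characters fit in one byte (an assumed representation).
newtype Latin1 = Latin1 Word8
  deriving (Eq, Ord, Show)

instance Enum Latin1 where
  fromEnum (Latin1 w) = fromIntegral w
  toEnum n = Latin1 (fromIntegral n)

class Enum a => Charset a where
  fromChar :: Char -> a

-- Called String in the text; renamed here to avoid a clash with the
-- Prelude's String type.
class Ord a => StringLike a where
  fromString :: String -> a

instance Charset Char where
  fromChar = id

instance Charset Latin1 where
  fromChar c
    | ord c <= 0xFF = Latin1 (fromIntegral (ord c))
    | otherwise     = error "fromChar: not representable in Latin1" -- placeholder

-- Any list of charset elements is string-like.
instance (Charset a, Ord a) => StringLike [a] where
  fromString = map fromChar

-- Hypothetical stand-in for PackedString.  It is not a list, so it could
-- never be the result of a method typed String -> [a].
newtype Packed = Packed [Word8]
  deriving (Eq, Ord, Show)

instance StringLike Packed where
  fromString s = Packed [w | Latin1 w <- map fromChar s]
```

With these definitions, (fromString "abc" :: String), (fromString "abc"
:: [Latin1]) and (fromString "abc" :: Packed) all typecheck, which is
the point of splitting the class.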


Details:

- We might want to add operations to convert back to Unicode - though
  that might require additional parameters to fill in details not
  encoded in the type?

- What should we do if the conversion fails?  For example, if I try to
  convert the Unicode yin-yang character (\u262f) to Latin1?

- We probably want additional operations for strings like map, append, 
  etc.

- fromString should be applied to strings used in patterns.

- This requires a minor change in the Haskell Report, which states that
  a string literal is just an abbreviation for a list of characters.
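
On the failure question, one option (only a sketch, not a decision) is
to report failure in the result type.  The names toLatin1,
toLatin1String and toLatin1Lossy are hypothetical, and Latin1 is again
modelled as a wrapped byte.  Note that a method typed String -> a has
nowhere to report failure, so the strict variant below could not be the
fromString of the class as written.

```haskell
import Data.Char (ord)
import Data.Maybe (fromMaybe)
import Data.Word (Word8)

newtype Latin1 = Latin1 Word8
  deriving (Eq, Ord, Show)

-- Convert one Unicode character, making failure explicit.
toLatin1 :: Char -> Maybe Latin1
toLatin1 c
  | n <= 0xFF = Just (Latin1 (fromIntegral n))
  | otherwise = Nothing
  where n = ord c

-- Strict policy: the whole string fails if any character does.
toLatin1String :: String -> Maybe [Latin1]
toLatin1String = mapM toLatin1

-- Lossy policy: substitute '?' (byte 63) for unrepresentable characters.
toLatin1Lossy :: String -> [Latin1]
toLatin1Lossy = map (fromMaybe (Latin1 63) . toLatin1)
```

Under these definitions the yin-yang character '\x262f' gives Nothing
from toLatin1 and a '?' from toLatin1Lossy.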


Overall, this looks like it might be a viable approach.  The only
potential showstopper seems to be what to do when conversion fails.

> (Naturally, the idea is that Prelude.fromString can be replaced by a
> function that looks the string up in a translation table, instead of
> using the default value.  Any reason this won't work?)

This goes quite a bit further than what I suggest above but let's try
to sketch it out.

1) You have to define a new string type:

   newtype FrenchString = FS String

2) You have to define an instance:

   instance String FrenchString where
     fromString "General Protection Fault" = FS "..."
     fromString "File not found"           = FS "..."
     ...
     fromString _                          = ????

Well, it seems simple enough.  Once again though, we have the problem
of what to do when the conversion fails.  What happens in the real
world?  Do they print the string in English and hope for the best?
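
If the answer is "print the English and hope for the best", the
instance could be sketched as a table lookup with a fallback.  This is
an illustration only: the class is renamed StringLike to avoid the
Prelude's String type, frenchTable is a hypothetical name, and its
entries are placeholder translations rather than a real message
catalogue.

```haskell
import Data.Maybe (fromMaybe)

class Ord a => StringLike a where
  fromString :: String -> a

newtype FrenchString = FS String
  deriving (Eq, Ord, Show)

-- Illustrative translation table (placeholder entries).
frenchTable :: [(String, String)]
frenchTable =
  [ ("File not found", "Fichier introuvable")
  , ("Disk full",      "Disque plein")
  ]

instance StringLike FrenchString where
  -- Look the English text up; fall back to the English text itself
  -- when no translation is known.
  fromString s = FS (fromMaybe s (lookup s frenchTable))
```

Here (fromString "File not found" :: FrenchString) yields the table
entry, while an unknown message comes back unchanged inside FS.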

I don't feel entirely comfortable with doing things this way.  I think
I'd prefer to see an explicit call to a translation function like
'toFrench'.  I presume that the advantage of this approach would be
that you could use existing libraries without change?  Unfortunately,
the way I've sketched it out, the code has to be modified to use the
type 'FrenchString' instead of 'String' so we don't achieve this goal.

Overall, this doesn't look like it will work.

--
Alastair Reid                 alastair@reid-consulting-uk.ltd.uk  
Reid Consulting (UK) Limited  http://www.reid-consulting-uk.ltd.uk/alastair/