[Haskell-cafe] The Proliferation of List-Like Types

Thu Feb 21 07:37:48 EST 2008

On Thu, Feb 21, 2008 at 11:37 AM, Duncan Coutts
<duncan.coutts at worc.ox.ac.uk> wrote:
>
>
>  On Thu, 2008-02-21 at 10:06 +0100, Johan Tibell wrote:
>  > Hi John!
>  >
>  > On Wed, Feb 20, 2008 at 3:39 PM, John Goerzen <jgoerzen at complete.org> wrote:
>  > >  3) Would it make sense to base as much code as possible in the Haskell
>  > >    core around ListLike definitions?  Here I think of functions such
>  > >    as lines and words, which make sense both on [Char] as well as
>  > >    ByteStrings.
>  >
>  > I don't think the examples you gave (i.e. lines and words) make much
>  > sense on ByteStrings. You would have to assume that the sequence of
>  > bytes is in some particular Unicode encoding, and thus words and lines
>  > will break if they get passed a ByteString using a different encoding.
>  > I don't think either of those two functions makes sense on anything but
>  > sequences of characters, like String.
>
>  That's exactly what the Data.ByteString[.Lazy].Char8 modules provide: a
>  Char8 view of a ByteString. Those modules provide functions like words,
>  lines etc. that assume an ASCII-compatible 8-bit encoding.

I would be very happy if people didn't use the .Char8 versions of
ByteString for anything except writing byte literals with pack. (I
would be even happier if Haskell had byte literals.) If people start
using ByteString instead of String in their library interfaces, I'll
be really miserable: I won't be able to use those libraries in
applications that need to be internationalized, because they would
be limited to ASCII.
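
To illustrate the byte-literal use of pack and where it stops being
safe, here is a minimal sketch that assumes only the bytestring
package. The names requestPrefix and truncated are made up for this
example.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C

-- A "byte literal" for a wire protocol. Every character is ASCII, so
-- Char8.pack yields exactly the bytes one would expect.
requestPrefix :: B.ByteString
requestPrefix = C.pack "GET "

-- Char8.pack keeps only the low 8 bits of each Char. U+03BB (Greek
-- small letter lambda) is silently truncated to the single byte 0xBB.
truncated :: B.ByteString
truncated = C.pack "\955"
```

In GHCi, B.unpack requestPrefix gives [71,69,84,32] and B.unpack
truncated gives [187], so any text outside the 8-bit range is lost
without warning.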

Data.ByteString and Data.ByteString.Char8 use the same ByteString
type, so I can take some UTF-32 bytes that I read from the network
and pass them to Data.ByteString.Char8 functions, which will then
silently give wrong results. I prefer a type that represents
characters to be guarded by encode and decode functions. Without
that guard it's easy to mix data in different encodings by mistake,
e.g. when writing web applications that handle data in several
different encodings.
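
A minimal sketch of both halves of that argument, assuming only the
bytestring package. The Encoded newtype, the Latin1 and UTF32BE tags,
and all function names are hypothetical and not part of any existing
library; Latin-1 is used only because its decoder is one line.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Data.Char (ord)

-- U+0120 (LATIN CAPITAL LETTER G WITH DOT ABOVE) in UTF-32BE.
-- Its last byte is 0x20, which Char8 treats as an ASCII space.
utf32G :: B.ByteString
utf32G = B.pack [0x00, 0x00, 0x01, 0x20]

-- Nothing in the types stops this call. The single character is cut
-- down to three bytes that are no longer valid UTF-32.
brokenWords :: [B.ByteString]
brokenWords = C.words utf32G

-- Hypothetical guard: raw bytes tagged with their encoding.
data Latin1
data UTF32BE

newtype Encoded enc = Encoded B.ByteString

-- Bytes from the network get their tag at the boundary.
fromNetwork :: B.ByteString -> Encoded UTF32BE
fromNetwork = Encoded

-- Latin-1 maps each byte to the code point of the same value, which
-- is what Char8.unpack does.
decodeLatin1 :: Encoded Latin1 -> String
decodeLatin1 (Encoded bs) = C.unpack bs

-- Encoding fails explicitly instead of truncating.
encodeLatin1 :: String -> Maybe (Encoded Latin1)
encodeLatin1 s
  | all ((<= 0xFF) . ord) s = Just (Encoded (C.pack s))
  | otherwise               = Nothing

-- Text functions are reachable only through a decoder, so
-- latin1Words (fromNetwork utf32G) is rejected by the type checker.
latin1Words :: Encoded Latin1 -> [String]
latin1Words = words . decodeLatin1
```

Here map B.unpack brokenWords evaluates to [[0,0,1]], while
encodeLatin1 "\955" is Nothing rather than a mangled byte.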

>  One day we'll have a separate type that does Unicode with a similar fast
>  packed representation.

That will be a good day. :)

-- Johan