Haskell Platform Proposal: add the 'text' library
Brandon S Allbery KF8NH
allbery at ece.cmu.edu
Wed Oct 20 01:10:36 EDT 2010
On 10/19/10 22:36, wren ng thornton wrote:
> <musing>
> I almost wonder if it would be worth it to define a new type, Character,
> which does correspond 1:1 to the human notion of a "character" (being
> intentionally vague about what exactly that means). Then we could have that
> Text is a vector/list/sequence of Characters, and give it the appropriate
> interface for being thought of that way.
I believe Perl 6 is going this way: while there is a single base type Str
and role String, there are three different things it can "mean" (call them
subtypes): bytes, Unicode code points, and graphemes (the last corresponding
to the proposed Character). Or possibly only two of those; IIRC it was
recently proposed that the byte version be moved to the already existing Buf
type/Buffer role intended for binary data, roughly equivalent to ByteString.
Once a given string has been accessed as code points, it can't then be
treated as graphemes unless it is assigned to again, and vice versa;
assigning it to another Str, however, allows that Str to be accessed as
graphemes instead.
(I think. The Perl 6 spec is still a moving target, as evidenced by the
thing about byte access; it's entirely possible that this changed again and
I missed it. But there was definitely thought put into the distinction
between bytes, code points, and graphemes.)
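To make the three views concrete, here is a rough sketch in Python that counts the same string as bytes, as code points, and as (approximate) graphemes. It is only an illustration under stated assumptions: the helper name graphemes is made up for this example, it is not how Perl 6 implements anything, and it approximates a grapheme as a base code point plus any combining marks that follow it. Real grapheme segmentation (Hangul jamo, emoji sequences, and so on) needs the full Unicode text segmentation rules.

```python
import unicodedata

def graphemes(s):
    """Hypothetical helper: split s into approximate graphemes.

    Assumption: a grapheme is a base code point followed by zero or
    more combining marks. This is a simplification, not the full
    Unicode segmentation algorithm.
    """
    clusters = []
    for ch in s:
        # A nonzero combining class means ch attaches to the previous base.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# 'e' followed by COMBINING ACUTE ACCENT (U+0301), then 'a'
s = "e\u0301a"

n_bytes = len(s.encode("utf-8"))   # byte view of the UTF-8 encoding
n_code_points = len(s)             # code point view
n_graphemes = len(graphemes(s))    # approximate grapheme view

print(n_bytes, n_code_points, n_graphemes)  # prints: 4 3 2
```

The three counts differ because the accented "e" is one grapheme, two code points, and three bytes in UTF-8, which is the kind of mismatch that makes a 1:1 Character type attractive.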
--
brandon s. allbery [linux,solaris,freebsd,perl] allbery at kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery at ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH