[Haskell-cafe] Re: String vs ByteString

Ivan Lazar Miljenovic ivan.miljenovic at gmail.com
Tue Aug 17 06:19:17 EDT 2010


Ketil Malde <ketil at malde.org> writes:

> Johan Tibell <johan.tibell at gmail.com> writes:
>
>> It's not clear to me that using UTF-16 internally does make Data.Text
>> noticeably slower. 
>
> I haven't benchmarked it, but I'm fairly sure that, if you try to fit a
> 3Gbyte file (the Human genome, say¹), into a computer with 4Gbytes of
> RAM, UTF-16 will be slower than UTF-8.  Many applications will get away
> with streaming over data, retaining only a small part, but some won't.

Seeing as the genome uses only four base "letters", wouldn't it be
better not to treat it as text at all, but to use some other
representation?  Or do you mean that it should be stored as text so
that it can also be read in a text editor, etc. (in case someone is
trying to do their mad genetic manipulation by hand)?
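
As a rough sketch of what a non-text representation could look like,
here is one way to pack four bases into each byte, plus the
back-of-envelope sizes for the three encodings.  All the names below
(encodeBase, packBases, sizePacked, ...) are made up for illustration
and don't come from any library, and it assumes the input contains only
A, C, G and T, which real genome files generally don't (think 'N' and
FASTA header lines).  It uses plain lists of Word8 to stay within base;
a real program would presumably want a packed array type instead.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word8)

-- Hypothetical 2-bit codes for the four bases.
-- Anything else (e.g. 'N') is rejected rather than guessed at.
encodeBase :: Char -> Maybe Word8
encodeBase 'A' = Just 0
encodeBase 'C' = Just 1
encodeBase 'G' = Just 2
encodeBase 'T' = Just 3
encodeBase _   = Nothing

-- Inverse of encodeBase; only the low two bits are inspected.
decodeBase :: Word8 -> Char
decodeBase w = case w .&. 3 of
  0 -> 'A'
  1 -> 'C'
  2 -> 'G'
  _ -> 'T'

-- Pack up to four codes into one byte, first base in the high bits.
-- A short final chunk is padded with zero codes.
packChunk :: [Word8] -> Word8
packChunk codes =
  foldl (\acc c -> (acc `shiftL` 2) .|. c) 0 (take 4 (codes ++ repeat 0))

-- Pack a whole sequence.  The base count is kept alongside the bytes
-- so that the padding can be dropped again when unpacking.
packBases :: String -> Maybe (Int, [Word8])
packBases s = do
  codes <- mapM encodeBase s
  pure (length codes, go codes)
  where
    go [] = []
    go cs = let (chunk, rest) = splitAt 4 cs
            in packChunk chunk : go rest

-- Recover the original string from the count and the packed bytes.
unpackBases :: (Int, [Word8]) -> String
unpackBases (n, bytes) = take n (concatMap unpackByte bytes)
  where
    unpackByte b = [decodeBase (b `shiftR` k) | k <- [6, 4, 2, 0]]

-- Approximate payload sizes in bytes for n bases stored as ASCII
-- letters: one byte each in UTF-8, two in UTF-16, a quarter packed.
sizeUtf8, sizeUtf16, sizePacked :: Integer -> Integer
sizeUtf8   n = n
sizeUtf16  n = 2 * n
sizePacked n = (n + 3) `div` 4
```

For roughly 3e9 bases that comes to about 3e9 bytes as UTF-8, 6e9 as
UTF-16 and 7.5e8 packed, so only the packed form fits comfortably in
4 GB of RAM, at the cost of no longer being readable in a text editor.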

-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic at gmail.com
IvanMiljenovic.wordpress.com

