[Haskell-cafe] Data.Text UTF-8 question
jeff p
mutjida at gmail.com
Fri Aug 31 07:59:26 CEST 2012
Hello,
I have a sample file (attached) which I cannot read into Text:
Prelude Control.Applicative> Data.Text.IO.readFile "foo"
*** Exception: utf8.txt: hGetContents: invalid argument (invalid byte sequence)
Prelude Control.Applicative> Data.Text.Encoding.decodeUtf8 <$> Data.ByteString.Char8.readFile "foo"
"*** Exception: Cannot decode byte '\x6e': Data.Text.Encoding.decodeUtf8: Invalid UTF-8 stream
So it seems that foo doesn't contain valid UTF-8. However,
System.IO.UTF8 has no problem reading the data:
Prelude Control.Applicative> System.IO.UTF8.readFile "foo"
"3591,,,dihigma99h,1905,5,25,CUBA,,Matanzas,1971,5,20,CUBA,,Cienfuegos,Martin,Dihigo,,Mart\65533n Magdaleno Dihigo (Llanos),,190,74,R,R,,,,dihigma99,dihigma99,dihim001,dihigma99,dihigma99\r\n"
Shouldn't these all have the same behavior?
I am running on Mac OS X 10.8.1, with GHC 7.4.2 and text-0.11.2.3.
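
For comparison, here is a minimal sketch of the two decoding modes that appear to be in play. The \65533 in the System.IO.UTF8 output is U+FFFD, the Unicode replacement character, so that library seems to substitute for bad bytes rather than throw. The names decodeChecked, decodeLenient and sample are made up for illustration, and the 0xED byte in sample is only a guess (a Latin-1 i-acute), not something confirmed from the attached file. It also assumes a version of text that exports decodeUtf8'.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Strict decoding: report invalid input as a Left value
-- instead of throwing an exception from pure code.
decodeChecked :: B.ByteString -> Either String T.Text
decodeChecked = either (Left . show) Right . decodeUtf8'

-- Lenient decoding: invalid input is replaced with U+FFFD
-- rather than aborting the whole decode.
decodeLenient :: B.ByteString -> T.Text
decodeLenient = decodeUtf8With lenientDecode

-- Hypothetical sample: "Mart", a lone 0xED byte, then "n".
-- 0xED opens a multi-byte UTF-8 sequence, so the following
-- 'n' (0x6e) is not a valid continuation byte.
sample :: B.ByteString
sample = B.pack [0x4d, 0x61, 0x72, 0x74, 0xed, 0x6e]
```

With these, decodeChecked sample should come back as a Left, while decodeLenient sample should yield a Text containing the replacement character, which would account for the difference between the calls above.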
Thanks for any insight,
Jeff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: foo
Type: application/octet-stream
Size: 182 bytes
Desc: not available
URL: <http://www.haskell.org/pipermail/haskell-cafe/attachments/20120830/e3c799e6/attachment.obj>