FPS again

Duncan Coutts duncan.coutts at worc.ox.ac.uk
Sat Jul 15 14:23:07 EDT 2006


On Sat, 2006-07-15 at 22:14 +0400, Bulat Ziganshin wrote:
> Hello Duncan,
> 
> Saturday, July 15, 2006, 8:04:26 PM, you wrote:
> 
> > getContents, putStr, readFile, interact etc are
> > encoding-independent, they're just the same as hGet/hPut, working on
> > binary data blocks. Indeed putStr = hPut stdout.
> 
> One shortcoming I've seen in the library is that it makes no
> distinction between Text and Binary modes when opening a file. On Unix
> the two are the same, but on Windows they are not. Below is my
> conversation on this topic with Donald. In the end he applied the
> change I proposed (using openFile instead of openBinaryFile in these
> operations), and today I sent him a patch that makes the same change
> in the Lazy module.
> 
> 
> >> the System.IO contains the following definitions:
> >> 
> >> readFile name   =  openFile name ReadMode >>= hGetContents
> >> 
> >> writeFile name str = do
> >>     hdl <- openFile name WriteMode
> >>     ...
> >> appendFile name str = do
> >>     hdl <- openFile name AppendMode
> >>     ...
> >> 
> >>     
> >> As you can see, the file is opened in text mode, while your
> >> definitions open files in binary mode:
> >> 
> >> readFile f = bracket (openBinaryFile f ReadMode) hClose
> >>     (\h -> hFileSize h >>= hGet h . fromIntegral)
> >> 
> >> writeFile f ps = bracket (openBinaryFile f WriteMode) hClose
> >>     (\h -> hPut h ps)
> >> 
> >> appendFile f txt = bracket (openBinaryFile f AppendMode) hClose
> >>     (\hdl -> hPut hdl txt)
> >> 
> 
> 
> > I don't understand your point here. Do you mean I should be opening in
> > Text mode, since it's not portable in Binary mode? Can you clarify?
> 
> Just in case you don't know: for historical reasons, different
> operating systems use different line-ending sequences. Unix uses
> chr(10), classic Mac OS uses chr(13), and DOS/Windows uses
> chr(13)+chr(10).
> 
> To allow writing universal text-processing programs that work on any
> OS, the standard C libraries let you open files in "text mode", in
> which OS-specific line endings are translated by the library to the
> standard Unix ones when reading, and back again when writing.
> 
> The System.IO routines I mentioned also open files in text mode, which
> means that on Windows they correctly translate the 13+10 line endings
> (standard for this OS) to chr(10). So any text-processing function
> written with translated (i.e. Unix) line endings in mind will work
> correctly, even on Windows, with the contents of files read or written
> by those System.IO routines.
> 
> For example, a 2-line text file on Windows may contain something like
> "line1\r\nline2". When read via openBinaryFile and split by 'lines',
> the result will be ["line1\r", "line2"], which is incorrect. When read
> via openFile (which opens files in text mode), the Windows-specific
> line ending is translated to the Unix one, so the string read will be
> "line1\nline2" and 'lines' will return the correct result
> ["line1", "line2"].
> 
> So while on Unix it makes absolutely no difference which mode you use
> to open files, it does make a difference on Windows. If the original
> routines use openFile, then they are intended to work with _text_
> files, and their clones should allow for text translation too.
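
The 'lines' example quoted above can be checked without touching the
file system. What follows is a minimal pure sketch: crlfToLf is a
hypothetical helper (not part of any library) that stands in for the
translation a text-mode read would perform on Windows, and rawContents
stands in for the bytes a binary-mode read would return.

```haskell
-- Pure illustration of the line-ending issue; no file I/O involved.

-- Hypothetical helper: stands in for text-mode reading on Windows by
-- collapsing each "\r\n" pair into a single "\n".
crlfToLf :: String -> String
crlfToLf ('\r':'\n':rest) = '\n' : crlfToLf rest
crlfToLf (c:rest)         = c : crlfToLf rest
crlfToLf []               = []

-- Contents of a two-line Windows file as a binary-mode read would see it.
rawContents :: String
rawContents = "line1\r\nline2"

-- 'lines' applied without and with the translation step.
binaryView, textView :: [String]
binaryView = lines rawContents             -- ["line1\r","line2"]
textView   = lines (crlfToLf rawContents)  -- ["line1","line2"]
```

Loading this in GHCi and evaluating binaryView and textView shows the
stray '\r' that survives when no translation is done.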

So presumably the correct solution is to have the readFile, writeFile
etc in the Data.ByteString module use openBinaryFile and the versions in
Data.ByteString.Char8 use openFile. That way the versions that are
interpreting strings as text will get the OS's line ending conversions.
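
A rough sketch of that split, not the actual library code: the names
readFileWith, readFileBinary and readFileText are made up for
illustration, and both variants would really be called readFile in
their respective modules. It also assumes that opening with openFile
is enough for the runtime to apply the line ending conversion to block
reads such as hGetContents on a ByteString; whether that holds depends
on the compiler and platform, so treat it as an assumption to verify.

```haskell
import Control.Exception (bracket)
import qualified Data.ByteString as B
import System.IO

-- Hypothetical shared helper: read a whole file strictly, using
-- whichever open function the caller supplies.
readFileWith :: (FilePath -> IOMode -> IO Handle)
             -> FilePath -> IO B.ByteString
readFileWith open f = bracket (open f ReadMode) hClose B.hGetContents

-- What Data.ByteString could export as readFile: raw bytes, with no
-- line ending conversion on any platform.
readFileBinary :: FilePath -> IO B.ByteString
readFileBinary = readFileWith openBinaryFile

-- What Data.ByteString.Char8 could export as readFile: the handle is
-- opened in text mode. ASSUMPTION: the runtime applies the OS's line
-- ending conversion to this kind of read; this is not guaranteed.
readFileText :: FilePath -> IO B.ByteString
readFileText = readFileWith openFile
```

writeFile and appendFile would follow the same pattern with WriteMode
and AppendMode.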

Duncan


