I/O overhead in opening and writing files

J Baptist arc38813 at hotmail.com
Mon Aug 27 22:43:40 CEST 2012


I'm looking into high-performance I/O, particularly on a tmpfs (in-memory) filesystem. This involves creating lots of little files. Unfortunately, it seems that Haskell's performance in this area is not comparable to that of C. I assume that this is because of the overhead involved in opening and closing files. Some cursory profiling confirmed this: most of the runtime of the program is taken by openFile, hPutStr, and hClose.

I thought that it might be faster to call the C library functions exposed as foreign imports in System.Posix.Internals, and thereby cut out some of Haskell's overhead. This did improve performance, but the program is still nearly twice as slow as the corresponding C program.

I took some benchmarks. I wrote a program that creates 500,000 files on a tmpfs filesystem and writes an integer into each of them. I did this once in C, using open and friends, and twice in Haskell, using openFile and c_open. Here are the results:

C program, using open and friends (gcc 4.4.3):

real    0m4.614s
user    0m0.380s
sys     0m4.200s

Haskell, using System.IO.openFile and friends (ghc 7.4.2):

real    0m14.892s
user    0m7.700s
sys     0m6.890s

Haskell, using System.Posix.Internals.c_open and friends (ghc 7.4.2):

real    0m7.372s
user    0m2.390s
sys     0m4.570s

My question is: why is this so slow? Could the culprit be the marshalling necessary to pass the parameters to the foreign functions? If I'm calling the low-level function c_open anyway, shouldn't performance be closer to C? Does anyone have suggestions for how to improve this?

If anyone is interested, I can provide the code I used for these benchmarks.
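Since the benchmark code itself isn't attached, here is a minimal sketch of the System.IO variant described above, assuming a tmpfs mount point at /dev/shm/bench; the directory, file naming scheme, and count are illustrative and not the poster's actual code:

import System.IO
import System.Directory (createDirectoryIfMissing)
import Control.Monad (forM_)

main :: IO ()
main = do
  let dir = "/dev/shm/bench"   -- assumed tmpfs mount point, not from the original post
      n   = 500000 :: Int      -- number of small files, as in the benchmark
  createDirectoryIfMissing True dir
  -- open, write a small integer, and close each file in turn,
  -- so the per-file open/close overhead dominates the runtime
  forM_ [1 .. n] $ \i -> do
    h <- openFile (dir ++ "/" ++ show i) WriteMode
    hPutStr h (show i)
    hClose h

Each iteration pays the full cost of openFile, hPutStr, and hClose, which is the overhead the profiling above points at.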