openFile and threads
Matthias Neubauer
neubauer@informatik.uni-freiburg.de
13 Jan 2003 17:47:14 +0100
"Simon Marlow" <simonmar@microsoft.com> writes:
> > > You might consider bypassing the Handle interface and going to the
> > > bare metal using the Posix library, which will cut down on the
> > > overhead in openFile.
> >
> > That's what I was fearing. Is the conversion from Haskell Strings to
> > C strings a performance problem?
>
> Haskell Strings are a common performance bottleneck; for example when
> serving files in the Haskell web server I avoided the conversion to
> Haskell Strings altogether by reading/writing arrays of bytes (see the
> paper for details).
I was curious to see if this is also the case here. So I pasted the
GHC implementation of openFile into Peter's suspicious module, to be
able to profile the GHC-internal openFile code as well. I took
'openFile' from
http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/base/GHC/Handle.hs
(I hope this was the right one?). Here are the relevant parts of the
profiler's output:
COST CENTRE     MODULE     %time %alloc
withCString'    MailStore   39.1   19.7
f1              MailStore   26.1   40.9
f9              MailStore   21.7    8.8
getBuffer       MailStore    4.3    0.1
f6.2            MailStore    4.3    4.0
f6              MailStore    4.3    2.3
f6.3            MailStore    0.0    1.6
allocateBuffer  MailStore    0.0   19.4
...
                                                    individual    inherited
COST CENTRE                MODULE     no. entries  %time %alloc  %time %alloc
f6.1                       MailStore  361       0    0.0    0.1   43.5   41.2
 openFile                  MailStore  362    1154    0.0    0.1   43.5   41.1
  openFile'                MailStore  365    1154    0.0    0.0   43.5   40.9
   withCString'            MailStore  367       0   39.1   19.7   39.1   19.7
   openFd                  MailStore  371    1154    0.0    0.7    4.3   20.9
    mkFileHandle           MailStore  372    1154    0.0    0.3    4.3   20.2
     initBufferState       MailStore  387    1154    0.0    0.0    0.0    0.0
     newFileHandle         MailStore  376    1154    0.0    0.1    0.0    0.3
      handleFinalizer      MailStore  377       0    0.0    0.1    0.0    0.2
     flushWriteBufferOnly  MailStore  389    1154    0.0    0.0    0.0    0.0
     getBuffer             MailStore  373    1154    4.3    0.1    4.3   19.6
      allocateBuffer       MailStore  374    1154    0.0   19.4    0.0   19.5
       newEmptyBuffer      MailStore  375       0    0.0    0.1    0.0    0.1
...
The cost centre "f6.1" is the location of the recurring call of
"openFile". As you can see, almost all of the time spent below it
(39.1 of the 43.5 %time inherited by "f6.1") goes to the function
"withCString" (listed as withCString' in the profile), which
translates the Haskell strings representing the file names to their
C representation.
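
For reference, a cost centre such as "f6.1" can be attached to a single
call site with an SCC pragma. The sketch below is only an illustration:
the post does not show how the cost centres in MailStore were declared,
and the names openRepeatedly and "open_call" as well as the temporary
file are made up for the example.

```haskell
import Control.Monad (replicateM_)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO (IOMode (ReadMode), hClose, openFile, openTempFile)

-- Hypothetical stand-in for the loop in MailStore:
-- open and close the same file n times.
openRepeatedly :: Int -> FilePath -> IO ()
openRepeatedly n path = replicateM_ n $ do
  -- The pragma names a cost centre for exactly this call; without
  -- profiling enabled it is ignored.
  h <- {-# SCC "open_call" #-} openFile path ReadMode
  hClose h

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  (path, h0) <- openTempFile tmp "scc-demo.txt"  -- scratch file to open
  hClose h0
  openRepeatedly 1154 path  -- same count as the openFile entries above
  removeFile path
  putStrLn "done"
```

Compiled with "ghc -prof" and run with "+RTS -p", this should write a
.prof file containing an "open_call" entry. Since the annotation sits
on the library's openFile, nothing below that call is broken down any
further, which is the reason for pasting the openFile source into the
module being profiled.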
I knew that Haskell strings are bad, but I really did not expect them
to cause such a huge time penalty ...
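
To make the marshalling step concrete, here is a minimal sketch (not
the GHC code) of the difference between converting a String on every
call and converting it once. It uses withCString, newCString and
peekCString from Foreign.C.String and free from Foreign.Marshal.Alloc;
useName, marshalEachTime, marshalOnce, roundTrip and the file name are
hypothetical, and no file is actually opened.

```haskell
import Control.Monad (replicateM_)
import Foreign.C.String (CString, newCString, peekCString, withCString)
import Foreign.Marshal.Alloc (free)
import Foreign.Ptr (nullPtr)

-- Hypothetical stand-in for a C call such as open(2);
-- it only looks at the pointer it is given.
useName :: CString -> IO Bool
useName p = return (p /= nullPtr)

-- Marshal on every call, as a String-based openFile has to:
-- withCString walks the list of Chars, copies it into a temporary
-- buffer, runs the action and releases the buffer again.
marshalEachTime :: Int -> String -> IO ()
marshalEachTime n name = replicateM_ n (withCString name useName)

-- Marshal once and reuse the buffer for all calls.
-- (Not exception safe; a real version would use bracket.)
marshalOnce :: Int -> String -> IO ()
marshalOnce n name = do
  p <- newCString name
  replicateM_ n (useName p)
  free p

-- Convert to a C string and back again.
roundTrip :: String -> IO String
roundTrip s = withCString s peekCString

main :: IO ()
main = do
  let name = "Mail/inbox/cur/0001.msg"  -- made-up file name
  marshalEachTime 1154 name
  marshalOnce 1154 name
  back <- roundTrip name
  print (back == name)
```

Marshalling once only helps when the same name is used repeatedly. If
every call gets a different name, the conversion has to happen for each
of them, and the remaining option is the one Simon describes: keep the
data as bytes and avoid Strings in the first place.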
Cheers,
Matthias
--
Matthias Neubauer |
Universität Freiburg, Institut für Informatik | tel +49 761 203 8060
Georges-Köhler-Allee 79, 79110 Freiburg i. Br., Germany | fax +49 761 203 8052