[Haskell-cafe] Re: Optimizing locking with MVars

Simon Marlow simonmarhaskell at gmail.com
Fri May 5 05:52:00 EDT 2006


Bulat Ziganshin wrote:

> Wednesday, May 3, 2006, 3:07:19 PM, you wrote:
> 
> 
>>I propose to add INLINE pragmas to withMVar and friends.
> 
> 
> and to any higher-order function? :)  imho, key of this problem is
> that GHC don't have a way to optimize just the required function with
> all it's enclosed calls. in old GHC papers there is an "inline"
> function that does this (afai understood) but it is no more supported
> 
> problem is not that withMVar should be inlined in any case of its use.
> problem is what i need to inline it's use in one concrete place of my
> concrete library and there is no way to force GHC do this beside of
> copying it's implementation! i think that GHC should just give more
> priority to inlining higher-order and dictionaries-dependent functions
> (afaiu, currently they are weighted by the same rules as ordinary
> functions)

Giving the programmer more control over which call sites are inlined 
would be a good thing.  However, it only improves code size, not 
performance, compared with just using INLINE.

GHC has a complicated heuristic to decide whether to inline a function 
or not, taking into account whether the arguments look "interesting" or 
not.  An interesting argument valiue (eg. a function or a constructor) 
might lead to more transformations when the function is inlined, so the 
decision is weighted in favour of that.

See the inlining paper:
   http://www.haskell.org/~simonmar/papers/inline.pdf

There are definitely cases where GHC doesn't inline and it would be 
beneficial to do so.  I'd just like to point out that the current 
heuristics weren't arrived at by accident - we did lots of measurements 
of the effect of different values for the tuning parameters on the nofib 
benchmark suite, and the result is a reasonable compromise between 
performance and code size.

> and the second part of problem is what it can't be decided at the
> place of "withMVar" definition whether it should be inlined or not -
> this depends in the first place on the call site. if this is the code
> that performed just one time, we should decide on the space basis. if
> this is the innermost loop, we should inline even largest functions
> (especially if they are higher-order/polymorphic ones). but the
> compiler don't know how much times some code fragment will be
> performed (obviously it's place for future integration between
> optimizer and profiler...) so, i think, it should generally optimize
> program on code space and only for procedures marked as OPTIMIZE it
> should enable all the inlining machinery, trying to make code
> generated for this procedure (including all the inner calls) as fast
> as possible. this should allow programmer to just mark speed-critical
> procedures and will make library writers absolutely free from dancing
> with INLINE pragmas.

annotating code fragments for "optimise readly hard" would be a good 
thing, yes.  Of course, you can always put those code fragments in a 
separate module, and compile it with -O2 -funfolding-use-threshold100 
-fliberate-case-threshold100 etc.



>>This won't affect Handle I/O unfortunately, because we need block to 
>>protect against asynchronous exceptions.
> 
> 
> i think that it is not accidental that hPutChar/hGetChar is exactly 3
> and 2 times slower than their v* analogs. my routines use only one
> locked MVar and i think that Handle routines used 3 and 2 mvars,
> respectively. i thought that they call withMVar directly, but it is
> not the case. but may be great blocks that they are perform inside
> "block" operations are still not inlined and therefore suffers from
> the same problem as withMVar? i guess that adding INLINE pragma to
> block definition may speed up their execution (in 52/38 ~= 1.4 times),
> but possibly at the cost of code size. anyway, Handle I/O is already too
> complicated thing to worry about it's further optimizations :(

'block' is trivial, it is already inlined.

block (IO io) = IO $ blockAsyncExceptions# io
unblock (IO io) = IO $ unblockAsyncExceptions# io

> thank, it's sort of critics what i very need. the total checking seems
> to rather complex problem so i will decline it until working on next
> release. the good news is what my buffering scheme updates only one
> unboxed pointer (current read/write position inside buffer) on all
> operations that can be performed using only buffer contents, so it
> seems that i don't need to use "block" to protect such operations
> until routine realizes that there is need to switch to other buffer.
> btw, i've changed default buffer size to 4kb
> 
> in next version i will work on this problem. i think that it should be
> described as some contract between library authors (including authors
> of 3rd-party streams and transformers) and users, in this form:
> 
> if stream operation abandoned because of asynchronous interrupt,
> internal bug or error in input data:
> 
> - it should leave stream in coherent state (f.e., it should not leave
> vIsEOF=True and vTell=0 for non-empty-file)
> 
> - it should not have "side-effects", f.e. changing position if whole
> operation (such as vShow) don't have such effect
> 
> - operation can be performed only partially (i.e. vSeek operation can
> leave pointer in unpredictable place, vPutStr can output only part of
> string, vGetBuf may fill only part of buffer)
> 
> - if possible, operation implementation should give to async.
> exceptions chances to interrupt execution with rather small
> granularity (i.e. don't write 100 mb in one C call, but split it to 4
> kb blocks)
> 
> what you will say?

I think you can give stronger guarantees than these.  eg.  Seek either 
succeeds or it doesn't - if it raises an exception (including an async 
exception) the seek never happened.  Partial reads/writes before an 
exception are unavoidable, though.  Unfortunately the caller will have 
no way to tell how much of the operation completed before the exception 
was raised, unless you provide a secondary call to obtain this 
information, or pass a mutable counter to the operation.

Cheers,
	Simon


More information about the Haskell-Cafe mailing list