[Haskell-cafe] RE: [Haskell] reader-like IO, parents of threads

Simon Marlow simonmar at microsoft.com
Wed Oct 19 05:41:09 EDT 2005


Certainly, adding hooks to forkIO is possible.  Also, providing thread
local data directly is something we could do quite easily (and
efficiently) in GHC.

Cheers,
	Simon

On 18 October 2005 19:12, Frederik Eaton wrote:

> What about adding support for hooks in forkIO? These could be useful
> for other things as well. Pthreads could be said to have this
> functionality:
> 
>  -- Function: int pthread_atfork (void (*PREPARE)(void), void
>           (*PARENT)(void), void (*CHILD)(void))
>      `pthread_atfork' registers handler functions to be called just
>      before and just after a new process is created with `fork'. The
>      PREPARE handler will be called from the parent process, just
>      before the new process is created. The PARENT handler will be
>      called from the parent process, just before `fork' returns. The
>      CHILD handler will be called from the child process, just before
>      `fork' returns.
> 
> As well as:
> 
>  -- Function: void pthread_cleanup_push (void (*ROUTINE) (void *),
>           void *ARG)
>      `pthread_cleanup_push' installs the ROUTINE function with
>      argument ARG as a cleanup handler. From this point on to the
>      matching `pthread_cleanup_pop', the function ROUTINE will be
>      called with arguments ARG when the thread terminates, either
>      through `pthread_exit' or by cancellation. If several cleanup
>      handlers are active at that point, they are called in LIFO
>      order: the most recently installed handler is called first.
> 
> Of course, 'fork' has a bit of a different meaning in pthreads. I
> don't know if there is support for handlers which are run when a new
> thread is created.
> 
> (Pthreads also has support for "thread-specific data":
> 
>     -- Function: int pthread_setspecific (pthread_key_t KEY, const
>              void *POINTER)
>         `pthread_setspecific' changes the value associated with KEY
>         in the calling thread, storing the given POINTER instead.
> 
>         If there is no such key KEY, it returns `EINVAL'.  Otherwise
>         it returns 0.
> 
>     -- Function: void * pthread_getspecific (pthread_key_t KEY)
>         `pthread_getspecific' returns the value currently associated
>         with KEY in the calling thread.
> 
>         If there is no such key KEY, it returns `NULL'.
> )
> 
> Regards,
> 
> Frederik
> 
> On Tue, Oct 18, 2005 at 11:47:29AM +0100, Simon Marlow wrote:
>> It seems that you can do this as long as you provide your own
>> version of forkIO, but not if you want to use the built-in forkIO.
>> 
>> One could argue that getting the parent ThreadId is something that
>> should be supported natively by forkIO, and I might be inlined to
>> agree. Unfortunately there are some subtleties: currently a ThreadId
>> is represented by a pointer to the thread itself, which causes the
>> thread to be kept alive.  This has implications not only for space
>> leaks, but also for reporting deadlock: if you have a ThreadId for a
>> thread, you can send it an exception with throwTo at any time, and
>> hence the runtime can never determine that the thread is deadlocked
>> so it will never get the NonTermination exception.  Perhaps we need
>> two kinds of ThreadId: a weak one for use in Maps, and a strong one
>> that you can use with throwTo.  But then building a Map in which
>> some elements can be garbage collected is a bit tricky (it can be
>> done though; see our old Memo table implementation in
>> fptools/hslibs/util/Memo.hs). 
>> 
>> Cheers,
>> 	Simon
>> 
>> On 16 October 2005 20:53, Frederik Eaton wrote:
>> 
>>> John Meacham suggested that I should be a little more clear about
>>> the semantics I'm seeking. Also, apparently it isn't possible to
>>> implement writeTLRef/modifyTLRef with the data structure I gave:
>>> 
>>>> data TLRef a = TLR a (MVar (Map ThreadId a))
>>> (the first argument is a default value, the second is a map storing
>>> the values in each thread. The MVar is for safe concurrent access)
>>> 
>>> Without those functions, it looks a little more like the Reader
>>> monad I'm comparing it to. 
>>> 
>>> - What happens on fork? The child thread effectively gets a "copy"
>>> of each TLRef in its parent. They have the same values, but
>>> modifying them using withTLRef has no effect on the values in other
>>> threads. 
>>> 
>>> - Can you pass a TLRef to a different thread? Yes, but the value it
>>> holds will not be the same when it is dereferenced in a different
>>> thread. 
>>> 
>>> The problem with writeTLRef is that if a child thread looks up the
>>> default value for an unbound reference by looking up the value in
>>> its parent, but after calling forkIO the parent changes the value
>>> with writeTLRef, then the child thread will get the wrong value. It
>>> is supposed to only see the value which was stored in the reference
>>> at the point where forkIO was called.
>>> 
>>> Also, for this reason, I think withTLRef would have to be
>>> implemented by creating a separate thread with forkIO and waiting
>>> for it to finish. This would avoid overwriting a value which other
>>> child threads might still need to access. 
>>> 
>>> Note that an e.g. "myParentThreadId" function isn't enough - what
>>> is needed is a 
>>> 
>>> parentThreadId :: ThreadId -> IO (Maybe ThreadId)
>>> 
>>> which can look up the parent of an arbitrary thread.
>>> 
>>> Alternatively, if 'forkIO' supported hooks to run before and/or
>>> after forking, then a 'parentThreadId' function could be
>>> implemented from that. 
>>> 
>>> Frederik
>>> 
>>> On Sun, Oct 16, 2005 at 04:40:40AM -0700, Frederik Eaton wrote:
>>>> Hi,
>>>> 
>>>> I'm trying to get MonadReader-like functionality in the IO monad.
>>>> It doesn't appear possible implement it with the interfaces that
>>>> Haskell98 or GHC provide. I'm looking for something like
>>>> "thread-local variables". The interface could be something like
>>>> this: 
>>>> 
>>>> newTLRef :: a -> IO (TLRef a)
>>>> withTLRef :: TLRef a -> a -> IO b -> IO b
>>>> readTLRef :: TLRef a -> IO a
>>>> writeTLRef :: TLRef a -> a -> IO ()
>>>> modifyTLRef :: TLRef a -> (a -> a) -> IO ()
>>>> 
>>>> This would have a lot of uses. I am aware of the "Implicit
>>>> Configurations" paper by Kiselyov and Shan, but a solution such as
>>>> theirs which requires modifying the type signatures of all
>>>> intermediate function calls is not suitable. I want to be able to
>>>> say "run algorithm A using database D" without requiring all of the
>>>> functions in algorithm A to know that databases are somehow
>>>> involved. One way to look at it is that I am seeking something
>>>> like the type-based approach, but easier and with less explicit
>>>> syntax; another way to look at it is that I am seeking something
>>>> like a global IORef based approach, but more safe.
>>>> 
>>>> An implementation based on ThreadId-keyed maps is almost workable:
>>>> 
>>>> data TLRef a = TLR a (MVar (Map ThreadId a))
>>>> 
>>>> The problem with this is that while it is possible to find out the
>>>> ThreadId of the current thread, it doesn't appear to be possible to
>>>> get the ThreadId of the parent thread, which would be needed for
>>>> values to be properly inherited.
>>>> 
>>>> Is there a way around this? Will there ever be standard support for
>>>> either finding the thread id of the parent of the current thread,
>>>> or for something like the thread-local references I have proposed?
>>>> 
>>>> Thanks,
>>>> 
>>>> Frederik
>>>> _______________________________________________
>>>> Haskell mailing list
>>>> Haskell at haskell.org
>>>> http://www.haskell.org/mailman/listinfo/haskell
>>>> 
>>> _______________________________________________
>>> Haskell mailing list
>>> Haskell at haskell.org
>>> http://www.haskell.org/mailman/listinfo/haskell
>> 
>> _______________________________________________
>> Haskell mailing list
>> Haskell at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell



More information about the Haskell-Cafe mailing list