[Haskell] thread-local variables

Fri Aug 4 17:19:20 EDT 2006

On 04.08 17:29, Frederik Eaton wrote:
> might, in Adrian Hey and Einar Karttunen's world, become:
> 
> newMain host environment program_args
>     network_config locale terminal_settings
>     stdin stdout stderr = do
>     hPutStrLn stdout (defaultEncoding locale) "Hello world"

Actually I have implemented network libraries, and I don't remember
them requiring such things ;-)

I think our main difference is that when designing concurrent
applications in Haskell I frequently use monadic actions as
callbacks invoked in distant, unrelated threads. I see the threading
behind an API as mostly an implementation detail, as long as the
service guarantees don't change.

You seem to use threads in a much more constrained fashion (my own
interpretation) which results in us seeing TLS from very different
perspectives.

> Maybe I'm misunderstanding your position - maybe you think that I
> should use lots of different processes to segregate global state into
> separate contexts? Well, that's nice, but I'd rather not. For
> instance, I'm writing a server - and it's just not efficient to use a
> separate process for each request. And there are some things such as
> database connections, current user id, log files, various profiling
> data, etc., that I would like to be thread-global but not
> process-global.

I have done many servers in Haskell. Usually I have threads allocated
to specific tasks rather than specific requests.

What guarantees does your code have that all the relevant parameters
are already initialized, and how can a user of the code know
which TLS variables need to be initialized? If that is documented,
maybe it could be done at the level of an implicit parameter?
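
As a rough illustration of the implicit-parameter idea, the sketch
below uses GHC's ImplicitParams extension so that the context a
function needs shows up in its type. The names ?userId, ?logPrefix,
describeRequest and handleRequest are hypothetical and are not taken
from any code discussed in this thread.

```haskell
{-# LANGUAGE ImplicitParams #-}

-- Hypothetical request context: the constraint lists exactly which
-- pieces of context must be bound before this function can be called.
describeRequest :: (?userId :: Int, ?logPrefix :: String) => String -> String
describeRequest msg = ?logPrefix ++ " user=" ++ show ?userId ++ ": " ++ msg

-- The caller binds the context once; forgetting either binding is a
-- compile-time error rather than a missing value at run time.
handleRequest :: String -> String
handleRequest msg =
  let ?userId    = 42
      ?logPrefix = "[req]"
  in describeRequest msg
```

With this shape, "which variables need to be initialized" is answered
by the type signature instead of by documentation.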

> Or maybe you think that certain types of global state should be
> privileged - for instance, that all of the things which are arguments
> to 'newMain' above are OK to have as global state, but that anything
> else should be passed as function arguments, thus making
> thread-localization moot. I disagree with this - I am a proponent of
> extensibility, and think that the language should make as few things
> as possible "built-in". I want to define my own application-specific
> global state, and, additionally, I want to have it thread-global, not
> process-global.

This can cause much fun with the FFI. If we change e.g. stdout to
be thread-specific, what should we do before each foreign call? The
same goes for the other things that are tied to the OS process in
question.

A thread is a context of execution while a process is a context for
resources. Would you like to have multiple Haskell processes inside
one OS process?

I don't consider these very different:
1) use one thread from a pre-allocated pool to do a task
2) fork a new thread to do the task

With TLS they are vastly different.
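
A small sketch of why the choice of thread starts to matter. It does
not use a real TLS facility (none is assumed to exist); it simulates
one with a map keyed by ThreadId. The names Store, setLocal, getLocal,
runInline, runInNewThread and demo are all made up for this example.

```haskell
import Control.Concurrent (ThreadId, forkIO, myThreadId)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Toy stand-in for thread-local storage: one value per ThreadId.
type Store = IORef (Map.Map ThreadId String)

-- Bind a value for the calling thread only.
setLocal :: Store -> String -> IO ()
setLocal store v = do
  tid <- myThreadId
  atomicModifyIORef' store (\m -> (Map.insert tid v m, ()))

-- Read the calling thread's value, or a placeholder if it has none.
getLocal :: Store -> IO String
getLocal store = do
  tid <- myThreadId
  Map.findWithDefault "<unset>" tid <$> readIORef store

-- Two ways a service might run a callback behind its API.
runInline :: IO a -> IO a
runInline = id

-- Sketch only: if the action throws, the caller blocks on the MVar.
runInNewThread :: IO a -> IO a
runInNewThread act = do
  box <- newEmptyMVar
  _ <- forkIO (act >>= putMVar box)
  takeMVar box

-- The same callback sees different state depending on where it runs.
demo :: IO (String, String)
demo = do
  store <- newIORef Map.empty
  setLocal store "alice"
  let callback = getLocal store
  here  <- runInline callback
  there <- runInNewThread callback
  return (here, there)
```

Here demo returns ("alice", "<unset>"): the callback is identical, but
moving it to another thread silently changes what it observes.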

> You asked for an example, but, because of the nature of this topic, it
> would have to be a very large example to prove my point. Thread-local
> variables are things that only become really useful in large programs. 
> Instead, I've asked you to put yourself in my shoes - what if the bits
> of context that you already take for granted in your programs had to
> be thread-local? How would you cope, without thread-local variables,
> in such a situation?

I have been using an application-specific monad (a newtyped
transformer) and a clean set of functions, so that the implementation
is not hardcoded and can be changed easily. Thus I haven't had the
same difficulties as you.
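
A minimal sketch of what such an application-specific monad could
look like, assuming a ReaderT over IO from the transformers package.
The Env record, its fields, and the functions runApp, currentUserId,
withUserId and logLine are hypothetical names chosen for illustration,
not the code referred to above.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.IO.Class (MonadIO)
import Control.Monad.Trans.Reader (ReaderT, asks, local, runReaderT)

-- Hypothetical per-request context.
data Env = Env
  { envUserId    :: Int
  , envLogPrefix :: String
  }

-- The newtype hides the transformer stack, so callers depend only on
-- the small set of functions below, not on ReaderT itself.
newtype App a = App (ReaderT Env IO a)
  deriving (Functor, Applicative, Monad, MonadIO)

runApp :: Env -> App a -> IO a
runApp env (App m) = runReaderT m env

currentUserId :: App Int
currentUserId = App (asks envUserId)

-- Run a sub-computation with a different user id, like a scoped binding.
withUserId :: Int -> App a -> App a
withUserId uid (App m) = App (local (\e -> e { envUserId = uid }) m)

-- Build a log line from the ambient context.
logLine :: String -> App String
logLine msg = do
  prefix <- App (asks envLogPrefix)
  uid    <- currentUserId
  return (prefix ++ " user=" ++ show uid ++ ": " ++ msg)
```

For example, runApp (Env 1 "[srv]") (withUserId 7 (logLine "hi"))
yields "[srv] user=7: hi". Because the context travels with the
action rather than with the thread, it is unaffected by which thread
ends up running it.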

I don't think many of the process global resources would make sense
on a per-thread basis and I am not against all global state.

> > But I would say that I think I would find having to know what thread
> > a particular bit of code was running in in order to "grok it" very
> > strange,
> 
> I agree that it is important to have code which is easy to understand.
> 
> Usually, functions run in the same thread as their caller, unless they
> are passed to something with the word 'fork' in the name. That's a
> good rule of thumb that is in fact sufficient to let you understand
> the code I write. Also, if that's too much to remember, then since I'm
> only proposing and using non-mutable thread-local state (i.e. it
> behaves like a MonadReader), and since I'm not passing actions between
> threads as Einar is, then you can forget about the 'fork' caveat.

The only problem appears when someone uses two libraries, one written
by me and another written by you, and wonders "why is my program
failing in mysterious ways".

> I think the code would in fact be more difficult to "grok", if all of
> the things which I want to be thread-local were instead passed around
> as parameters, a la 'newMain'. This is simply because, in that
> scenario, there would be much more code to read, and it would be very
> repetitive. If I used special monads for my state, then the situation
> would be only slightly better - a single monad would not suffice, and
> I'd be faced with a plethora of 'lift' functions and redefinitions of
> 'catch', as well as long type signatures and a crowded namespace.

As I said before, the monadic approach can be quite clean. I haven't
used implicit parameters that much, so I won't comment on them.

> > unless there was some obvious technical reason why the
> > thread local state needed to be thread local (can't think of any
> > such reason right now).
> 
> Some things are not immediately obvious. If you don't like to think of
> reasons, then just take my word for it that it would help me. A
> facility for thread-local variables would be just another of many
> facilities that programmers could choose from when designing their
> code. I'm not asking you to change the way you program - I don't care
> how other people program. I trust them to know what is best for their
> particular application. It's none of my business, anyway.

I think we can agree to disagree on whether they are a good idea :-)

Mainly I am concerned with the ability to share and reuse code
between different Haskell projects. We really don't want to make
it hard to combine libraries because one uses threads heavily
and the other one uses TLS. I think this is the most important
issue.

> Since Simon Marlow said that he had been considering a thread-local
> variable facility, I merely wanted to voice my support:
> 
> http://www.mail-archive.com/haskell@haskell.org/msg18398.html
> 
> It seems that there are enough resources to implement one. The
> discussion should not be about "do we allow this" but rather "what
> should the API be".

There is nothing stopping you from implementing them.

To make it type-safe maybe something like:

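{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
-- 'name' is a type-level tag; the fundep fixes the value type per tag.
data Proxy a = Proxy
class TLSVar name ty | name -> ty
getTLS  :: TLSVar name ty => Proxy name -> IO ty
-- Run an action with the variable bound to the given value.
withTLS :: TLSVar name ty => Proxy name -> ty -> IO a -> IO a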

But I don't have strong feelings about the API as I would
probably not use it.
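
One way the signatures above could be given bodies, purely as a
sketch. It assumes two class methods that are not part of the
proposal (tlsStore and tlsDefault), keeps values in a map keyed by
ThreadId, and uses the unsafePerformIO global-IORef idiom for the
store. CurrentUserId and currentUserStore are hypothetical names. A
forked thread does not inherit its parent's value here.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (bracket_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map
import System.IO.Unsafe (unsafePerformIO)

data Proxy a = Proxy

-- Assumption: each variable supplies its own store and a default value.
class TLSVar name ty | name -> ty where
  tlsStore   :: Proxy name -> IORef (Map.Map ThreadId ty)
  tlsDefault :: Proxy name -> ty

-- Look up the calling thread's value, falling back to the default.
getTLS :: TLSVar name ty => Proxy name -> IO ty
getTLS p = do
  tid <- myThreadId
  Map.findWithDefault (tlsDefault p) tid <$> readIORef (tlsStore p)

-- Bind a value for the calling thread while the action runs, then
-- restore the previous binding (or remove it), even on exceptions.
withTLS :: TLSVar name ty => Proxy name -> ty -> IO a -> IO a
withTLS p v act = do
  tid <- myThreadId
  old <- Map.lookup tid <$> readIORef (tlsStore p)
  let set x = atomicModifyIORef' (tlsStore p)
                (\m -> (Map.alter (const x) tid m, ()))
  bracket_ (set (Just v)) (set old) act

-- Hypothetical variable: a type-level tag plus its instance.
data CurrentUserId

{-# NOINLINE currentUserStore #-}
currentUserStore :: IORef (Map.Map ThreadId Int)
currentUserStore = unsafePerformIO (newIORef Map.empty)

instance TLSVar CurrentUserId Int where
  tlsStore _   = currentUserStore
  tlsDefault _ = 0
```

With p = Proxy :: Proxy CurrentUserId, withTLS p 7 (getTLS p) returns
7, while getTLS p outside the scope returns the default 0.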

- Einar Karttunen