[Haskell-cafe] Estimating the time to garbage collect

Neil Davies semanticphilosopher at googlemail.com
Sat May 2 03:34:54 EDT 2009


Yes, you've got the problem domain. I don't have to deliver responses
to stimuli all the time within a bound, but I need to supply some
probability for that figure.
That problem domain is everywhere - all that varies is the bound on the time
and the probability of meeting it.

'Hard real time' systems are very expensive to build and, typically, make
very low utilisation of resources and have interesting failure modes when
timing stops being be met. Meeting strict timing constraints is becoming
more difficult as processors become more complex (think multi-level caching,
clock rates that vary with temperature and/or load) and when those systems
use packet based multiplexed as their interconnect (time slotted shared bus
being too expensive).

Yes, the proof obligations are more challenging, no more ability to
enumerate the complete state space and prove that the schedule can always be
met, no more 'certainty' that events and communications will occur within a
fixed time. Interestingly giving up that constraint may well have its
up-side, it was being used as a design 'crutch' - possibly being leaned on
too heavily. Having to explicitly consider a probability distribution
appears to create more robust overall systems.

On the flip side, this more stochastic approach has to work - the commercial
trends in wide area networking mean things are getting more
stochastic, deterministic timings for wide are communications will be a
thing of the past in 10 - 15 years (or prohibitively expensive). This is
already worrying people like electricity distribution folks - their control
systems are looking vulnerable to such changes and the issue of
co-ordination electricity grids is only going to get more difficult as the
number of generations sources increase, as is inevitable.

Perhaps this is too much for a Saturday morning, sunny one at that....

Neil

2009/5/1 John Van Enk <vanenkj at gmail.com>

> I think the problem becomes slightly easier if you can provide an upper
> bound on the time GC will take. If I understand your problem domain, Neil,
> you're most concerned with holding up other processes/partitions who are
> expecting to have a certain amount of processing time per frame. If we can
> give an upper bound to the GC time, then we can plan for it in the schedule
> without upsetting the other processes.
>
> I don't have an answer (though I'd love one), but I do think that asking
> for an upper bound substantially simplifies the problem (though, I could be
> wrong) and still gives you the characterisics you need to give a 'time to
> complete'.
>  /jve
> On Fri, May 1, 2009 at 4:14 AM, Neil Davies <
> semanticphilosopher at googlemail.com> wrote:
>
>> Hi
>>
>> With the discussion on threads and priority, and given that (in Stats.c)
>> there are lots of useful pieces of information that the run time system is
>> collecting, some of which is already visible (like the total amount of
>> memory mutated) and it is easy to make other measures available - it has
>> raised this question in my mind:
>>
>> Given that you have access to that information (the stuff that comes out
>> at the end of a run if you use +RTS -S) is it possible to estimate the time
>> a GC will take before asking for one?
>>
>> Ignoring, at least for the moment, all the issues of paging, processor
>> cache occupancy etc, what are the complexity drivers for the time to GC?
>>
>> I realise that it is going to depend on things like, volume of data
>> mutated, count of objects mutated, what fraction of them are live etc - and
>> even if it turns out that these things are very program specific then I have
>> a follow-on question - what properties do you need from your program to be
>> able to construct a viable estimate of GC time from a past history of such
>> garbage collections?
>>
>> Why am I interested? There are all manners of 'real time' in systems,
>> there is a vast class where a statistical bound (ie some sort of 'time to
>> complete' CDF) is more than adequate for production use. If this is possible
>> then it opens up areas where all the lovely properties of haskell can be
>> exploited if only you had confidence in the timing behaviour.
>>
>> Cheers
>> Neil
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe at haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
>
>
> --
> /jve
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20090502/a824797d/attachment.htm


More information about the Haskell-Cafe mailing list