[Git][ghc/ghc][wip/gc/preparation] 18 commits: gitlab: Add merge request template

Ben Gamari gitlab at gitlab.haskell.org
Fri May 31 13:17:30 UTC 2019



Ben Gamari pushed to branch wip/gc/preparation at Glasgow Haskell Compiler / GHC


Commits:
5ebcfc04 by Ben Gamari at 2019-01-21T23:05:52Z
gitlab: Add merge request template

This begins to define our expectations of contributions.

[skip-ci]

- - - - -
7262a815 by Ben Gamari at 2019-01-21T23:06:30Z
Add CODEOWNERS

GitLab uses this file to suggest reviewers based upon the files that a Merge
Request touches.

[skip-ci]

- - - - -
64ce6afa by Samuel Holland at 2019-01-21T23:28:38Z
Extend linker-script workaround to work with musl libc

GHC has code to handle unsuffixed .so files that are linker scripts
pointing to the real shared library. The detection is done by parsing
the result of `dlerror()` after calling `dlopen()` and looking for
certain error strings. On musl libc, the error message is "Exec format
error", which happens to be `strerror(ENOEXEC)`:

```
$ cat tmp.c
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
        dlopen("libz.so", RTLD_NOW | RTLD_GLOBAL);
        puts(dlerror());
        return 0;
}
$ gcc -o tmp tmp.c
$ ./tmp
Error loading shared library libz.so: Exec format error
$
```

This change fixes the workaround to also work on musl libc.

Link: https://phabricator.haskell.org/D5474

- - - - -
a5373c1f by Simon Peyton Jones at 2019-01-22T08:02:20Z
Fix bogus worker for newtypes

The "worker" for a newtype is actually a function
with a small (compulsory) unfolding, namely a cast.

But the construction of this function was plain wrong
for newtype /instances/; it cast the argument to the
family type rather than the representation type.

This never actually bit us because, in the case of a
family instance, we immediately cast the result to
the family type.  So we get
   \x. (x |> co1) |> co2

where the composition of co1 and co2 is ill-kinded.
However, the optimiser (even the simple optimiser)
just collapsed those casts, ignoring the mismatch
in the middle, so we never saw the problem.

Trac #16191 is indeed a duplicate of #16141; but the reason
these tickets produce Lint errors is not the unnecessary
forcing; it's because of the ill-typed casts.

This patch fixes the ill-typed casts, properly.  I can't
see a way to trigger an actual failure prior to this
patch, but it's still wrong wrong wrong to have ill-typed
casts, so better to get rid of them.

- - - - -
92b30982 by Ben Gamari at 2019-02-22T00:55:25Z
rts/Schedule: Allow synchronization without holding a capability

The concurrent mark-and-sweep will be performed by a GHC task which will
not hold a capability. This is necessary to prevent the concurrent mark
from interfering with minor generation collections.

However, the major collector must synchronize with the mutators at the
end of marking to flush their update remembered sets. This patch extends
the `requestSync` mechanism used to synchronize garbage collectors to
allow synchronization without holding a capability.

This change is fairly straightforward as the capability was previously
only required for two reasons:

 1. to ensure that we don't try to re-acquire a capability that the
    sync requestor already holds.

 2. to provide a way to suspend and later resume the sync request if
    there is already a sync pending.

When synchronizing without holding a capability we needn't worry about
consideration (1) at all.

(2) is slightly trickier and may happen, for instance, when a capability
requests a minor collection and shortly thereafter the non-moving mark
thread requests a post-mark synchronization. In this case we need to
ensure that the non-moving mark thread suspends its request until after
the minor GC has concluded, to avoid deadlocking. For this we introduce
a condition variable, `sync_finished_cond`, which a
non-capability-bearing requestor will wait on and which is signalled
after a synchronization or GC has finished.
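
For illustration only, the waiting path can be sketched as a stand-alone
POSIX-threads analogue; the RTS uses its own Mutex/Condition primitives
(ACQUIRE_LOCK, waitCondition, signalCondition, visible in the Schedule.c diff
below) and a richer `pending_sync` structure, so the names and types here are
simplified placeholders:

```
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t sync_finished_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sync_finished_cond  = PTHREAD_COND_INITIALIZER;
static bool pending_sync = false;   /* stands in for the scheduler's pointer */

/* A requestor that holds no capability blocks here until the in-progress
 * sync has been signalled as finished. */
static void wait_for_pending_sync(void)
{
    pthread_mutex_lock(&sync_finished_mutex);
    while (pending_sync) {
        pthread_cond_wait(&sync_finished_cond, &sync_finished_mutex);
    }
    pthread_mutex_unlock(&sync_finished_mutex);
}

/* The thread that completed the sync (or GC) wakes any such waiters,
 * mirroring the signalCondition calls added to Schedule.c below. */
static void finish_sync(void)
{
    pthread_mutex_lock(&sync_finished_mutex);
    pending_sync = false;
    pthread_cond_broadcast(&sync_finished_cond);
    pthread_mutex_unlock(&sync_finished_mutex);
}
```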

- - - - -
ced51daa by Ben Gamari at 2019-02-22T00:55:36Z
rts: Factor out large bitmap walking

This will be needed by the mark phase of the non-moving collector
so let's factor it out.
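
As a rough stand-alone illustration of the callback style being factored out
(cf. walk_large_bitmap in rts/sm/HeapUtils.h further down; the names
walk_bitmap and count_slot are invented for this sketch, and the
clear-bit-means-pointer convention follows the RTS):

```
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Callback applied to every slot whose bitmap bit is clear, i.e. a pointer
 * slot in the RTS convention. */
typedef void (walk_cb)(void **slot, void *user);

static void walk_bitmap(walk_cb *cb, void **p,
                        const uint64_t *bitmap, size_t size, void *user)
{
    size_t b = 0;
    for (size_t i = 0; i < size; b++) {
        uint64_t bits = bitmap[b];
        size_t j = (size - i < 64) ? (size - i) : 64;
        i += j;
        for (; j > 0; j--, p++) {
            if ((bits & 1) == 0) {
                cb(p, user);
            }
            bits >>= 1;
        }
    }
}

/* Example callback: count the pointer slots (stands in for do_evacuate). */
static void count_slot(void **slot, void *user)
{
    (void)slot;
    (*(size_t *)user)++;
}

int main(void)
{
    void *payload[8] = {0};
    uint64_t bitmap[1] = { 0x5 };   /* bits 0 and 2 set => non-pointer slots */
    size_t n_ptrs = 0;
    walk_bitmap(count_slot, payload, bitmap, 8, &n_ptrs);
    printf("pointer slots: %zu\n", n_ptrs);   /* prints 6 */
    return 0;
}
```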

- - - - -
b5fb4210 by Ömer Sinan Ağacan at 2019-02-22T00:55:43Z
rts/BlockAlloc: Allow aligned allocation requests

This implements support for block group allocations which are aligned to
an integral number of blocks.

This will be used by the nonmoving garbage collector, which uses the
block allocator to allocate the segments which back its heap. These
segments are a fixed number of blocks in size, with each segment being
aligned to the segment size boundary. This allows us to easily find the
segment metadata stored at the beginning of the segment.
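
A small stand-alone sketch of the arithmetic this involves (allocate 2n - 1
blocks, then trim whole-block slop below and above the aligned group);
BLOCK_SIZE here is a placeholder value, not the one from
includes/rts/storage/Block.h:

```
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 4096u   /* placeholder; the real value lives in Block.h */

/* Given the start address of an over-sized allocation of (2*n - 1) blocks,
 * compute how many whole blocks of slop sit below the first boundary aligned
 * to n blocks, and how many remain above the aligned n-block group. */
static void aligned_group_slop(uintptr_t start, unsigned n,
                               unsigned *slop_low_blocks,
                               unsigned *slop_high_blocks)
{
    uintptr_t group_size = (uintptr_t)n * BLOCK_SIZE;
    uintptr_t slop_low = 0;
    if (start % group_size != 0)
        slop_low = group_size - (start % group_size);

    uintptr_t total = (uintptr_t)(2 * n - 1) * BLOCK_SIZE;
    uintptr_t slop_high = total - group_size - slop_low;

    *slop_low_blocks  = (unsigned)(slop_low  / BLOCK_SIZE);
    *slop_high_blocks = (unsigned)(slop_high / BLOCK_SIZE);
}

/* Because the group is aligned to its own size, the metadata at its start
 * can be recovered from any interior address: */
static uintptr_t group_base(uintptr_t p, uintptr_t group_size)
{
    return p - (p % group_size);
}

int main(void)
{
    unsigned lo, hi;
    aligned_group_slop(0x7f0000003000, 8, &lo, &hi);
    printf("low slop: %u blocks, high slop: %u blocks\n", lo, hi);  /* 5 and 2 */
    printf("base: %#lx\n",
           (unsigned long)group_base(0x7f0000009abc, 8 * BLOCK_SIZE));
    return 0;
}
```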

- - - - -
96e666e4 by Ömer Sinan Ağacan at 2019-02-22T01:00:01Z
rts/GC: Add an obvious assertion during block initialization

Namely ensure that block descriptors are initialized with valid
generation numbers.

- - - - -
6a97a4ea by Ben Gamari at 2019-02-22T01:00:01Z
rts: Add Note explaining applicability of selector optimisation depth limit

This was slightly non-obvious, so a Note seems warranted.

- - - - -
9f985ddf by Ben Gamari at 2019-02-22T01:00:01Z
rts/Capability: A few documentation comments

- - - - -
1f81ed5a by Ömer Sinan Ağacan at 2019-02-22T01:00:01Z
rts/Printer: Introduce a few more printing utilities

These include printLargeAndPinnedObjects, printWeakLists, and
printStaticObjects. These are generally useful things to have.

- - - - -
88cbcfaa by Ben Gamari at 2019-02-22T01:00:01Z
rts: Give stack flags proper macros

These were previously quite unclear and will change a bit under the
non-moving collector, so let's clear this up now.
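
As a stand-alone illustration (not RTS code) of why named flag bits matter
here: the dirty word is now tested and set bitwise rather than compared
against 0/1, and the values mirror the STACK_DIRTY/STACK_SANE definitions
added to TSO.h below:

```
#include <stdio.h>

#define STACK_DIRTY 1    /* mirrors includes/rts/storage/TSO.h below */
#define STACK_SANE  64   /* scratch bit reserved for the sanity checker */

int main(void)
{
    unsigned dirty = 0;

    /* The checks in PrimOps.cmm/Threads.c now test the bit rather than
     * comparing the whole word against zero: */
    if ((dirty & STACK_DIRTY) == 0) {
        dirty |= STACK_DIRTY;     /* would also record the stack as mutated */
    }

    /* The sanity checker can set and clear its own bit without disturbing
     * the dirty bit: */
    dirty |= STACK_SANE;
    printf("dirty=%u sane=%u\n", dirty & STACK_DIRTY,
           (dirty & STACK_SANE) ? 1u : 0u);
    dirty &= ~STACK_SANE;
    printf("dirty=%u sane=%u\n", dirty & STACK_DIRTY,
           (dirty & STACK_SANE) ? 1u : 0u);

    return 0;
}
```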

- - - - -
5d8e2f59 by Ömer Sinan Ağacan at 2019-02-22T01:00:01Z
rts: Unglobalize dead_weak_ptr_list and resurrected_threads

In the concurrent nonmoving collector we will need the ability to call
`traverseWeakPtrList` concurrently with minor generation collections.
This global state stands in the way of that. However, refactoring it
away is straightforward since these lists only persist for the length of
a single GC.
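
The shape of that refactoring can be sketched stand-alone (the struct and
function names here are illustrative, not the RTS types): state that was a
file-level global becomes a local in the GC entry point and is threaded
through as an out-parameter, just as traverseWeakPtrList now takes
dead_weak_ptr_list and resurrected_threads by pointer in the diff below.

```
#include <stddef.h>

struct weak { struct weak *link; };

/* Before: a file-level global accumulated across helper calls.
 *     static struct weak *dead_weak_ptr_list;
 * After: the caller owns the list and passes it down explicitly. */
static void collect_dead(struct weak *candidates, struct weak **dead_list)
{
    for (struct weak *w = candidates; w != NULL; ) {
        struct weak *next = w->link;
        w->link = *dead_list;   /* prepend onto the caller-owned list */
        *dead_list = w;
        w = next;
    }
}

static void gc_cycle(struct weak *candidates)
{
    struct weak *dead_weak_ptr_list = NULL;   /* lives only for this GC */
    collect_dead(candidates, &dead_weak_ptr_list);
    /* ... schedule finalizers for everything on dead_weak_ptr_list ... */
    (void)dead_weak_ptr_list;
}

int main(void)
{
    struct weak w2 = { NULL }, w1 = { &w2 };
    gc_cycle(&w1);
    return 0;
}
```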

- - - - -
213c28a7 by Ömer Sinan Ağacan at 2019-02-22T01:00:01Z
rts/Printer: Print forwarding pointers

- - - - -
b97b3508 by Ben Gamari at 2019-02-22T01:00:01Z
rts/GC: Refactor gcCAFs

- - - - -
edec78fc by Ben Gamari at 2019-02-22T01:00:02Z
Merge branches 'wip/gc/sync-without-capability', 'wip/gc/factor-out-bitmap-walking', 'wip/gc/aligned-block-allocation', 'wip/gc/misc-rts', 'wip/gc/printer-improvements' and 'wip/gc/unglobalize-gc-state' into wip/gc/preparation

- - - - -
48504bf5 by Ben Gamari at 2019-05-16T01:46:28Z
rts: Fix macro parenthesisation

- - - - -
a2b74bc7 by Ben Gamari at 2019-05-16T16:31:34Z
Merge branch 'wip/gc/misc-rts' into wip/gc/preparation

- - - - -


27 changed files:

- + .gitlab/merge_request_templates/merge-request.md
- + CODEOWNERS
- compiler/basicTypes/MkId.hs
- includes/rts/storage/Block.h
- includes/rts/storage/GC.h
- includes/rts/storage/InfoTables.h
- includes/rts/storage/TSO.h
- rts/Capability.c
- rts/Linker.c
- rts/PrimOps.cmm
- rts/Printer.c
- rts/Printer.h
- rts/Schedule.c
- rts/Schedule.h
- rts/Threads.c
- rts/sm/BlockAlloc.c
- rts/sm/Compact.c
- rts/sm/Compact.h
- rts/sm/Evac.c
- rts/sm/GC.c
- + rts/sm/HeapUtils.h
- rts/sm/MarkWeak.c
- rts/sm/MarkWeak.h
- rts/sm/Sanity.c
- rts/sm/Scav.c
- rts/sm/Storage.c
- utils/deriveConstants/Main.hs


Changes:

=====================================
.gitlab/merge_request_templates/merge-request.md
=====================================
@@ -0,0 +1,19 @@
+Thank you for your contribution to GHC!
+
+Please take a few moments to verify that your commits fulfill the following:
+
+ * [ ] are either individually buildable or squashed
+ * [ ] have commit messages which describe *what they do*
+   (referring to [Notes][notes] and tickets using `#NNNN` syntax when
+   appropriate)
+ * [ ] have added source comments describing your change. For larger changes you
+   likely should add a [Note][notes] and cross-reference it from the relevant
+   places.
+ * [ ] add a [testcase to the
+   testsuite](https://ghc.haskell.org/trac/ghc/wiki/Building/RunningTests/Adding).
+
+If you have any questions don't hesitate to open your merge request and inquire
+in a comment. If your patch isn't quite done yet, please prefix your MR
+title with `WIP:`.
+
+[notes]: https://ghc.haskell.org/trac/ghc/wiki/Commentary/CodingStyle#Commentsinthesourcecode


=====================================
CODEOWNERS
=====================================
@@ -0,0 +1,22 @@
+# Confused about what this is? See
+# https://gitlab.haskell.org/help/user/project/code_owners
+
+# Catch-all
+* @bgamari
+
+# Build system
+/hadrian                         @snowleopard @alp @DavidEichmann
+
+# RTS-like things
+/rts                             @bgamari @simonmar @osa1 @Phyx
+/includes                        @bgamari @simonmar @osa1
+
+# The compiler
+/compiler/typecheck              @simonpj @goldfire
+/compiler/rename                 @simonpj @goldfire
+/compiler/typecheck/TcDeriv*     @RyanGlScott
+/compiler/nativeGen              @simonmar @bgamari @AndreasK
+
+# Core libraries
+/libraries/base                  @hvr
+/libraries/template-haskell      @goldfire


=====================================
compiler/basicTypes/MkId.hs
=====================================
@@ -425,26 +425,26 @@ dictSelRule val_index n_ty_args _ id_unf _ args
 mkDataConWorkId :: Name -> DataCon -> Id
 mkDataConWorkId wkr_name data_con
   | isNewTyCon tycon
-  = mkGlobalId (DataConWrapId data_con) wkr_name nt_wrap_ty nt_work_info
+  = mkGlobalId (DataConWrapId data_con) wkr_name wkr_ty nt_work_info
   | otherwise
-  = mkGlobalId (DataConWorkId data_con) wkr_name alg_wkr_ty wkr_info
+  = mkGlobalId (DataConWorkId data_con) wkr_name wkr_ty alg_wkr_info
 
   where
-    tycon = dataConTyCon data_con
+    tycon  = dataConTyCon data_con  -- The representation TyCon
+    wkr_ty = dataConRepType data_con
 
         ----------- Workers for data types --------------
-    alg_wkr_ty = dataConRepType data_con
+    alg_wkr_info = noCafIdInfo
+                   `setArityInfo`          wkr_arity
+                   `setStrictnessInfo`     wkr_sig
+                   `setUnfoldingInfo`      evaldUnfolding  -- Record that it's evaluated,
+                                                           -- even if arity = 0
+                   `setLevityInfoWithType` wkr_ty
+                     -- NB: unboxed tuples have workers, so we can't use
+                     -- setNeverLevPoly
+
     wkr_arity = dataConRepArity data_con
-    wkr_info  = noCafIdInfo
-                `setArityInfo`          wkr_arity
-                `setStrictnessInfo`     wkr_sig
-                `setUnfoldingInfo`      evaldUnfolding  -- Record that it's evaluated,
-                                                        -- even if arity = 0
-                `setLevityInfoWithType` alg_wkr_ty
-                  -- NB: unboxed tuples have workers, so we can't use
-                  -- setNeverLevPoly
-
-    wkr_sig = mkClosedStrictSig (replicate wkr_arity topDmd) (dataConCPR data_con)
+    wkr_sig   = mkClosedStrictSig (replicate wkr_arity topDmd) (dataConCPR data_con)
         --      Note [Data-con worker strictness]
         -- Notice that we do *not* say the worker Id is strict
         -- even if the data constructor is declared strict
@@ -465,20 +465,21 @@ mkDataConWorkId wkr_name data_con
         -- not from the worker Id.
 
         ----------- Workers for newtypes --------------
-    (nt_tvs, _, nt_arg_tys, _) = dataConSig data_con
-    res_ty_args  = mkTyCoVarTys nt_tvs
-    nt_wrap_ty   = dataConUserType data_con
+    univ_tvs = dataConUnivTyVars data_con
+    arg_tys  = dataConRepArgTys  data_con  -- Should be same as dataConOrigArgTys
     nt_work_info = noCafIdInfo          -- The NoCaf-ness is set by noCafIdInfo
                   `setArityInfo` 1      -- Arity 1
                   `setInlinePragInfo`     alwaysInlinePragma
                   `setUnfoldingInfo`      newtype_unf
-                  `setLevityInfoWithType` nt_wrap_ty
-    id_arg1      = mkTemplateLocal 1 (head nt_arg_tys)
+                  `setLevityInfoWithType` wkr_ty
+    id_arg1      = mkTemplateLocal 1 (head arg_tys)
+    res_ty_args  = mkTyCoVarTys univ_tvs
     newtype_unf  = ASSERT2( isVanillaDataCon data_con &&
-                            isSingleton nt_arg_tys, ppr data_con  )
+                            isSingleton arg_tys
+                          , ppr data_con  )
                               -- Note [Newtype datacons]
                    mkCompulsoryUnfolding $
-                   mkLams nt_tvs $ Lam id_arg1 $
+                   mkLams univ_tvs $ Lam id_arg1 $
                    wrapNewTypeBody tycon res_ty_args (Var id_arg1)
 
 dataConCPR :: DataCon -> DmdResult


=====================================
includes/rts/storage/Block.h
=====================================
@@ -290,6 +290,13 @@ EXTERN_INLINE bdescr* allocBlock(void)
 
 bdescr *allocGroupOnNode(uint32_t node, W_ n);
 
+// Allocate n blocks, aligned at n-block boundary. The returned bdescr will
+// have this invariant
+//
+//     bdescr->start % BLOCK_SIZE*n == 0
+//
+bdescr *allocAlignedGroupOnNode(uint32_t node, W_ n);
+
 EXTERN_INLINE bdescr* allocBlockOnNode(uint32_t node);
 EXTERN_INLINE bdescr* allocBlockOnNode(uint32_t node)
 {


=====================================
includes/rts/storage/GC.h
=====================================
@@ -240,9 +240,14 @@ void dirty_MUT_VAR(StgRegTable *reg, StgClosure *p);
 /* (needed when dynamic libraries are used). */
 extern bool keepCAFs;
 
+#include "rts/Flags.h"
+
 INLINE_HEADER void initBdescr(bdescr *bd, generation *gen, generation *dest)
 {
     bd->gen     = gen;
     bd->gen_no  = gen->no;
     bd->dest_no = dest->no;
+
+    ASSERT(gen->no < RtsFlags.GcFlags.generations);
+    ASSERT(dest->no < RtsFlags.GcFlags.generations);
 }


=====================================
includes/rts/storage/InfoTables.h
=====================================
@@ -355,7 +355,7 @@ typedef struct StgConInfoTable_ {
  */
 #if defined(TABLES_NEXT_TO_CODE)
 #define GET_CON_DESC(info) \
-            ((const char *)((StgWord)((info)+1) + (info->con_desc)))
+            ((const char *)((StgWord)((info)+1) + ((info)->con_desc)))
 #else
 #define GET_CON_DESC(info) ((const char *)(info)->con_desc)
 #endif


=====================================
includes/rts/storage/TSO.h
=====================================
@@ -185,6 +185,11 @@ typedef struct StgTSO_ {
 
 } *StgTSOPtr; // StgTSO defined in rts/Types.h
 
+
+#define STACK_DIRTY 1
+// used by sanity checker to verify that all dirty stacks are on the mutable list
+#define STACK_SANE 64
+
 typedef struct StgStack_ {
     StgHeader  header;
     StgWord32  stack_size;     // stack size in *words*


=====================================
rts/Capability.c
=====================================
@@ -748,6 +748,8 @@ static Capability * waitForReturnCapability (Task *task)
  * result of the external call back to the Haskell thread that
  * made it.
  *
+ * pCap is strictly an output.
+ *
  * ------------------------------------------------------------------------- */
 
 void waitForCapability (Capability **pCap, Task *task)
@@ -840,6 +842,9 @@ void waitForCapability (Capability **pCap, Task *task)
  *      SYNC_GC_PAR), either to do a sequential GC, forkProcess, or
  *      setNumCapabilities.  We should give up the Capability temporarily.
  *
+ * When yieldCapability returns *pCap will have been updated to the new
+ * capability held by the caller.
+ *
  * ------------------------------------------------------------------------- */
 
 #if defined (THREADED_RTS)


=====================================
rts/Linker.c
=====================================
@@ -483,7 +483,7 @@ initLinker_ (int retain_cafs)
 #   endif /* RTLD_DEFAULT */
 
     compileResult = regcomp(&re_invalid,
-           "(([^ \t()])+\\.so([^ \t:()])*):([ \t])*(invalid ELF header|file too short|invalid file format)",
+           "(([^ \t()])+\\.so([^ \t:()])*):([ \t])*(invalid ELF header|file too short|invalid file format|Exec format error)",
            REG_EXTENDED);
     if (compileResult != 0) {
         barf("Compiling re_invalid failed");


=====================================
rts/PrimOps.cmm
=====================================
@@ -1721,7 +1721,7 @@ loop:
     // indicate that the MVar operation has now completed.
     StgTSO__link(tso) = stg_END_TSO_QUEUE_closure;
 
-    if (TO_W_(StgStack_dirty(stack)) == 0) {
+    if ((TO_W_(StgStack_dirty(stack)) & STACK_DIRTY) == 0) {
         ccall dirty_STACK(MyCapability() "ptr", stack "ptr");
     }
 
@@ -1802,7 +1802,7 @@ loop:
     // indicate that the MVar operation has now completed.
     StgTSO__link(tso) = stg_END_TSO_QUEUE_closure;
 
-    if (TO_W_(StgStack_dirty(stack)) == 0) {
+    if ((TO_W_(StgStack_dirty(stack)) & STACK_DIRTY) == 0) {
         ccall dirty_STACK(MyCapability() "ptr", stack "ptr");
     }
 


=====================================
rts/Printer.c
=====================================
@@ -111,10 +111,15 @@ printThunkObject( StgThunk *obj, char* tag )
 void
 printClosure( const StgClosure *obj )
 {
-    const StgInfoTable *info;
-
+    debugBelch("%p: ", obj);
     obj = UNTAG_CONST_CLOSURE(obj);
-    info = get_itbl(obj);
+    const StgInfoTable* info = get_itbl(obj);
+
+    while (IS_FORWARDING_PTR(info)) {
+        obj = (StgClosure*)UN_FORWARDING_PTR(obj);
+        debugBelch("(forwarding to %p) ", (void*)obj);
+        info = get_itbl(obj);
+    }
 
     switch ( info->type ) {
     case INVALID_OBJECT:
@@ -646,6 +651,81 @@ void printTSO( StgTSO *tso )
     printStack( tso->stackobj );
 }
 
+void printStaticObjects( StgClosure *p )
+{
+    while (p != END_OF_STATIC_OBJECT_LIST) {
+        p = UNTAG_STATIC_LIST_PTR(p);
+        printClosure(p);
+
+        const StgInfoTable *info = get_itbl(p);
+        p = *STATIC_LINK(info, p);
+    }
+}
+
+void printWeakLists()
+{
+    debugBelch("======= WEAK LISTS =======\n");
+
+    for (uint32_t cap_idx = 0; cap_idx < n_capabilities; ++cap_idx) {
+        debugBelch("Capability %d:\n", cap_idx);
+        Capability *cap = capabilities[cap_idx];
+        for (StgWeak *weak = cap->weak_ptr_list_hd; weak; weak = weak->link) {
+            printClosure((StgClosure*)weak);
+        }
+    }
+
+    for (uint32_t gen_idx = 0; gen_idx <= oldest_gen->no; ++gen_idx) {
+        generation *gen = &generations[gen_idx];
+        debugBelch("Generation %d current weaks:\n", gen_idx);
+        for (StgWeak *weak = gen->weak_ptr_list; weak; weak = weak->link) {
+            printClosure((StgClosure*)weak);
+        }
+        debugBelch("Generation %d old weaks:\n", gen_idx);
+        for (StgWeak *weak = gen->old_weak_ptr_list; weak; weak = weak->link) {
+            printClosure((StgClosure*)weak);
+        }
+    }
+
+    debugBelch("=========================\n");
+}
+
+void printLargeAndPinnedObjects()
+{
+    debugBelch("====== PINNED OBJECTS ======\n");
+
+    for (uint32_t cap_idx = 0; cap_idx < n_capabilities; ++cap_idx) {
+        debugBelch("Capability %d:\n", cap_idx);
+        Capability *cap = capabilities[cap_idx];
+
+        debugBelch("Current pinned object block: %p\n", (void*)cap->pinned_object_block);
+        // just to check if my understanding is correct
+        // 4/6/2018: assertion fails
+        // ASSERT(cap->pinned_object_block == NULL || cap->pinned_object_block->link == NULL);
+
+        for (bdescr *bd = cap->pinned_object_blocks; bd; bd = bd->link) {
+            debugBelch("%p\n", (void*)bd);
+        }
+    }
+
+    debugBelch("====== LARGE OBJECTS =======\n");
+    for (uint32_t gen_idx = 0; gen_idx <= oldest_gen->no; ++gen_idx) {
+        generation *gen = &generations[gen_idx];
+        debugBelch("Generation %d current large objects:\n", gen_idx);
+        for (bdescr *bd = gen->large_objects; bd; bd = bd->link) {
+            debugBelch("%p: ", (void*)bd);
+            printClosure((StgClosure*)bd->start);
+        }
+
+        debugBelch("Generation %d scavenged large objects:\n", gen_idx);
+        for (bdescr *bd = gen->scavenged_large_objects; bd; bd = bd->link) {
+            debugBelch("%p: ", (void*)bd);
+            printClosure((StgClosure*)bd->start);
+        }
+    }
+
+    debugBelch("============================\n");
+}
+
 /* --------------------------------------------------------------------------
  * Address printing code
  *


=====================================
rts/Printer.h
=====================================
@@ -25,6 +25,9 @@ extern void        printClosure    ( const StgClosure *obj );
 extern void        printStackChunk ( StgPtr sp, StgPtr spLim );
 extern void        printTSO        ( StgTSO *tso );
 extern void        printMutableList( bdescr *bd );
+extern void        printStaticObjects ( StgClosure *obj );
+extern void        printWeakLists ( void );
+extern void        printLargeAndPinnedObjects ( void );
 
 extern void DEBUG_LoadSymbols( const char *name );
 


=====================================
rts/Schedule.c
=====================================
@@ -110,6 +110,19 @@ Mutex sched_mutex;
 #define FORKPROCESS_PRIMOP_SUPPORTED
 #endif
 
+/*
+ * sync_finished_cond allows threads which do not own any capability (e.g. the
+ * concurrent mark thread) to participate in the sync protocol. In particular,
+ * if such a thread requests a sync while sync is already in progress it will
+ * block on sync_finished_cond, which will be signalled when the sync is
+ * finished (by releaseAllCapabilities).
+ */
+#if defined(THREADED_RTS)
+static Condition sync_finished_cond;
+static Mutex sync_finished_mutex;
+#endif
+
+
 /* -----------------------------------------------------------------------------
  * static function prototypes
  * -------------------------------------------------------------------------- */
@@ -130,7 +143,6 @@ static void scheduleYield (Capability **pcap, Task *task);
 static bool requestSync (Capability **pcap, Task *task,
                          PendingSync *sync_type, SyncType *prev_sync_type);
 static void acquireAllCapabilities(Capability *cap, Task *task);
-static void releaseAllCapabilities(uint32_t n, Capability *cap, Task *task);
 static void startWorkerTasks (uint32_t from USED_IF_THREADS,
                               uint32_t to USED_IF_THREADS);
 #endif
@@ -1368,17 +1380,24 @@ scheduleNeedHeapProfile( bool ready_to_gc )
  * change to the system, such as altering the number of capabilities, or
  * forking.
  *
+ * pCap may be NULL in the event that the caller doesn't yet own a capability.
+ *
  * To resume after stopAllCapabilities(), use releaseAllCapabilities().
  * -------------------------------------------------------------------------- */
 
 #if defined(THREADED_RTS)
-static void stopAllCapabilities (Capability **pCap, Task *task)
+void stopAllCapabilities (Capability **pCap, Task *task)
+{
+    stopAllCapabilitiesWith(pCap, task, SYNC_OTHER);
+}
+
+void stopAllCapabilitiesWith (Capability **pCap, Task *task, SyncType sync_type)
 {
     bool was_syncing;
     SyncType prev_sync_type;
 
     PendingSync sync = {
-        .type = SYNC_OTHER,
+        .type = sync_type,
         .idle = NULL,
         .task = task
     };
@@ -1387,9 +1406,10 @@ static void stopAllCapabilities (Capability **pCap, Task *task)
         was_syncing = requestSync(pCap, task, &sync, &prev_sync_type);
     } while (was_syncing);
 
-    acquireAllCapabilities(*pCap,task);
+    acquireAllCapabilities(pCap ? *pCap : NULL, task);
 
     pending_sync = 0;
+    signalCondition(&sync_finished_cond);
 }
 #endif
 
@@ -1400,6 +1420,16 @@ static void stopAllCapabilities (Capability **pCap, Task *task)
  * directly, instead use stopAllCapabilities().  This is used by the GC, which
  * has some special synchronisation requirements.
  *
+ * Note that this can be called in two ways:
+ *
+ * - where *pcap points to a capability owned by the caller: in this case
+ *   *prev_sync_type will reflect the in-progress sync type on return, if one
+ *   was found
+ *
+ *  - where pcap == NULL: in this case the caller doesn't hold a capability.
+ *    we only return whether or not a pending sync was found and prev_sync_type
+ *    is unchanged.
+ *
  * Returns:
  *    false if we successfully got a sync
  *    true  if there was another sync request in progress,
@@ -1424,13 +1454,25 @@ static bool requestSync (
         // After the sync is completed, we cannot read that struct any
         // more because it has been freed.
         *prev_sync_type = sync->type;
-        do {
-            debugTrace(DEBUG_sched, "someone else is trying to sync (%d)...",
-                       sync->type);
-            ASSERT(*pcap);
-            yieldCapability(pcap,task,true);
-            sync = pending_sync;
-        } while (sync != NULL);
+        if (pcap == NULL) {
+            // The caller does not hold a capability (e.g. may be a concurrent
+            // mark thread). Consequently we must wait until the pending sync is
+            // finished before proceeding to ensure we don't loop.
+            // TODO: Don't busy-wait
+            ACQUIRE_LOCK(&sync_finished_mutex);
+            while (pending_sync) {
+                waitCondition(&sync_finished_cond, &sync_finished_mutex);
+            }
+            RELEASE_LOCK(&sync_finished_mutex);
+        } else {
+            do {
+                debugTrace(DEBUG_sched, "someone else is trying to sync (%d)...",
+                          sync->type);
+                ASSERT(*pcap);
+                yieldCapability(pcap,task,true);
+                sync = pending_sync;
+            } while (sync != NULL);
+        }
 
         // NOTE: task->cap might have changed now
         return true;
@@ -1445,9 +1487,9 @@ static bool requestSync (
 /* -----------------------------------------------------------------------------
  * acquireAllCapabilities()
  *
- * Grab all the capabilities except the one we already hold.  Used
- * when synchronising before a single-threaded GC (SYNC_SEQ_GC), and
- * before a fork (SYNC_OTHER).
+ * Grab all the capabilities except the one we already hold (cap may be NULL if
+ * the caller does not currently hold a capability). Used when synchronising
+ * before a single-threaded GC (SYNC_SEQ_GC), and before a fork (SYNC_OTHER).
  *
  * Only call this after requestSync(), otherwise a deadlock might
  * ensue if another thread is trying to synchronise.
@@ -1477,29 +1519,30 @@ static void acquireAllCapabilities(Capability *cap, Task *task)
             }
         }
     }
-    task->cap = cap;
+    task->cap = cap == NULL ? tmpcap : cap;
 }
 #endif
 
 /* -----------------------------------------------------------------------------
- * releaseAllcapabilities()
+ * releaseAllCapabilities()
  *
- * Assuming this thread holds all the capabilities, release them all except for
- * the one passed in as cap.
+ * Assuming this thread holds all the capabilities, release them all (except for
+ * the one passed in as keep_cap, if non-NULL).
  * -------------------------------------------------------------------------- */
 
 #if defined(THREADED_RTS)
-static void releaseAllCapabilities(uint32_t n, Capability *cap, Task *task)
+void releaseAllCapabilities(uint32_t n, Capability *keep_cap, Task *task)
 {
     uint32_t i;
 
     for (i = 0; i < n; i++) {
-        if (cap->no != i) {
-            task->cap = capabilities[i];
-            releaseCapability(capabilities[i]);
+        Capability *tmpcap = capabilities[i];
+        if (keep_cap != tmpcap) {
+            task->cap = tmpcap;
+            releaseCapability(tmpcap);
         }
     }
-    task->cap = cap;
+    task->cap = keep_cap;
 }
 #endif
 
@@ -1801,6 +1844,7 @@ delete_threads_and_gc:
     // reset pending_sync *before* GC, so that when the GC threads
     // emerge they don't immediately re-enter the GC.
     pending_sync = 0;
+    signalCondition(&sync_finished_cond);
     GarbageCollect(collect_gen, heap_census, gc_type, cap, idle_cap);
 #else
     GarbageCollect(collect_gen, heap_census, 0, cap, NULL);


=====================================
rts/Schedule.h
=====================================
@@ -49,6 +49,12 @@ StgWord findRetryFrameHelper (Capability *cap, StgTSO *tso);
 /* Entry point for a new worker */
 void scheduleWorker (Capability *cap, Task *task);
 
+#if defined(THREADED_RTS)
+void stopAllCapabilitiesWith (Capability **pCap, Task *task, SyncType sync_type);
+void stopAllCapabilities (Capability **pCap, Task *task);
+void releaseAllCapabilities(uint32_t n, Capability *keep_cap, Task *task);
+#endif
+
 /* The state of the scheduler.  This is used to control the sequence
  * of events during shutdown.  See Note [shutdown] in Schedule.c.
  */


=====================================
rts/Threads.c
=====================================
@@ -85,7 +85,7 @@ createThread(Capability *cap, W_ size)
     SET_HDR(stack, &stg_STACK_info, cap->r.rCCCS);
     stack->stack_size   = stack_size - sizeofW(StgStack);
     stack->sp           = stack->stack + stack->stack_size;
-    stack->dirty        = 1;
+    stack->dirty        = STACK_DIRTY;
 
     tso = (StgTSO *)allocate(cap, sizeofW(StgTSO));
     TICK_ALLOC_TSO();
@@ -788,7 +788,7 @@ loop:
     // indicate that the MVar operation has now completed.
     tso->_link = (StgTSO*)&stg_END_TSO_QUEUE_closure;
 
-    if (stack->dirty == 0) {
+    if ((stack->dirty & STACK_DIRTY) == 0) {
         dirty_STACK(cap, stack);
     }
 


=====================================
rts/sm/BlockAlloc.c
=====================================
@@ -310,7 +310,7 @@ setup_tail (bdescr *bd)
 // Take a free block group bd, and split off a group of size n from
 // it.  Adjust the free list as necessary, and return the new group.
 static bdescr *
-split_free_block (bdescr *bd, uint32_t node, W_ n, uint32_t ln)
+split_free_block (bdescr *bd, uint32_t node, W_ n, uint32_t ln /* log_2_ceil(n) */)
 {
     bdescr *fg; // free group
 
@@ -325,6 +325,46 @@ split_free_block (bdescr *bd, uint32_t node, W_ n, uint32_t ln)
     return fg;
 }
 
+// Take N blocks off the end, free the rest.
+static bdescr *
+split_block_high (bdescr *bd, W_ n)
+{
+    ASSERT(bd->blocks > n);
+
+    bdescr* ret = bd + bd->blocks - n; // take n blocks off the end
+    ret->blocks = n;
+    ret->start = ret->free = bd->start + (bd->blocks - n)*BLOCK_SIZE_W;
+    ret->link = NULL;
+
+    bd->blocks -= n;
+
+    setup_tail(ret);
+    setup_tail(bd);
+    freeGroup(bd);
+
+    return ret;
+}
+
+// Like `split_block_high`, but takes n blocks off the beginning rather
+// than the end.
+static bdescr *
+split_block_low (bdescr *bd, W_ n)
+{
+    ASSERT(bd->blocks > n);
+
+    bdescr* bd_ = bd + n;
+    bd_->blocks = bd->blocks - n;
+    bd_->start = bd_->free = bd->start + n*BLOCK_SIZE_W;
+
+    bd->blocks = n;
+
+    setup_tail(bd_);
+    setup_tail(bd);
+    freeGroup(bd_);
+
+    return bd;
+}
+
 /* Only initializes the start pointers on the first megablock and the
  * blocks field of the first bdescr; callers are responsible for calling
  * initGroup afterwards.
@@ -461,6 +501,75 @@ finish:
     return bd;
 }
 
+bdescr *
+allocAlignedGroupOnNode (uint32_t node, W_ n)
+{
+    // allocate enough blocks to have enough space aligned at n-block boundary
+    // free any slops on the low and high side of this space
+
+    // number of blocks to allocate to make sure we have enough aligned space
+    uint32_t num_blocks = 2*n - 1;
+    W_ group_size = n * BLOCK_SIZE;
+
+    bdescr *bd = allocGroupOnNode(node, num_blocks);
+
+    // slop on the low side
+    W_ slop_low = 0;
+    if ((uintptr_t)bd->start % group_size != 0) {
+        slop_low = group_size - ((uintptr_t)bd->start % group_size);
+    }
+
+    W_ slop_high = (bd->blocks*BLOCK_SIZE) - group_size - slop_low;
+
+    ASSERT((slop_low % BLOCK_SIZE) == 0);
+    ASSERT((slop_high % BLOCK_SIZE) == 0);
+
+    W_ slop_low_blocks = slop_low / BLOCK_SIZE;
+    W_ slop_high_blocks = slop_high / BLOCK_SIZE;
+
+    ASSERT(slop_low_blocks + slop_high_blocks + n == num_blocks);
+
+#ifdef DEBUG
+    checkFreeListSanity();
+    W_ free_before = countFreeList();
+#endif
+
+    if (slop_low_blocks != 0) {
+        bd = split_block_high(bd, num_blocks - slop_low_blocks);
+        ASSERT(countBlocks(bd) == num_blocks - slop_low_blocks);
+    }
+
+#ifdef DEBUG
+    ASSERT(countFreeList() == free_before + slop_low_blocks);
+    checkFreeListSanity();
+#endif
+
+    // At this point the bd should be aligned, but we may have slop on the high side
+    ASSERT((uintptr_t)bd->start % group_size == 0);
+
+#ifdef DEBUG
+    free_before = countFreeList();
+#endif
+
+    if (slop_high_blocks != 0) {
+        bd = split_block_low(bd, n);
+        ASSERT(countBlocks(bd) == n);
+    }
+
+#ifdef DEBUG
+    ASSERT(countFreeList() == free_before + slop_high_blocks);
+    checkFreeListSanity();
+#endif
+
+    // Should still be aligned
+    ASSERT((uintptr_t)bd->start % group_size == 0);
+
+    // Just to make sure I get this right
+    ASSERT(Bdescr(bd->start) == bd);
+
+    return bd;
+}
+
 STATIC_INLINE
 uint32_t nodeWithLeastBlocks (void)
 {


=====================================
rts/sm/Compact.c
=====================================
@@ -940,7 +940,7 @@ update_bkwd_compact( generation *gen )
 }
 
 void
-compact(StgClosure *static_objects)
+compact(StgClosure *static_objects, StgWeak *dead_weak_ptr_list, StgTSO *resurrected_threads)
 {
     W_ n, g, blocks;
     generation *gen;


=====================================
rts/sm/Compact.h
=====================================
@@ -45,6 +45,8 @@ is_marked(StgPtr p, bdescr *bd)
     return (*bitmap_word & bit_mask);
 }
 
-void compact (StgClosure *static_objects);
+void compact (StgClosure *static_objects,
+              StgWeak *dead_weak_ptr_list,
+              StgTSO *resurrected_threads);
 
 #include "EndPrivate.h"


=====================================
rts/sm/Evac.c
=====================================
@@ -39,7 +39,19 @@
         copy_tag(p, info, src, size, stp, tag)
 #endif
 
-/* Used to avoid long recursion due to selector thunks
+/* Note [Selector optimisation depth limit]
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * MAX_THUNK_SELECTOR_DEPTH is used to avoid long recursion of
+ * eval_thunk_selector due to nested selector thunks. Note that this *only*
+ * counts nested selector thunks, e.g. `fst (fst (... (fst x)))`. The collector
+ * will traverse interleaved selector-constructor pairs without limit, e.g.
+ *
+ *     a = (fst b, _)
+ *     b = (fst c, _)
+ *     c = (fst d, _)
+ *     d = (x, _)
+ *
  */
 #define MAX_THUNK_SELECTOR_DEPTH 16
 
@@ -1252,6 +1264,7 @@ selector_loop:
 
           // recursively evaluate this selector.  We don't want to
           // recurse indefinitely, so we impose a depth bound.
+          // See Note [Selector optimisation depth limit].
           if (gct->thunk_selector_depth >= MAX_THUNK_SELECTOR_DEPTH) {
               goto bale_out;
           }


=====================================
rts/sm/GC.c
=====================================
@@ -416,15 +416,20 @@ GarbageCollect (uint32_t collect_gen,
    * Repeatedly scavenge all the areas we know about until there's no
    * more scavenging to be done.
    */
+
+  StgWeak *dead_weak_ptr_list = NULL;
+  StgTSO *resurrected_threads = END_TSO_QUEUE;
+
   for (;;)
   {
       scavenge_until_all_done();
+
       // The other threads are now stopped.  We might recurse back to
       // here, but from now on this is the only thread.
 
       // must be last...  invariant is that everything is fully
       // scavenged at this point.
-      if (traverseWeakPtrList()) { // returns true if evaced something
+      if (traverseWeakPtrList(&dead_weak_ptr_list, &resurrected_threads)) { // returns true if evaced something
           inc_running();
           continue;
       }
@@ -468,7 +473,7 @@ GarbageCollect (uint32_t collect_gen,
   // Finally: compact or sweep the oldest generation.
   if (major_gc && oldest_gen->mark) {
       if (oldest_gen->compact)
-          compact(gct->scavenged_static_objects);
+          compact(gct->scavenged_static_objects, dead_weak_ptr_list, resurrected_threads);
       else
           sweep(oldest_gen);
   }
@@ -1836,21 +1841,16 @@ resize_nursery (void)
 
 #if defined(DEBUG)
 
-static void gcCAFs(void)
+void gcCAFs(void)
 {
-    StgIndStatic *p, *prev;
-
-    const StgInfoTable *info;
-    uint32_t i;
-
-    i = 0;
-    p = debug_caf_list;
-    prev = NULL;
+    uint32_t i = 0;
+    StgIndStatic *prev = NULL;
 
-    for (p = debug_caf_list; p != (StgIndStatic*)END_OF_CAF_LIST;
-         p = (StgIndStatic*)p->saved_info) {
-
-        info = get_itbl((StgClosure*)p);
+    for (StgIndStatic *p = debug_caf_list;
+         p != (StgIndStatic*) END_OF_CAF_LIST;
+         p = (StgIndStatic*) p->saved_info)
+    {
+        const StgInfoTable *info = get_itbl((StgClosure*)p);
         ASSERT(info->type == IND_STATIC);
 
         // See Note [STATIC_LINK fields] in Storage.h


=====================================
rts/sm/HeapUtils.h
=====================================
@@ -0,0 +1,33 @@
+/* -----------------------------------------------------------------------------
+ *
+ * (c) The GHC Team 1998-2008
+ *
+ * General utilities for walking the heap
+ *
+ * ---------------------------------------------------------------------------*/
+
+#pragma once
+
+typedef void (walk_closures_cb)(StgClosure **, void *);
+
+INLINE_HEADER void
+walk_large_bitmap(walk_closures_cb *cb,
+                  StgClosure **p,
+                  StgLargeBitmap *large_bitmap,
+                  StgWord size,
+                  void *user)
+{
+    uint32_t b = 0;
+
+    for (uint32_t i = 0; i < size; b++) {
+        StgWord bitmap = large_bitmap->bitmap[b];
+        uint32_t j = stg_min(size-i, BITS_IN(W_));
+        i += j;
+        for (; j > 0; j--, p++) {
+            if ((bitmap & 1) == 0) {
+                cb(p, user);
+            }
+            bitmap = bitmap >> 1;
+        }
+    }
+}


=====================================
rts/sm/MarkWeak.c
=====================================
@@ -77,15 +77,9 @@
 typedef enum { WeakPtrs, WeakThreads, WeakDone } WeakStage;
 static WeakStage weak_stage;
 
-// List of weak pointers whose key is dead
-StgWeak *dead_weak_ptr_list;
-
-// List of threads found to be unreachable
-StgTSO *resurrected_threads;
-
-static void    collectDeadWeakPtrs (generation *gen);
+static void    collectDeadWeakPtrs (generation *gen, StgWeak **dead_weak_ptr_list);
 static bool tidyWeakList (generation *gen);
-static bool resurrectUnreachableThreads (generation *gen);
+static bool resurrectUnreachableThreads (generation *gen, StgTSO **resurrected_threads);
 static void    tidyThreadList (generation *gen);
 
 void
@@ -100,12 +94,10 @@ initWeakForGC(void)
     }
 
     weak_stage = WeakThreads;
-    dead_weak_ptr_list = NULL;
-    resurrected_threads = END_TSO_QUEUE;
 }
 
 bool
-traverseWeakPtrList(void)
+traverseWeakPtrList(StgWeak **dead_weak_ptr_list, StgTSO **resurrected_threads)
 {
   bool flag = false;
 
@@ -140,7 +132,7 @@ traverseWeakPtrList(void)
 
       // Resurrect any threads which were unreachable
       for (g = 0; g <= N; g++) {
-          if (resurrectUnreachableThreads(&generations[g])) {
+          if (resurrectUnreachableThreads(&generations[g], resurrected_threads)) {
               flag = true;
           }
       }
@@ -175,7 +167,7 @@ traverseWeakPtrList(void)
        */
       if (flag == false) {
           for (g = 0; g <= N; g++) {
-              collectDeadWeakPtrs(&generations[g]);
+              collectDeadWeakPtrs(&generations[g], dead_weak_ptr_list);
           }
 
           weak_stage = WeakDone;  // *now* we're done,
@@ -190,7 +182,7 @@ traverseWeakPtrList(void)
   }
 }
 
-static void collectDeadWeakPtrs (generation *gen)
+static void collectDeadWeakPtrs (generation *gen, StgWeak **dead_weak_ptr_list)
 {
     StgWeak *w, *next_w;
     for (w = gen->old_weak_ptr_list; w != NULL; w = next_w) {
@@ -201,12 +193,12 @@ static void collectDeadWeakPtrs (generation *gen)
         }
         evacuate(&w->finalizer);
         next_w = w->link;
-        w->link = dead_weak_ptr_list;
-        dead_weak_ptr_list = w;
+        w->link = *dead_weak_ptr_list;
+        *dead_weak_ptr_list = w;
     }
 }
 
-static bool resurrectUnreachableThreads (generation *gen)
+static bool resurrectUnreachableThreads (generation *gen, StgTSO **resurrected_threads)
 {
     StgTSO *t, *tmp, *next;
     bool flag = false;
@@ -225,8 +217,8 @@ static bool resurrectUnreachableThreads (generation *gen)
         default:
             tmp = t;
             evacuate((StgClosure **)&tmp);
-            tmp->global_link = resurrected_threads;
-            resurrected_threads = tmp;
+            tmp->global_link = *resurrected_threads;
+            *resurrected_threads = tmp;
             flag = true;
         }
     }


=====================================
rts/sm/MarkWeak.h
=====================================
@@ -19,7 +19,7 @@ extern StgTSO *resurrected_threads;
 
 void    collectFreshWeakPtrs   ( void );
 void    initWeakForGC          ( void );
-bool    traverseWeakPtrList    ( void );
+bool    traverseWeakPtrList    ( StgWeak **dead_weak_ptr_list, StgTSO **resurrected_threads );
 void    markWeakPtrList        ( void );
 void    scavengeLiveWeak       ( StgWeak * );
 


=====================================
rts/sm/Sanity.c
=====================================
@@ -619,9 +619,9 @@ checkGlobalTSOList (bool checkTSOs)
 
               stack = tso->stackobj;
               while (1) {
-                  if (stack->dirty & 1) {
-                      ASSERT(Bdescr((P_)stack)->gen_no == 0 || (stack->dirty & TSO_MARKED));
-                      stack->dirty &= ~TSO_MARKED;
+                  if (stack->dirty & STACK_DIRTY) {
+                      ASSERT(Bdescr((P_)stack)->gen_no == 0 || (stack->dirty & STACK_SANE));
+                      stack->dirty &= ~STACK_SANE;
                   }
                   frame = (StgUnderflowFrame*) (stack->stack + stack->stack_size
                                                 - sizeofW(StgUnderflowFrame));
@@ -656,7 +656,7 @@ checkMutableList( bdescr *mut_bd, uint32_t gen )
                 ((StgTSO *)p)->flags |= TSO_MARKED;
                 break;
             case STACK:
-                ((StgStack *)p)->dirty |= TSO_MARKED;
+                ((StgStack *)p)->dirty |= STACK_SANE;
                 break;
             }
         }


=====================================
rts/sm/Scav.c
=====================================
@@ -58,6 +58,7 @@
 #include "Sanity.h"
 #include "Capability.h"
 #include "LdvProfile.h"
+#include "HeapUtils.h"
 #include "Hash.h"
 
 #include "sm/MarkWeak.h"
@@ -77,6 +78,11 @@ static void scavenge_large_bitmap (StgPtr p,
 # define scavenge_capability_mut_lists(cap) scavenge_capability_mut_Lists1(cap)
 #endif
 
+static void do_evacuate(StgClosure **p, void *user STG_UNUSED)
+{
+    evacuate(p);
+}
+
 /* -----------------------------------------------------------------------------
    Scavenge a TSO.
    -------------------------------------------------------------------------- */
@@ -1777,22 +1783,7 @@ scavenge_static(void)
 static void
 scavenge_large_bitmap( StgPtr p, StgLargeBitmap *large_bitmap, StgWord size )
 {
-    uint32_t i, j, b;
-    StgWord bitmap;
-
-    b = 0;
-
-    for (i = 0; i < size; b++) {
-        bitmap = large_bitmap->bitmap[b];
-        j = stg_min(size-i, BITS_IN(W_));
-        i += j;
-        for (; j > 0; j--, p++) {
-            if ((bitmap & 1) == 0) {
-                evacuate((StgClosure **)p);
-            }
-            bitmap = bitmap >> 1;
-        }
-    }
+    walk_large_bitmap(do_evacuate, (StgClosure **) p, large_bitmap, size, NULL);
 }
 
 


=====================================
rts/sm/Storage.c
=====================================
@@ -1133,8 +1133,8 @@ dirty_TSO (Capability *cap, StgTSO *tso)
 void
 dirty_STACK (Capability *cap, StgStack *stack)
 {
-    if (stack->dirty == 0) {
-        stack->dirty = 1;
+    if (! (stack->dirty & STACK_DIRTY)) {
+        stack->dirty = STACK_DIRTY;
         recordClosureMutated(cap,(StgClosure*)stack);
     }
 }


=====================================
utils/deriveConstants/Main.hs
=====================================
@@ -307,6 +307,9 @@ wanteds os = concat
                              "sizeofW(StgHeader) - sizeofW(StgProfHeader)"
           ,constantWord Both "PROF_HDR_SIZE" "sizeofW(StgProfHeader)"
 
+           -- Stack flags for C--
+          ,constantWord C "STACK_DIRTY" "STACK_DIRTY"
+
            -- Size of a storage manager block (in bytes).
           ,constantWord Both "BLOCK_SIZE"  "BLOCK_SIZE"
           ,constantWord C    "MBLOCK_SIZE" "MBLOCK_SIZE"



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/412abe8b6eb1b17af9c0db503ab7a4ed74064faa...a2b74bc7f839a8dcca0b50cf9d08d9fcde69cde2
