[Git][ghc/ghc][wip/andreask/bytecode_tagging] GHCi interpreter: Tag constructor closures when possible.

Andreas Klebinger (@AndreasK) gitlab at gitlab.haskell.org
Wed Jun 5 21:33:07 UTC 2024



Andreas Klebinger pushed to branch wip/andreask/bytecode_tagging at Glasgow Haskell Compiler / GHC


Commits:
3c487956 by Andreas Klebinger at 2024-06-05T23:32:30+02:00
GHCi interpreter: Tag constructor closures when possible.

When evaluating PUSH_G try to tag the reference we are pushing if it's a
constructor or function. This is potentially helpful for performance and required to
fix #24870.

- - - - -


7 changed files:

- compiler/GHC/ByteCode/Instr.hs
- rts/Interpreter.c
- + testsuite/tests/th/should_compile/T24870/Def.hs
- + testsuite/tests/th/should_compile/T24870/T24870.stderr
- + testsuite/tests/th/should_compile/T24870/T24870.stderr-mingw32
- + testsuite/tests/th/should_compile/T24870/Use.hs
- + testsuite/tests/th/should_compile/T24870/all.T


Changes:

=====================================
compiler/GHC/ByteCode/Instr.hs
=====================================
@@ -83,7 +83,7 @@ data BCInstr
    | PUSH16_W !ByteOff
    | PUSH32_W !ByteOff
 
-   -- Push a ptr  (these all map to PUSH_G really)
+   -- Push a (heap) ptr  (these all map to PUSH_G really)
    | PUSH_G       Name
    | PUSH_PRIMOP  PrimOp
    | PUSH_BCO     (ProtoBCO Name)


=====================================
rts/Interpreter.c
=====================================
@@ -4,6 +4,30 @@
  * Copyright (c) The GHC Team, 1994-2002.
  * ---------------------------------------------------------------------------*/
 
+/*
+Note [CBV Functions and the interpreter]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When the byte code interpreter loads a reference to a value it often
+ends up as a non-tagged pointers *especially* if we already know a value
+is a certain constructor and therefore don't perform an eval on the reference.
+This causes friction with CBV functions which assume
+their value arguments are properly tagged by the caller.
+
+In order to ensure CBV functions still get passed tagged functions we have
+three options:
+a)  Special case the interpreter behaviour into the tag inference analysis.
+    If we assume the interpreter can't properly tag value references the STG passes
+    would then wrap such calls in appropriate evals which are executed at runtime.
+    This would ensure tags by doing additional evals at runtime.
+b)  When the interpreter pushes references for known constructors instead of
+    pushing the objects address add the tag to the value pushed. This is what
+    the NCG backends do.
+c)  When the interpreter pushes a reference inspect the closure of the object
+    and apply the appropriate tag at runtime.
+
+For now we use approach c). Mostly because it's easiest to implement. We also don't
+tag functions as tag inference currently doesn't rely on those being properly tagged.
+*/
 
 #include "rts/PosixSource.h"
 #include "Rts.h"
@@ -292,6 +316,18 @@ STATIC_INLINE StgClosure *tagConstr(StgClosure *con) {
     return TAG_CLOSURE(stg_min(TAG_MASK, 1 + GET_TAG(con)), con);
 }
 
+// Compute the pointer tag for the function and tag the pointer;
+STATIC_INLINE StgClosure *tagFun(StgClosure *fun) {
+    StgHalfWord tag = GET_TAG(fun);
+    if(tag > TAG_MASK) { return fun; }
+    else {
+        return TAG_CLOSURE(tag, fun);
+    }
+
+
+}
+
+
 static StgWord app_ptrs_itbl[] = {
     (W_)&stg_ap_p_info,
     (W_)&stg_ap_pp_info,
@@ -1306,7 +1342,52 @@ run_BCO:
 
         case bci_PUSH_G: {
             W_ o1 = BCO_GET_LARGE_ARG;
-            SpW(-1) = BCO_PTR(o1);
+            StgClosure *tagged_obj = (StgClosure*) BCO_PTR(o1);
+
+            tag_push_g:
+            ASSERT(LOOKS_LIKE_CLOSURE_PTR((StgClosure*) tagged_obj));
+            // Here we make sure references we push are tagged.
+            // See Note [CBV Functions and the interpreter] in Info.hs
+
+            //Safe some memory reads if we already have a tag.
+            if(GET_CLOSURE_TAG(tagged_obj) == 0) {
+                StgClosure *obj = UNTAG_CLOSURE(tagged_obj);
+                switch ( get_itbl(obj)->type ) {
+                    case IND:
+                    case IND_STATIC:
+                    {
+                        tagged_obj = ACQUIRE_LOAD(&((StgInd*)obj)->indirectee);
+                        goto tag_push_g;
+                    }
+                    case CONSTR:
+                    case CONSTR_1_0:
+                    case CONSTR_0_1:
+                    case CONSTR_2_0:
+                    case CONSTR_1_1:
+                    case CONSTR_0_2:
+                    case CONSTR_NOCAF:
+                        // The value is already evaluated, so we can just return it. However,
+                        // before we do, we MUST ensure that the pointer is tagged, because we
+                        // might return to a native `case` expression, which assumes the returned
+                        // pointer is tagged so it can use the tag to select an alternative.
+                        tagged_obj = tagConstr(obj);
+                        break;
+                    case FUN:
+                    case FUN_1_0:
+                    case FUN_0_1:
+                    case FUN_2_0:
+                    case FUN_1_1:
+                    case FUN_0_2:
+                    case FUN_STATIC:
+                        // Purely for performance since we already hit memory anyway.
+                        tagged_obj = tagFun(obj);
+                        break;
+                    default:
+                        break;
+                }
+            }
+
+            SpW(-1) = (W_) tagged_obj;
             Sp_subW(1);
             goto nextInsn;
         }


=====================================
testsuite/tests/th/should_compile/T24870/Def.hs
=====================================
@@ -0,0 +1,9 @@
+{-# LANGUAGE TemplateHaskell #-}
+
+module SDef where
+
+{-# NOINLINE aValue #-}
+aValue = True
+
+{-# NOINLINE aStrictFunction #-}
+aStrictFunction !x = [| x |]


=====================================
testsuite/tests/th/should_compile/T24870/T24870.stderr
=====================================
@@ -0,0 +1,2 @@
+[1 of 2] Compiling SDef             ( Def.hs, Def.o, Def.dyn_o )
+[2 of 2] Compiling SUse             ( Use.hs, Use.o )


=====================================
testsuite/tests/th/should_compile/T24870/T24870.stderr-mingw32
=====================================
@@ -0,0 +1,2 @@
+[1 of 2] Compiling SDef             ( Def.hs, Def.o )
+[2 of 2] Compiling SUse             ( Use.hs, Use.o )


=====================================
testsuite/tests/th/should_compile/T24870/Use.hs
=====================================
@@ -0,0 +1,9 @@
+{-# LANGUAGE TemplateHaskell #-}
+
+module SUse where
+
+import qualified Language.Haskell.TH.Syntax as TH
+import SDef
+import GHC.Exts
+
+bar = $( inline aStrictFunction aValue )


=====================================
testsuite/tests/th/should_compile/T24870/all.T
=====================================
@@ -0,0 +1,6 @@
+# The interpreter must uphold tagging invariants, and failed to do so in #24870
+# We test this here by having the interpreter calls a strict worker function
+# with a reference to a value it constructed.
+# See also Note [CBV Functions and the interpreter]
+test('T24870', [extra_files(['Def.hs', 'Use.hs']), req_th],
+	      multimod_compile, ['Def Use', '-dtag-inference-checks'])



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/3c48795654f7ea643b5a90b7de9db87b8d044f51

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/3c48795654f7ea643b5a90b7de9db87b8d044f51
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240605/7e8e0bd2/attachment-0001.html>


More information about the ghc-commits mailing list