[Git][ghc/ghc][wip/general-catgeory] unicode: Don't inline bitmap in generalCategory

Matthew Pickering (@mpickering) gitlab at gitlab.haskell.org
Mon Feb 13 15:05:37 UTC 2023



Matthew Pickering pushed to branch wip/general-catgeory at Glasgow Haskell Compiler / GHC


Commits:
c4f246af by Matthew Pickering at 2023-02-13T15:05:22+00:00
unicode: Don't inline bitmap in generalCategory

generalCategory contains a huge literal string but is marked INLINE, so
the string is duplicated into every use site of generalCategory. In
particular, generalCategory is used in functions like isSpace, and the
literal gets inlined into those functions, which makes them massive.

https://github.com/haskell/core-libraries-committee/issues/130

Fixes #22949

-------------------------
Metric Decrease:
    T4029
    T18304
-------------------------

- - - - -


3 changed files:

- libraries/base/GHC/Unicode/Internal/Char/UnicodeData/GeneralCategory.hs
- libraries/base/changelog.md
- libraries/base/tools/ucd2haskell/exe/Parser/Text.hs


Changes:

=====================================
libraries/base/GHC/Unicode/Internal/Char/UnicodeData/GeneralCategory.hs
=====================================
The diff for this file was not included because it is too large.

=====================================
libraries/base/changelog.md
=====================================
@@ -4,6 +4,8 @@
   * Add `Data.List.!?` ([CLC proposal #110](https://github.com/haskell/core-libraries-committee/issues/110))
   * `maximumBy`/`minimumBy` are now marked as `INLINE` improving performance for unpackable
     types significantly.
+  * Refactor `generalCategory` to stop very large literal string being inlined to call-sites.
+      ([CLC proposal #130](https://github.com/haskell/core-libraries-committee/issues/130))
 
 ## 4.18.0.0 *TBA*
   * `Foreign.C.ConstPtr.ConstrPtr` was added to encode `const`-qualified


=====================================
libraries/base/tools/ucd2haskell/exe/Parser/Text.hs
=====================================
@@ -205,7 +205,11 @@ genEnumBitmap funcName def as = unlines
                <> show (length as)
                <> " then "
                <> show (fromEnum def)
-               <> " else lookupIntN bitmap# n"
+               <> " else lookup_bitmap n"
+
+    , "{-# NOINLINE lookup_bitmap #-}"
+    , "lookup_bitmap :: Int -> Int"
+    , "lookup_bitmap n = lookupIntN bitmap# n"
     , "  where"
     , "    bitmap# = \"" <> enumMapToAddrLiteral as "\"#"
     ]
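
To illustrate the shape of code this generator change aims for, here is a minimal sketch, not the actual generated module. The name `toyCategory`, the 8-entry table, the default value 7, and this `lookupIntN` definition are hypothetical stand-ins for illustration. Only the pattern is taken from the diff above: a `NOINLINE` helper owns the literal, so an inlinable wrapper no longer carries it to call sites.

```haskell
{-# LANGUAGE MagicHash #-}

-- Hypothetical miniature of the generated module. The table contents and
-- the name toyCategory are illustrative, not the real Unicode data.
import GHC.Exts (Addr#, Int (I#), indexCharOffAddr#, ord#)

-- Read the byte at offset n of a primitive string literal.
-- Assumed stand-in for the lookupIntN helper named in the diff.
lookupIntN :: Addr# -> Int -> Int
lookupIntN addr# (I# n#) = I# (ord# (indexCharOffAddr# addr# n#))

-- Small wrapper that is cheap to inline: it contains no literal,
-- only a bounds check and a call to the helper below.
{-# INLINE toyCategory #-}
toyCategory :: Int -> Int
toyCategory n
  | n < 0 || n >= 8 = 7              -- out-of-range default (hypothetical)
  | otherwise       = lookupBitmap n

-- The literal lives only here. NOINLINE keeps it from being copied
-- into every caller of toyCategory.
{-# NOINLINE lookupBitmap #-}
lookupBitmap :: Int -> Int
lookupBitmap n = lookupIntN bitmap# n
  where
    bitmap# = "\0\1\2\3\4\5\6\7"#
```

Loaded in GHCi, `toyCategory 3` reads entry 3 of the table, while an out-of-range argument such as `toyCategory 42` falls back to the default. Whether the real wrapper keeps its `INLINE` pragma is not visible in this hunk, so the pragma on `toyCategory` is an assumption of the sketch.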



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/commit/c4f246afbe71d4e7a29963d0830455076cc9c353


