[Git][ghc/ghc][wip/marge_bot_batch_merge_job] 5 commits: compiler: Make ghc-experimental not wired in

Marge Bot (@marge-bot) gitlab at gitlab.haskell.org
Thu Jun 13 15:15:18 UTC 2024



Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC


Commits:
b3cc5366 by Matthew Pickering at 2024-06-12T23:06:57-04:00
compiler: Make ghc-experimental not wired in

If you need to wire in definitions, then place them in ghc-internal and
reexport them from ghc-experimental.

Ticket #24903

- - - - -
700eeab9 by Hécate Kleidukos at 2024-06-12T23:07:37-04:00
base: Use a more appropriate unicode arrow for the ByteArray diagram

This commit rectifies the usage of a unicode arrow in favour of one that
doesn't provoke mis-alignment.

- - - - -
cca7de25 by Matthew Pickering at 2024-06-12T23:08:14-04:00
ghcup-metadata: Fix debian version ranges

This was caught by `ghcup-ci` failing and attempting to install a deb12
bindist on deb11.

```
configure: WARNING: m4/prep_target_file.m4: Expecting YES/NO but got  in ArSupportsDashL_STAGE0. Defaulting to False.
bin/ghc-toolchain-bin: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by bin/ghc-toolchain-bin)
bin/ghc-toolchain-bin: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /tmp/tmp.LBhwvFbVoy/foobarbaz/.ghcup/tmp/ghcup-708d9668d5d82287/ghc-9.11.20240609-x86_64-unknown-linux/bin/../lib/x86_64-linux-ghc-9.11.20240609/libHSunix-2.8.5.1-inplace-ghc9.11.20240609.so)
bin/ghc-toolchain-bin: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /tmp/tmp.LBhwvFbVoy/foobarbaz/.ghcup/tmp/ghcup-708d9668d5d82287/ghc-9.11.20240609-x86_64-unknown-linux/bin/../lib/x86_64-linux-ghc-9.11.20240609/libHSunix-2.8.5.1-inplace-ghc9.11.20240609.so)
```

Fixes #24974

- - - - -
c29a3ad2 by Pierre Le Marre at 2024-06-13T11:14:51-04:00
ucd2haskell: remove Streamly dependency + misc

- Remove dead code.
- Remove `streamly` dependency.
- Process files with `bytestring`.
- Replace Unicode files parsers with the corresponding ones from the
  package `unicode-data-parser`.
- Simplify cabal file and rename module
- Regenerate `ghc-internal` Unicode files with new header

- - - - -
565f6192 by Jacco Krijnen at 2024-06-13T11:14:52-04:00
Document how to run haddocks tests (#24976)

Also remove ghc 9.7 requirement

- - - - -


17 changed files:

- .gitlab/rel_eng/mk-ghcup-metadata/mk_ghcup_metadata.py
- compiler/GHC/Builtin/Names.hs
- compiler/GHC/Unit/Types.hs
- libraries/base/src/Data/Array/Byte.hs
- libraries/ghc-experimental/ghc-experimental.cabal
- libraries/ghc-internal/src/GHC/Internal/Unicode/Char/DerivedCoreProperties.hs
- libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/GeneralCategory.hs
- libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/SimpleLowerCaseMapping.hs
- libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/SimpleTitleCaseMapping.hs
- libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/SimpleUpperCaseMapping.hs
- libraries/ghc-internal/src/GHC/Internal/Unicode/Version.hs
- − libraries/ghc-internal/tools/ucd2haskell/exe/Parser/Text.hs
- libraries/ghc-internal/tools/ucd2haskell/exe/UCD2Haskell.hs
- + libraries/ghc-internal/tools/ucd2haskell/exe/UCD2Haskell/ModuleGenerators.hs
- libraries/ghc-internal/tools/ucd2haskell/ucd2haskell.cabal
- utils/haddock/CONTRIBUTING.md
- utils/haddock/cabal.project


Changes:

=====================================
.gitlab/rel_eng/mk-ghcup-metadata/mk_ghcup_metadata.py
=====================================
@@ -217,7 +217,7 @@ def mk_new_yaml(release_mode, version, date, pipeline_type, job_map):
     a64 = { "Linux_Debian": { "< 10": deb9
                            , "( >= 10 && < 11 )": deb10
                            , "( >= 11 && < 12 )": deb11
-                           , ">= 11": deb12
+                           , ">= 12": deb12
                            , "unknown_versioning": deb11 }
           , "Linux_Ubuntu" : { "unknown_versioning": ubuntu2004
                              , "( >= 16 && < 18 )": deb9


=====================================
compiler/GHC/Builtin/Names.hs
=====================================
@@ -669,10 +669,6 @@ gHC_INTERNAL_OVER_LABELS = mkGhcInternalModule (fsLit "GHC.Internal.OverloadedLa
 gHC_INTERNAL_RECORDS :: Module
 gHC_INTERNAL_RECORDS = mkGhcInternalModule (fsLit "GHC.Internal.Records")
 
-dATA_TUPLE_EXPERIMENTAL, dATA_SUM_EXPERIMENTAL :: Module
-dATA_TUPLE_EXPERIMENTAL = mkExperimentalModule (fsLit "Data.Tuple.Experimental")
-dATA_SUM_EXPERIMENTAL = mkExperimentalModule (fsLit "Data.Sum.Experimental")
-
 rOOT_MAIN :: Module
 rOOT_MAIN       = mkMainModule (fsLit ":Main") -- Root module for initialisation
 
@@ -714,9 +710,6 @@ mkMainModule m = mkModule mainUnit (mkModuleNameFS m)
 mkMainModule_ :: ModuleName -> Module
 mkMainModule_ m = mkModule mainUnit m
 
-mkExperimentalModule :: FastString -> Module
-mkExperimentalModule m = mkModule experimentalUnit (mkModuleNameFS m)
-
 {-
 ************************************************************************
 *                                                                      *


=====================================
compiler/GHC/Unit/Types.hs
=====================================
@@ -77,7 +77,6 @@ module GHC.Unit.Types
    , mainUnit
    , thisGhcUnit
    , interactiveUnit
-   , experimentalUnit
 
    , isInteractiveModule
    , wiredInUnitIds
@@ -594,11 +593,10 @@ Make sure you change 'GHC.Unit.State.findWiredInUnits' if you add an entry here.
 -}
 
 bignumUnitId, primUnitId, ghcInternalUnitId, baseUnitId, rtsUnitId,
-  mainUnitId, thisGhcUnitId, interactiveUnitId,
-  experimentalUnitId :: UnitId
+  mainUnitId, thisGhcUnitId, interactiveUnitId :: UnitId
 
 bignumUnit, primUnit, ghcInternalUnit, baseUnit, rtsUnit,
-  mainUnit, thisGhcUnit, interactiveUnit, experimentalUnit  :: Unit
+  mainUnit, thisGhcUnit, interactiveUnit :: Unit
 
 primUnitId        = UnitId (fsLit "ghc-prim")
 bignumUnitId      = UnitId (fsLit "ghc-bignum")
@@ -607,7 +605,6 @@ baseUnitId        = UnitId (fsLit "base")
 rtsUnitId         = UnitId (fsLit "rts")
 thisGhcUnitId     = UnitId (fsLit cProjectUnitId) -- See Note [GHC's Unit Id]
 interactiveUnitId = UnitId (fsLit "interactive")
-experimentalUnitId = UnitId (fsLit "ghc-experimental")
 
 primUnit          = RealUnit (Definite primUnitId)
 bignumUnit        = RealUnit (Definite bignumUnitId)
@@ -616,7 +613,6 @@ baseUnit          = RealUnit (Definite baseUnitId)
 rtsUnit           = RealUnit (Definite rtsUnitId)
 thisGhcUnit       = RealUnit (Definite thisGhcUnitId)
 interactiveUnit   = RealUnit (Definite interactiveUnitId)
-experimentalUnit  = RealUnit (Definite experimentalUnitId)
 
 -- | This is the package Id for the current program.  It is the default
 -- package Id if you don't specify a package name.  We don't add this prefix
@@ -634,7 +630,6 @@ wiredInUnitIds =
    , ghcInternalUnitId
    , baseUnitId
    , rtsUnitId
-   , experimentalUnitId
    ]
    -- NB: ghc is no longer part of the wired-in units since its unit-id, given
    -- by hadrian or cabal, is no longer overwritten and now matches both the


=====================================
libraries/base/src/Data/Array/Byte.hs
=====================================
@@ -48,7 +48,7 @@ import Prelude
 -- The memory representation of a 'ByteArray' is:
 --
 -- > ╭─────────────┬───╮  ╭────────┬──────┬─────────╮
--- > │ Constructor │ * ┼─➤│ Header │ Size │ Payload │
+-- > │ Constructor │ * ┼─►│ Header │ Size │ Payload │
 -- > ╰─────────────┴───╯  ╰────────┴──────┴─────────╯
 --
 -- And its overhead is the following:


=====================================
libraries/ghc-experimental/ghc-experimental.cabal
=====================================
@@ -35,4 +35,3 @@ library
                       ghc-prim >= 0.11 && < 0.12
     hs-source-dirs:   src
     default-language: Haskell2010
-    ghc-options: -this-unit-id ghc-experimental


=====================================
libraries/ghc-internal/src/GHC/Internal/Unicode/Char/DerivedCoreProperties.hs
=====================================
@@ -8,9 +8,8 @@
 -----------------------------------------------------------------------------
 -- |
 -- Module      : GHC.Internal.Unicode.Char.DerivedCoreProperties
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>
 -- Stability   : internal
 -----------------------------------------------------------------------------
 


=====================================
libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/GeneralCategory.hs
=====================================
@@ -8,9 +8,8 @@
 -----------------------------------------------------------------------------
 -- |
 -- Module      : GHC.Internal.Unicode.Char.UnicodeData.GeneralCategory
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>
 -- Stability   : internal
 -----------------------------------------------------------------------------
 


=====================================
libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/SimpleLowerCaseMapping.hs
=====================================
@@ -7,9 +7,8 @@
 -----------------------------------------------------------------------------
 -- |
 -- Module      : GHC.Internal.Unicode.Char.UnicodeData.SimpleLowerCaseMapping
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>
 -- Stability   : internal
 -----------------------------------------------------------------------------
 


=====================================
libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/SimpleTitleCaseMapping.hs
=====================================
@@ -7,9 +7,8 @@
 -----------------------------------------------------------------------------
 -- |
 -- Module      : GHC.Internal.Unicode.Char.UnicodeData.SimpleTitleCaseMapping
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>
 -- Stability   : internal
 -----------------------------------------------------------------------------
 


=====================================
libraries/ghc-internal/src/GHC/Internal/Unicode/Char/UnicodeData/SimpleUpperCaseMapping.hs
=====================================
@@ -7,9 +7,8 @@
 -----------------------------------------------------------------------------
 -- |
 -- Module      : GHC.Internal.Unicode.Char.UnicodeData.SimpleUpperCaseMapping
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>
 -- Stability   : internal
 -----------------------------------------------------------------------------
 


=====================================
libraries/ghc-internal/src/GHC/Internal/Unicode/Version.hs
=====================================
@@ -6,9 +6,8 @@
 -----------------------------------------------------------------------------
 -- |
 -- Module      : GHC.Internal.Unicode.Version
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>
 -- Stability   : internal
 -----------------------------------------------------------------------------
 


=====================================
libraries/ghc-internal/tools/ucd2haskell/exe/Parser/Text.hs deleted
=====================================
@@ -1,1127 +0,0 @@
-{-# LANGUAGE LambdaCase #-}
-
--- |
--- Module      : Parser.Text
--- Copyright   : (c) 2020 Composewell Technologies and Contributors
---               (c) 2016-2017 Harendra Kumar
---               (c) 2014-2015 Antonio Nikishaev
--- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
--- Stability   : internal
-
--- This code was taken from https://github.com/composewell/unicode-data.
--- The original Unicode database parser was taken from
--- https://github.com/composewell/unicode-transforms but was completely
--- rewritten from scratch to parse from UCD text files instead of XML, only
--- some types remain the same. That code in turn was originally taken from
--- https://github.com/llelf/prose (Antonio Nikishaev) and heavily modified by
--- Harendra Kumar.
---
-module Parser.Text (genModules) where
-
-import Control.Exception (catch, IOException)
-import Control.Monad (void)
-import Data.Bits (Bits(..))
-import Data.Word (Word8)
-import Data.Char (chr, ord, isSpace)
-import Data.Functor ((<&>))
-import Data.Function ((&))
-import Data.List (intersperse, unfoldr)
-import Data.List.Split (splitWhen)
-import Numeric (showHex)
-import Streamly.Data.Fold (Fold)
-import System.Directory (createDirectoryIfMissing)
-import System.Environment (getEnv)
-import System.FilePath ((</>), (<.>))
-
--- import qualified Data.Set as Set
-import Streamly.Data.Stream (Stream)
-import qualified Streamly.Data.Stream.Prelude as Stream
-import qualified Streamly.Data.Fold as Fold
-import qualified Streamly.Internal.Data.Fold as Fold
-import qualified Streamly.Data.Unfold as Unfold
-import qualified Streamly.FileSystem.Handle as Handle
-import qualified Streamly.Unicode.Stream as Unicode
-import qualified Streamly.Internal.Unicode.Stream as Unicode
-import qualified System.IO as Sys
-
-import Prelude hiding (pred)
-
--------------------------------------------------------------------------------
--- Types
--------------------------------------------------------------------------------
-
-data GeneralCategory =
-    Lu|Ll|Lt|             --LC
-    Lm|Lo|                --L
-    Mn|Mc|Me|             --M
-    Nd|Nl|No|             --N
-    Pc|Pd|Ps|Pe|Pi|Pf|Po| --P
-    Sm|Sc|Sk|So|          --S
-    Zs|Zl|Zp|             --Z
-    Cc|Cf|Cs|Co|Cn        --C
-    deriving (Show, Bounded, Enum, Read)
-
-data DecompType =
-       DTCanonical | DTCompat  | DTFont
-     | DTNoBreak   | DTInitial | DTMedial   | DTFinal
-     | DTIsolated  | DTCircle  | DTSuper    | DTSub
-     | DTVertical  | DTWide    | DTNarrow
-     | DTSmall     | DTSquare  | DTFraction
-    deriving (Show, Eq)
-
-data Decomp = DCSelf | DC [Char] deriving (Show, Eq)
-
--- data DType = Canonical | Kompat
-
-data DetailedChar =
-    DetailedChar
-        { _char :: Char
-        , _name :: String
-        , _generalCategory :: GeneralCategory
-        , _combiningClass :: Int
-        , _decompositionType :: Maybe DecompType
-        , _decomposition :: Decomp
-        , _simpleUppercaseMapping :: Maybe Char
-        , _simpleLowercaseMapping :: Maybe Char
-        , _simpleTitlecaseMapping :: Maybe Char
-        }
-    deriving (Show)
-
-{- [NOTE] Used by disabled generator
-
--- See: https://www.unicode.org/reports/tr44/#Default_Values_Table
-mkDefaultDetailedChar :: Char -> DetailedChar
-mkDefaultDetailedChar c = DetailedChar
-  { _char = c
-  , _name = mempty
-  , _generalCategory = Cn
-  , _combiningClass = 0
-  , _decompositionType = Nothing
-  , _decomposition = DCSelf
-  , _simpleUppercaseMapping = Nothing
-  , _simpleLowercaseMapping = Nothing
-  , _simpleTitlecaseMapping = Nothing }
--}
-
--------------------------------------------------------------------------------
--- Helpers
--------------------------------------------------------------------------------
-
-headerRule :: String
-headerRule = "-----------------------------------------------------------------------------"
-
-mkModuleHeader :: String -> String
-mkModuleHeader modName =
-    unlines
-        [ headerRule
-        , "-- |"
-        , "-- Module      : " <> modName
-        , "-- Copyright   : (c) 2020 Composewell Technologies and Contributors"
-        , "-- License     : BSD-3-Clause"
-        -- [FIXME] Update maintainer
-        , "-- Maintainer  : streamly at composewell.com"
-        , "-- Stability   : internal"
-        , headerRule
-        ]
-
-readCodePoint :: String -> Char
-readCodePoint = chr . read . ("0x"<>)
-
-readCodePointM :: String -> Maybe Char
-readCodePointM "" = Nothing
-readCodePointM u  = Just (readCodePoint u)
-
-genSignature :: String -> String
-genSignature = (<> " :: Char -> Bool")
-
--- | Check that var is between minimum and maximum of orderList
-genRangeCheck :: String -> [Int] -> String
-genRangeCheck var ordList =
-    var
-        <> " >= "
-        <> show (minimum ordList)
-        <> " && " <> var <> " <= " <> show (maximum ordList)
-
-genBitmap :: String -> [Int] -> String
-genBitmap funcName ordList =
-    unlines
-        [ "{-# INLINE " <> funcName <> " #-}"
-        , genSignature funcName
-        , funcName <> " = \\c -> let n = ord c in "
-              <> genRangeCheck "n" ordList <> " && lookupBit64 bitmap# n"
-        , "  where"
-        , "    bitmap# = \"" <> bitMapToAddrLiteral (positionsToBitMap ordList) "\"#"
-        ]
-
-positionsToBitMap :: [Int] -> [Bool]
-positionsToBitMap = go 0
-
-    where
-
-    go _ [] = []
-    go i xxs@(x:xs)
-        | i < x = False : go (i + 1) xxs
-        | otherwise = True : go (i + 1) xs
-
-bitMapToAddrLiteral ::
-  -- | Values to encode
-  [Bool] ->
-  -- | String to append
-  String ->
-  String
-bitMapToAddrLiteral bs cs = foldr encode cs (unfoldr mkChunks bs)
-
-    where
-
-    mkChunks :: [a] -> Maybe ([a], [a])
-    mkChunks [] = Nothing
-    mkChunks xs = Just $ splitAt 8 xs
-
-    encode :: [Bool] -> String -> String
-    encode chunk acc = '\\' : shows (toByte (padTo8 chunk)) acc
-
-    padTo8 :: [Bool] -> [Bool]
-    padTo8 xs
-        | length xs >= 8 = xs
-        | otherwise = xs <> replicate (8 - length xs) False
-
-    toByte :: [Bool] -> Int
-    toByte xs = sum $ map (\i -> if xs !! i then 1 `shiftL` i else 0) [0..7]
-
-genEnumBitmap ::
-  forall a. (Bounded a, Enum a, Show a) =>
-  -- | Function name
-  String ->
-  -- | Default value
-  a ->
-  -- | List of values to encode
-  [a] ->
-  String
-genEnumBitmap funcName def as = unlines
-    [ "{-# INLINE " <> funcName <> " #-}"
-    , funcName <> " :: Char -> Int"
-    , funcName <> " c = let n = ord c in if n >= "
-               <> show (length as)
-               <> " then "
-               <> show (fromEnum def)
-               <> " else lookup_bitmap n"
-
-    , "{-# NOINLINE lookup_bitmap #-}"
-    , "lookup_bitmap :: Int -> Int"
-    , "lookup_bitmap n = lookupIntN bitmap# n"
-    , "  where"
-    , "    bitmap# = \"" <> enumMapToAddrLiteral as "\"#"
-    ]
-
-{-| Encode a list of values as a byte map, using their 'Enum' instance.
-
-__Note:__ 'Enum' instance must respect the following:
-
-* @fromEnum minBound >= 0x00@
-* @fromEnum maxBound <= 0xff@
--}
-enumMapToAddrLiteral ::
-  forall a. (Bounded a, Enum a, Show a) =>
-  -- | Values to encode
-  [a] ->
-  -- | String to append
-  String ->
-  String
-enumMapToAddrLiteral xs cs = foldr go cs xs
-
-    where
-
-    go :: a -> String -> String
-    go x acc = '\\' : shows (toWord8 x) acc
-
-    toWord8 :: a -> Word8
-    toWord8 a = let w = fromEnum a in if 0 <= w && w <= 0xff
-        then fromIntegral w
-        else error $ "Cannot convert to Word8: " <> show a
-
-{- [NOTE] Disabled generator (normalization)
--- This bit of code is duplicated but this duplication allows us to reduce 2
--- dependencies on the executable.
-
-jamoLCount :: Int
-jamoLCount = 19
-
-jamoVCount :: Int
-jamoVCount = 21
-
-jamoTCount :: Int
-jamoTCount = 28
-
-hangulFirst :: Int
-hangulFirst = 0xac00
-
-hangulLast :: Int
-hangulLast = hangulFirst + jamoLCount * jamoVCount * jamoTCount - 1
-
-isHangul :: Char -> Bool
-isHangul c = n >= hangulFirst && n <= hangulLast
-    where n = ord c
--}
-
-genUnicodeVersion :: FilePath -> IO ()
-genUnicodeVersion outdir = do
-  version <- catch
-              (getEnv "UNICODE_VERSION")
-              (\(_ :: IOException) -> return "<unknown>")
-  Stream.fold f (Stream.fromList (body version))
-  where
-    moduleName = "GHC.Internal.Unicode.Version"
-    f = moduleFileEmitter Nothing outdir
-          (moduleName, \_ -> Fold.foldMap (<> "\n"))
-    body :: String -> [String]
-    body version =
-      [ "{-# LANGUAGE NoImplicitPrelude #-}"
-      , "{-# OPTIONS_HADDOCK hide #-}"
-      , ""
-      , mkModuleHeader moduleName
-      , "module " <> moduleName
-      , "(unicodeVersion)"
-      , "where"
-      , ""
-      , "import {-# SOURCE #-} GHC.Internal.Data.Version"
-      , ""
-      , "-- | Version of Unicode standard used by @base@:"
-      , "-- [" <> version <> "](https://www.unicode.org/versions/Unicode" <> version <> "/)."
-      , "--"
-      , "-- @since base-4.15.0.0"
-      , "unicodeVersion :: Version"
-      , "unicodeVersion = makeVersion [" <> mkVersion version <> "]" ]
-    mkVersion = foldr (\c acc -> case c of {'.' -> ',':' ':acc; _ -> c:acc}) mempty
-
--------------------------------------------------------------------------------
--- Parsers
--------------------------------------------------------------------------------
-
--------------------------------------------------------------------------------
--- Parsing UnicodeData.txt
--------------------------------------------------------------------------------
-
-genGeneralCategoryModule
-    :: Monad m
-    => String
-    -> Fold m DetailedChar String
-genGeneralCategoryModule moduleName =
-    done <$> Fold.foldl' step initial
-
-    where
-
-    -- (categories, expected char)
-    initial = ([], '\0')
-
-    step (acc, p) a = if p < _char a
-        -- Fill missing char entry with default category Cn
-        -- See: https://www.unicode.org/reports/tr44/#Default_Values_Table
-        then step (Cn : acc, succ p) a
-        -- Regular entry
-        else (_generalCategory a : acc, succ (_char a))
-
-    done (acc, _) = unlines
-        [ "{-# LANGUAGE NoImplicitPrelude #-}"
-        , "{-# LANGUAGE MagicHash #-}"
-        , "{-# OPTIONS_HADDOCK hide #-}"
-        , ""
-        , mkModuleHeader moduleName
-        , "module " <> moduleName
-        , "(generalCategory)"
-        , "where"
-        , ""
-        , "import GHC.Internal.Base (Char, Int, Ord(..), ord)"
-        , "import GHC.Internal.Unicode.Bits (lookupIntN)"
-        , ""
-        , genEnumBitmap "generalCategory" Cn (reverse acc)
-        ]
-
-readDecomp :: String -> (Maybe DecompType, Decomp)
-readDecomp s =
-    if null wrds
-    then (Nothing, DCSelf)
-    else decmps wrds
-
-    where
-
-    decmps [] = error "Unreachable flow point"
-    decmps y@(x:xs) =
-        case dtmap x of
-            DTCanonical -> (,) (Just DTCanonical) (readCP y)
-            other -> (,) (Just other) (readCP xs)
-
-    wrds = words s
-
-    readCP ws = DC $ map readCodePoint ws
-
-    dtmap "<compat>" = DTCompat
-    dtmap "<circle>" = DTCircle
-    dtmap "<final>" = DTFinal
-    dtmap "<font>" = DTFont
-    dtmap "<fraction>" = DTFraction
-    dtmap "<initial>" = DTInitial
-    dtmap "<isolated>" = DTIsolated
-    dtmap "<medial>" = DTMedial
-    dtmap "<narrow>" = DTNarrow
-    dtmap "<noBreak>" = DTNoBreak
-    dtmap "<small>" = DTSmall
-    dtmap "<square>" = DTSquare
-    dtmap "<sub>" = DTSub
-    dtmap "<super>" = DTSuper
-    dtmap "<vertical>" = DTVertical
-    dtmap "<wide>" = DTWide
-    dtmap _ = DTCanonical
-
-{- [NOTE] Disabled generators
-
-filterNonHangul :: Monad m => Fold m DetailedChar a -> Fold m DetailedChar a
-filterNonHangul = Fold.filter (not . isHangul . _char)
-
-filterDecomposableType ::
-       Monad m => DType -> Fold m DetailedChar a -> Fold m DetailedChar a
-filterDecomposableType dtype =
-    Fold.filter ((/= DCSelf) . _decomposition)
-        . Fold.filter (predicate . _decompositionType)
-
-    where
-
-    predicate =
-        case dtype of
-            Canonical -> (== Just DTCanonical)
-            Kompat -> const True
-
-genDecomposableModule ::
-       Monad m => String -> DType -> Fold m DetailedChar String
-genDecomposableModule moduleName dtype =
-    filterNonHangul
-        $ filterDecomposableType dtype $ done <$> Fold.foldl' step initial
-
-    where
-
-    initial = []
-
-    step st a = ord (_char a) : st
-
-    done st =
-        unlines
-            [ "{-# LANGUAGE NoImplicitPrelude #-}"
-            , "{-# LANGUAGE MagicHash #-}"
-            , "{-# OPTIONS_HADDOCK hide #-}"
-            , ""
-            , mkModuleHeader moduleName
-            , "module " <> moduleName
-            , "(isDecomposable)"
-            , "where"
-            , ""
-            , "import Data.Char (ord)"
-            , "import GHC.Internal.Unicode.Bits (lookupBit64)"
-            , ""
-            , genBitmap "isDecomposable" (reverse st)
-            ]
-
-genCombiningClassModule :: Monad m => String -> Fold m DetailedChar String
-genCombiningClassModule moduleName =
-    Fold.filter (\dc -> _combiningClass dc /= 0)
-        $ done <$> Fold.foldl' step initial
-
-    where
-
-    initial = ([], [])
-
-    step (st1, st2) a = (genCombiningClassDef a : st1, ord (_char a) : st2)
-
-    done (st1, st2) =
-        unlines
-            [ "{-# LANGUAGE NoImplicitPrelude #-}"
-            , "{-# LANGUAGE MagicHash #-}"
-            , "{-# OPTIONS_HADDOCK hide #-}"
-            , ""
-            , mkModuleHeader moduleName
-            , "module " <> moduleName
-            , "(combiningClass, isCombining)"
-            , "where"
-            , ""
-            , "import Data.Char (ord)"
-            , "import GHC.Internal.Unicode.Bits (lookupBit64)"
-            , ""
-            , "combiningClass :: Char -> Int"
-            , unlines (reverse st1)
-            , "combiningClass _ = 0\n"
-            , ""
-            , genBitmap "isCombining" (reverse st2)
-            ]
-
-    genCombiningClassDef dc =
-        "combiningClass "
-            <> show (_char dc) <> " = " <> show (_combiningClass dc)
-
-genDecomposeDefModule ::
-       Monad m
-    => String
-    -> [String]
-    -> [String]
-    -> DType
-    -> (Int -> Bool)
-    -> Fold m DetailedChar String
-genDecomposeDefModule moduleName before after dtype pred =
-    Fold.filter (pred . ord . _char)
-        $ filterNonHangul
-        $ filterDecomposableType dtype $ done <$> Fold.foldl' step initial
-
-    where
-
-    decomposeChar c DCSelf = [c]
-    decomposeChar _c (DC ds) = ds
-
-    genHeader =
-        [ "{-# OPTIONS_GHC -fno-warn-incomplete-patterns #-}"
-        , ""
-        , mkModuleHeader moduleName
-        , "module " <> moduleName
-        , "(decompose)"
-        , "where"
-        , ""
-        ]
-    genSign =
-        [ "-- Note: this is a partial function we do not expect to call"
-        , "-- this if isDecomposable returns false."
-        , "{-# NOINLINE decompose #-}"
-        , "decompose :: Char -> [Char]"
-        ]
-    initial = []
-
-    step st dc = genDecomposeDef dc : st
-
-    done st =
-        let body = mconcat [genHeader, before, genSign, reverse st, after]
-        in unlines body
-
-    genDecomposeDef dc =
-        "decompose "
-            <> show (_char dc)
-            <> " = " <> show (decomposeChar (_char dc) (_decomposition dc))
-
-genCompositionsModule ::
-       Monad m
-    => String
-    -> [Int]
-    -> [Int]
-    -> Fold m DetailedChar String
-genCompositionsModule moduleName compExclu non0CC =
-    Fold.filter (not . flip elem compExclu . ord . _char)
-        $ filterNonHangul
-        $ Fold.filter (isDecompositionLen2 . _decomposition)
-        $ filterDecomposableType Canonical $ done <$> Fold.foldl' step initial
-
-    where
-
-    isDecompositionLen2 DCSelf = False
-    isDecompositionLen2 (DC ds) = length ds == 2
-
-    genComposePairDef name dc =
-        name
-            <> " "
-            <> show (head d01)
-            <> " " <> show (d01 !! 1) <> " = Just " <> show (_char dc)
-
-        where
-
-        d01 = decompPair dc
-
-    decompPair dc =
-        case _decomposition dc of
-            DCSelf -> error "toCompFormat: DCSelf"
-            (DC ds) ->
-                if length ds == 2
-                then ds
-                else error "toCompFormat: length /= 2"
-
-    initial = ([], [], [])
-
-    step (dec, sp, ss) dc = (dec1, sp1, ss1)
-
-        where
-
-        d01 = decompPair dc
-        d1Ord = ord $ d01 !! 1
-        dec1 = genComposePairDef "compose" dc : dec
-        sp1 =
-            if d1Ord `notElem` non0CC
-            then genComposePairDef "composeStarters" dc : sp
-            else sp
-        ss1 =
-            if d1Ord `notElem` non0CC
-            then d1Ord : ss
-            else ss
-
-    header =
-        [ "{-# LANGUAGE NoImplicitPrelude #-}"
-        , "{-# LANGUAGE MagicHash #-}"
-        , "{-# OPTIONS_GHC -fno-warn-incomplete-patterns #-}"
-        , "{-# OPTIONS_HADDOCK hide #-}"
-        , ""
-        , mkModuleHeader moduleName
-        , "module " <> moduleName
-        , "(compose, composeStarters, isSecondStarter)"
-        , "where"
-        , ""
-        , "import GHC.Internal.Base (Char, ord)"
-        , "import GHC.Internal.Unicode.Bits (lookupBit64)"
-        , ""
-        ]
-
-    composePair decomps =
-        [ "{-# NOINLINE compose #-}"
-        , "compose :: Char -> Char -> Maybe Char"
-        , unlines decomps
-        , "compose _ _ = " <> "Nothing" <> "\n"
-        , ""
-        ]
-
-    composeStarterPair starterPairs =
-        [ "composeStarters :: Char -> Char -> Maybe Char"
-        , unlines starterPairs
-        , "composeStarters _ _ = " <> "Nothing" <> "\n"
-        , ""
-        ]
-
-    isSecondStarter secondStarters =
-        [genBitmap "isSecondStarter" secondStarters]
-
-    done (dec, sp, ss) =
-        unlines
-            $ header
-            <> composePair (reverse dec)
-            <> composeStarterPair (reverse sp)
-            <> isSecondStarter (Set.toList (Set.fromList ss))
--}
-genSimpleCaseMappingModule
-    :: Monad m
-    => String
-    -> String
-    -> (DetailedChar -> Maybe Char)
-    -> Fold m DetailedChar String
-genSimpleCaseMappingModule moduleName funcName field =
-    done <$> Fold.foldl' step initial
-
-    where
-
-    genHeader =
-        [ "{-# LANGUAGE NoImplicitPrelude, LambdaCase #-}"
-        , "{-# OPTIONS_HADDOCK hide #-}"
-        , ""
-        , mkModuleHeader moduleName
-        , "module " <> moduleName
-        , "(" <> funcName <> ")"
-        , "where"
-        , ""
-        , "import GHC.Internal.Base (Char)"
-        , ""
-        ]
-    genSign =
-        [ "{-# NOINLINE " <> funcName <> " #-}"
-        , funcName <> " :: Char -> Char"
-        , funcName <> " = \\case"
-        ]
-    initial = []
-
-    step ds dc = case mkEntry dc of
-        Nothing -> ds
-        Just d  -> d : ds
-
-    after = ["  c -> c"]
-
-    done st =
-        let body = mconcat [genHeader, genSign, reverse st, after]
-        in unlines body
-
-    mkEntry dc = field dc <&> \c -> mconcat
-        [ "  '\\x"
-        , showHexChar (_char dc) "' -> '\\x"
-        , showHexChar c "'"
-        ]
-
-    showHexChar c = showHex (ord c)
-
-genCorePropertiesModule ::
-       Monad m => String -> (String -> Bool) -> Fold m (String, [Int]) String
-genCorePropertiesModule moduleName isProp =
-    Fold.filter (\(name, _) -> isProp name) $ done <$> Fold.foldl' step initial
-
-    where
-
-    prop2FuncName x = "is" <> x
-
-    initial = ([], [])
-
-    step (props, bitmaps) (name, bits) =
-        (name : props, genBitmap (prop2FuncName name) bits : bitmaps)
-
-    done (props, bitmaps) = unlines $ header props <> bitmaps
-
-    header exports =
-        [ "{-# LANGUAGE NoImplicitPrelude #-}"
-        , "{-# LANGUAGE MagicHash #-}"
-        , "{-# OPTIONS_HADDOCK hide #-}"
-        , ""
-        , mkModuleHeader moduleName
-        , "module " <> moduleName
-        , "(" <> unwords (intersperse "," (map prop2FuncName exports)) <> ")"
-        , "where"
-        , ""
-        , "import GHC.Internal.Base (Bool, Char, Ord(..), (&&), ord)"
-        , "import GHC.Internal.Unicode.Bits (lookupBit64)"
-        , ""
-        ]
-
-{- [NOTE] Disabled generator
-genUnicode002TestResults :: Monad m => Fold m DetailedChar String
-genUnicode002TestResults = done <$> Fold.foldl' step initial
-
-    where
-
-    header = "Code C P S U L A D"
-    -- (output, expected char)
-    initial = ([], '\0')
-    -- [TODO] Increase the number of tested char?
-    -- maxChar = '\xF0000' -- First codepoint of the last private use areas.
-    -- maxChar = '\xFFFF' -- Last codepoint of BMP.
-    maxChar = chr 6553 -- Value in GHC 9.2.2 base test
-
-    step (acc, c) dc = if c > maxChar
-      then (acc, c)
-      else if c < _char dc
-        -- Fill missing char entry with default values
-        -- See: https://www.unicode.org/reports/tr44/#Default_Values_Table
-        then step (mkEntry (mkDefaultDetailedChar c) : acc, succ c) dc
-        -- Regular entry
-        else (mkEntry dc : acc, succ (_char dc))
-
-    done (acc, _) = unlines (header : reverse acc)
-
-    mkEntry dc = mconcat
-        [ showCode (_char dc)
-        -- [TODO] General category
-        , showBool (isControl' dc)
-        , showBool (isPrint'   dc)
-        , showBool (isSpace'   dc)
-        -- [TODO] isSeparator
-        , showBool (isUpper'   dc)
-        , showBool (isLower'   dc)
-        , showBool (isAlpha'   dc)
-        -- [TODO] isAlphaNum
-        , showBool (isDigit'   dc)
-        -- [TODO] isNumber
-        -- [TODO] isMark
-        -- [TODO] isPunctuation
-        -- [TODO] isSymbol
-        ]
-
-    padding = length (show (ord maxChar))
-    showCode c = take padding (shows (ord c) (repeat ' '))
-    -- [TODO] use showHex
-    -- showCode c =
-    --   let code = showHex (ord c) mempty
-    --   in replicate (padding - length code) '0' <> code
-    showBool b = if b then " T" else " F"
-
-    -- [NOTE] The following functions replicates Data.Char. Keep them up to date!
-
-    isControl' dc = case _generalCategory dc of
-      Cc -> True   -- Control
-      _  -> False
-
-    isPrint' dc = case _generalCategory dc of
-      Zl -> False -- LineSeparator
-      Zp -> False -- ParagraphSeparator
-      Cc -> False -- Control
-      Cf -> False -- Format
-      Cs -> False -- Surrogate
-      Co -> False -- PrivateUse
-      Cn -> False -- NotAssigned
-      _  -> True
-
-    isSpace' dc = case _char dc of
-      '\t' -> True
-      '\n' -> True
-      '\v' -> True
-      '\f' -> True
-      '\r' -> True
-      _    -> case _generalCategory dc of
-        Zs -> True -- Space
-        _  -> False
-
-    isUpper' dc = case _generalCategory dc of
-      Lu -> True -- UppercaseLetter
-      Lt -> True -- TitlecaseLetter
-      _  -> False
-
-    isLower' dc = case _generalCategory dc of
-      Ll -> True -- LowercaseLetter
-      _  -> False
-
-    isAlpha' dc = case _generalCategory dc of
-      Lu -> True -- UppercaseLetter
-      Ll -> True -- LowercaseLetter
-      Lt -> True -- TitlecaseLetter
-      Lm -> True -- ModifierLetter
-      Lo -> True -- OtherLetter
-      _  -> False
-
-    isDigit' dc = let c = _char dc
-                  in (fromIntegral (ord c - ord '0') :: Word) <= 9
--}
-
--------------------------------------------------------------------------------
--- Parsing property files
--------------------------------------------------------------------------------
-
-type PropertyLine = (String, [Int])
-
-trim :: String -> String
-trim = takeWhile (not . isSpace) . dropWhile isSpace
-
-emptyPropertyLine :: PropertyLine
-emptyPropertyLine = ("", [])
-
-combinePropertyLines :: PropertyLine -> PropertyLine -> PropertyLine
-combinePropertyLines t1@(n1, o1) t2@(n2, o2)
-    | n1 == "" = t2
-    | n2 == "" = t1
-    | n1 == n2 = (n1, o1 <> o2)
-    | otherwise = error $ "Cannot group " <> n1 <> " with " <> n2
-
-parsePropertyLine :: String -> PropertyLine
-parsePropertyLine ln
-    | null ln = emptyPropertyLine
-    | head ln == '#' = emptyPropertyLine
-    | otherwise = parseLineJ ln
-
-    where
-
-    parseLineJ :: String -> (String, [Int])
-    parseLineJ line =
-        let (rangeLn, line1) = span (/= ';') line
-            propLn = takeWhile (/= '#') (tail line1)
-         in (trim propLn, parseRange (trim rangeLn))
-
-    parseRange :: String -> [Int]
-    parseRange rng =
-        if '.' `elem` rng
-        then let low = read $ "0x" <> takeWhile (/= '.') rng
-                 high =
-                     read $ "0x" <> reverse (takeWhile (/= '.') (reverse rng))
-              in [low .. high]
-        else [read $ "0x" <> rng]
-
-isDivider :: String -> Bool
-isDivider x = x == "# ================================================"
-
-parsePropertyLines :: (Monad m) => Stream m String -> Stream m PropertyLine
-parsePropertyLines =
-    Stream.splitOn isDivider
-        $ Fold.lmap parsePropertyLine
-        $ Fold.foldl' combinePropertyLines emptyPropertyLine
-
--- | A range entry in @UnicodeData.txt at .
-data UnicodeDataRange
-    = SingleCode    !DetailedChar
-    -- ^ Regular entry for one code point
-    | FirstCode     !String !DetailedChar
-    -- ^ A partial range for entry with a name as: @\<RANGE_IDENTIFIER, First\>@
-    | CompleteRange !String !DetailedChar !DetailedChar
-    -- ^ A complete range, requiring 2 continuous entries with respective names:
-    --
-    -- * @\<RANGE_IDENTIFIER, First\>@
-    -- * @\<RANGE_IDENTIFIER, Last\>@
-
-{-| Parse UnicodeData.txt lines
-
-Parse ranges according to https://www.unicode.org/reports/tr44/#Code_Point_Ranges.
-
-__Note:__ this does /not/ fill missing char entries,
-i.e. entries with no explicit entry nor within a range.
--}
-parseUnicodeDataLines :: forall m. (Monad m) => Stream m String -> Stream m DetailedChar
-parseUnicodeDataLines
-    = Stream.unfoldMany (Unfold.unfoldr unitToRange)
-    . Stream.foldMany ( Fold.lmap parseDetailedChar
-                      $ Fold.foldt' step initial id)
-
-    where
-
-    step :: Maybe UnicodeDataRange
-         -> DetailedChar
-         -> Fold.Step (Maybe UnicodeDataRange) (Maybe UnicodeDataRange)
-    step Nothing dc = case span (/= ',') (_name dc) of
-        (range, ", First>") -> Fold.Partial (Just (FirstCode range dc))
-        _                   -> Fold.Done (Just (SingleCode dc))
-    step (Just (FirstCode range1 dc1)) dc2 = case span (/= ',') (_name dc2) of
-        (range2, ", Last>") -> if range1 == range2 && _char dc1 < _char dc2
-            then Fold.Done (Just (CompleteRange range1 dc1 dc2))
-            else error $ "Cannot create range: incompatible ranges" <> show (dc1, dc2)
-        _ -> error $ "Cannot create range: missing <range, Last> entry correspong to: " <> show range1
-    step _ _ = error "impossible case"
-
-    initial :: Fold.Step (Maybe UnicodeDataRange) (Maybe UnicodeDataRange)
-    initial = Fold.Partial Nothing
-
-    unitToRange :: Maybe UnicodeDataRange -> Maybe (DetailedChar, Maybe UnicodeDataRange)
-    unitToRange = fmap $ \case
-        SingleCode          dc      -> (dc, Nothing)
-        FirstCode     _     dc      -> error $ "Incomplete range: " <> show dc
-        CompleteRange range dc1 dc2 -> if _char dc1 < _char dc2
-            -- [TODO] Create the proper name
-            then (dc1{_name="TODO"}, Just (CompleteRange range dc1{_char=succ (_char dc1)} dc2))
-            else (dc2{_name="TODO"}, Nothing)
-
--- | Parse a single entry of @UnicodeData.txt@
-parseDetailedChar :: String -> DetailedChar
-parseDetailedChar line = case splitWhen (== ';') line of
-  char
-    :name
-    :gc
-    :combining
-    :_bidi
-    :decomposition
-    :_decimal
-    :_digit
-    :_numeric
-    :_bidiM
-    :_uni1Name
-    :_iso
-    :sUpper
-    :sLower
-    :sTitle
-    :_ ->
-      let (dctype, dcval) = readDecomp decomposition
-      in DetailedChar
-          { _char = readCodePoint char
-          , _name = name
-          , _generalCategory = read gc
-          , _combiningClass = read combining
-          , _decompositionType = dctype
-          , _decomposition = dcval
-          , _simpleUppercaseMapping = readCodePointM sUpper
-          , _simpleLowercaseMapping = readCodePointM sLower
-          , _simpleTitlecaseMapping = readCodePointM sTitle
-          }
-  _ -> error ("Unsupported line: " <> line)
-
--------------------------------------------------------------------------------
--- Generation
--------------------------------------------------------------------------------
-
-readLinesFromFile :: String -> Stream IO String
-readLinesFromFile file =
-    withFile file Sys.ReadMode
-        $ \h -> Handle.read h & Unicode.decodeUtf8 & Unicode.lines Fold.toList
-
-    where
-    withFile file_ mode =
-        Stream.bracketIO (Sys.openFile file_ mode) (Sys.hClose)
-
-
-moduleToFileName :: String -> String
-moduleToFileName = map (\x -> if x == '.' then '/' else x)
-
-dirFromFileName :: String -> String
-dirFromFileName = reverse . dropWhile (/= '/') . reverse
-
-data FileRecipe a
-    = ModuleRecipe
-      -- ^ A recipe to create a Haskell module file.
-        String
-        -- ^ Module name
-        (String -> Fold IO a String)
-        -- ^ Function that generate the module, given the module name.
-    | TestOutputRecipe
-      -- ^ A recipe to create a test output file.
-        String
-        -- ^ Test name
-        (Fold IO a String)
-        -- ^ Test output generator
-
--- ModuleRecipe is a tuple of the module name and a function that generates the
--- module using the module name
-type ModuleRecipe a = (String, String -> Fold IO a String)
-type TestOutputRecipe a = (FilePath, Fold IO a String)
-
--- GeneratorRecipe is a list of ModuleRecipe
-type GeneratorRecipe a = [FileRecipe a]
-
-moduleFileEmitter :: Maybe FilePath -> FilePath -> ModuleRecipe a -> Fold IO a ()
-moduleFileEmitter mfile outdir (modName, fldGen) = Fold.rmapM action $ fldGen modName
-
-    where
-
-    pretext version = case mfile of
-      Just file -> mconcat
-        [ "-- DO NOT EDIT: This file is automatically generated by the internal tool ucd2haskell,\n"
-        , "-- with data from: https://www.unicode.org/Public/"
-        , version
-        , "/ucd/"
-        , file
-        ,".\n\n"
-        ]
-      Nothing -> "-- DO NOT EDIT: This file is automatically generated by the internal tool ucd2haskell.\n\n"
-    outfile = outdir </> moduleToFileName modName <.> ".hs"
-    outfiledir = dirFromFileName outfile
-    action c = do
-        version <-
-            catch
-                (getEnv "UNICODE_VERSION")
-                (\(_ :: IOException) -> return "<unknown>")
-        createDirectoryIfMissing True outfiledir
-        writeFile outfile (pretext version <> c)
-
-testOutputFileEmitter :: FilePath -> TestOutputRecipe a -> Fold IO a ()
-testOutputFileEmitter outdir (name, fldGen) = Fold.rmapM action fldGen
-
-    where
-
-    outfile = outdir </> "tests" </> name <.> ".stdout"
-    outfiledir = dirFromFileName outfile
-    action c
-        = createDirectoryIfMissing True outfiledir
-        *> writeFile outfile c
-
-runGenerator ::
-       FilePath
-    -> FilePath
-    -> (Stream IO String -> Stream IO a)
-    -> FilePath
-    -> GeneratorRecipe a
-    -> IO ()
-runGenerator indir file transformLines outdir recipes =
-    readLinesFromFile (indir <> file) & transformLines & Stream.fold combinedFld
-
-    where
-
-    generatedFolds = recipes <&> \case
-      ModuleRecipe     name f -> moduleFileEmitter (Just file) outdir (name, f)
-      TestOutputRecipe name f -> testOutputFileEmitter         outdir (name, f)
-    combinedFld = void $ Fold.distribute generatedFolds
-
-genModules :: String -> String -> [String] -> IO ()
-genModules indir outdir props = do
-    genUnicodeVersion outdir
-
-    -- [NOTE] Disabled generator
-    -- compExclu <-
-    --     readLinesFromFile (indir <> "DerivedNormalizationProps.txt")
-    --         & parsePropertyLines
-    --         & Stream.find (\(name, _) -> name == "Full_Composition_Exclusion")
-    --         & fmap (snd . fromMaybe ("", []))
-
-    -- [NOTE] Disabled generator
-    -- non0CC <-
-    --     readLinesFromFile (indir <> "extracted/DerivedCombiningClass.txt")
-    --         & parsePropertyLines
-    --         & Stream.filter (\(name, _) -> name /= "0")
-    --         & Stream.map snd
-    --         & Stream.fold (Fold.foldl' (<>) [])
-
-    runGenerator
-        indir
-        "UnicodeData.txt"
-        parseUnicodeDataLines
-        outdir
-        -- [NOTE] Disabled generators
-        -- [ uncurry ModuleRecipe compositions compExclu non0CC
-        -- , uncurry ModuleRecipe combiningClass
-        -- , uncurry ModuleRecipe decomposable
-        -- , uncurry ModuleRecipe decomposableK
-        -- , uncurry ModuleRecipe decompositions
-        -- , uncurry ModuleRecipe decompositionsK2
-        -- , uncurry ModuleRecipe decompositionsK
-        [ uncurry ModuleRecipe generalCategory
-        , uncurry ModuleRecipe simpleUpperCaseMapping
-        , uncurry ModuleRecipe simpleLowerCaseMapping
-        , uncurry ModuleRecipe simpleTitleCaseMapping
-        -- , uncurry TestOutputRecipe unicode002Test
-        ]
-
-    -- [NOTE] Disabled generator
-    -- runGenerator
-    --     indir
-    --     "PropList.txt"
-    --     parsePropertyLines
-    --     outdir
-    --     [ uncurry ModuleRecipe propList ]
-
-    runGenerator
-        indir
-        "DerivedCoreProperties.txt"
-        parsePropertyLines
-        outdir
-        [ uncurry ModuleRecipe derivedCoreProperties ]
-
-    where
-
-    -- [NOTE] Disabled generator
-    -- propList =
-    --     ("GHC.Internal.Unicode.Char.PropList"
-    --     , (`genCorePropertiesModule` (`elem` props)))
-
-    derivedCoreProperties =
-        ("GHC.Internal.Unicode.Char.DerivedCoreProperties"
-        , (`genCorePropertiesModule` (`elem` props)))
-
-    -- [NOTE] Disabled generator
-    -- compositions exc non0 =
-    --     ( "GHC.Internal.Unicode.Char.UnicodeData.Compositions"
-    --     , \m -> genCompositionsModule m exc non0)
-
-    -- [NOTE] Disabled generator
-    -- combiningClass =
-    --     ( "GHC.Internal.Unicode.Char.UnicodeData.CombiningClass"
-    --     , genCombiningClassModule)
-
-    -- [NOTE] Disabled generator
-    -- decomposable =
-    --     ( "GHC.Internal.Unicode.Char.UnicodeData.Decomposable"
-    --     , (`genDecomposableModule` Canonical))
-
-    -- [NOTE] Disabled generator
-    -- decomposableK =
-    --     ( "GHC.Internal.Unicode.Char.UnicodeData.DecomposableK"
-    --     , (`genDecomposableModule` Kompat))
-
-    -- [NOTE] Disabled generator
-    -- decompositions =
-    --     ( "GHC.Internal.Unicode.Char.UnicodeData.Decompositions"
-    --     , \m -> genDecomposeDefModule m [] [] Canonical (const True))
-
-    -- [NOTE] Disabled generator
-    -- decompositionsK2 =
-    --     ( "GHC.Internal.Unicode.Char.UnicodeData.DecompositionsK2"
-    --     , \m -> genDecomposeDefModule m [] [] Kompat (>= 60000))
-
-    -- [NOTE] Disabled generator
-    -- decompositionsK =
-    --     let pre = ["import qualified " <> fst decompositionsK2 <> " as DK2", ""]
-    --         post = ["decompose c = DK2.decompose c"]
-    --      in ( "GHC.Internal.Unicode.Char.UnicodeData.DecompositionsK"
-    --         , \m -> genDecomposeDefModule m pre post Kompat (< 60000))
-
-    generalCategory =
-         ( "GHC.Internal.Unicode.Char.UnicodeData.GeneralCategory"
-         , genGeneralCategoryModule)
-
-    simpleUpperCaseMapping =
-         ( "GHC.Internal.Unicode.Char.UnicodeData.SimpleUpperCaseMapping"
-         , \m -> genSimpleCaseMappingModule m "toSimpleUpperCase" _simpleUppercaseMapping)
-
-    simpleLowerCaseMapping =
-         ( "GHC.Internal.Unicode.Char.UnicodeData.SimpleLowerCaseMapping"
-         , \m -> genSimpleCaseMappingModule m "toSimpleLowerCase" _simpleLowercaseMapping)
-
-    simpleTitleCaseMapping =
-         ( "GHC.Internal.Unicode.Char.UnicodeData.SimpleTitleCaseMapping"
-         , \m -> genSimpleCaseMappingModule m "toSimpleTitleCase" _simpleTitlecaseMapping)
-
-    -- unicode002Test =
-    --      ( "unicode002"
-    --      , genUnicode002TestResults)


=====================================
libraries/ghc-internal/tools/ucd2haskell/exe/UCD2Haskell.hs
=====================================
@@ -2,14 +2,17 @@
 -- Module      : Main
 -- Copyright   : (c) 2020 Composewell Technologies and Contributors
 -- License     : BSD-3-Clause
--- Maintainer  : streamly at composewell.com
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>"
 -- Stability   : internal
 --
 module Main where
 
-import WithCli (HasArguments(..), withCli)
-import Parser.Text (genModules)
+import qualified Data.ByteString.Char8 as B8
+import qualified Data.ByteString.Short as BS
 import GHC.Generics (Generic)
+import WithCli (HasArguments(..), withCli)
+
+import UCD2Haskell.ModuleGenerators (genModules)
 
 data CLIOptions =
     CLIOptions
@@ -20,7 +23,10 @@ data CLIOptions =
     deriving (Show, Generic, HasArguments)
 
 cliClient :: CLIOptions -> IO ()
-cliClient opts = genModules (input opts) (output opts) (core_prop opts)
+cliClient opts = genModules
+    opts.input
+    opts.output
+    (BS.toShort . B8.pack <$> opts.core_prop)
 
 main :: IO ()
 main = withCli cliClient


=====================================
libraries/ghc-internal/tools/ucd2haskell/exe/UCD2Haskell/ModuleGenerators.hs
=====================================
@@ -0,0 +1,517 @@
+-- |
+-- Module      : UCD2Haskell.ModuleGenerators
+-- Copyright   : (c) 2020 Composewell Technologies and Contributors
+--               (c) 2016-2017 Harendra Kumar
+--               (c) 2014-2015 Antonio Nikishaev
+--               (c) 2022-2024 Pierre Le Marre
+-- License     : BSD-3-Clause
+-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>"
+-- Stability   : internal
+
+-- Code history:
+--
+-- This code was adapted from https://github.com/composewell/unicode-data/
+-- (around commit c4aa52ed932ad8badf97296858932c3389b275b8) by Pierre Le Marre.
+-- The original Unicode database parser was taken from
+-- https://github.com/composewell/unicode-transforms but was completely
+-- rewritten from scratch to parse from UCD text files instead of XML, only
+-- some types remain the same. That code in turn was originally taken from
+-- https://github.com/llelf/prose (Antonio Nikishaev) and heavily modified by
+-- Harendra Kumar.
+--
+module UCD2Haskell.ModuleGenerators (genModules) where
+
+import Control.Exception (catch, IOException)
+import Data.Bits (Bits(..))
+import Data.Word (Word8)
+import Data.Char (ord)
+import Data.Functor ((<&>), ($>))
+import Data.List (intersperse, unfoldr)
+import System.Directory (createDirectoryIfMissing)
+import System.Environment (getEnv)
+import System.FilePath ((</>), (<.>))
+import Data.String (IsString)
+import Data.Foldable (Foldable(..))
+
+import qualified Data.ByteString.Builder as BB
+import qualified Data.ByteString as B
+import qualified Data.ByteString.Lazy as BL
+import qualified Data.ByteString.Short as BS
+
+import qualified Unicode.CharacterDatabase.Parser.Common as C
+import qualified Unicode.CharacterDatabase.Parser.UnicodeData as UD
+import qualified Unicode.CharacterDatabase.Parser.Properties.Multiple as P
+
+import Prelude hiding (pred)
+
+--------------------------------------------------------------------------------
+-- Helpers
+--------------------------------------------------------------------------------
+
+unlinesBB :: [BB.Builder] -> BB.Builder
+unlinesBB = (<> "\n") . mconcat . intersperse "\n"
+
+unwordsBB :: [BB.Builder] -> BB.Builder
+unwordsBB = mconcat . intersperse " "
+
+headerRule :: BB.Builder
+headerRule = "-----------------------------------------------------------------------------"
+
+mkModuleHeader :: BB.Builder -> BB.Builder
+mkModuleHeader modName =
+    unlinesBB
+        [ headerRule
+        , "-- |"
+        , "-- Module      : " <> modName
+        , "-- License     : BSD-3-Clause"
+        , "-- Maintainer  : The GHC Developers <ghc-devs at haskell.org>"
+        , "-- Stability   : internal"
+        , headerRule
+        ]
+
+genSignature :: BB.Builder -> BB.Builder
+genSignature = (<> " :: Char -> Bool")
+
+-- | Check that var is between minimum and maximum of orderList
+genRangeCheck :: BB.Builder -> [Int] -> BB.Builder
+genRangeCheck var ordList =
+    var
+        <> " >= "
+        <> BB.intDec (minimum ordList)
+        <> " && " <> var <> " <= " <> BB.intDec (maximum ordList)
+
+genBitmap :: BB.Builder -> [Int] -> BB.Builder
+genBitmap funcName ordList =
+    unlinesBB
+        [ "{-# INLINE " <> funcName <> " #-}"
+        , genSignature funcName
+        , funcName <> " = \\c -> let n = ord c in "
+              <> genRangeCheck "n" ordList <> " && lookupBit64 bitmap# n"
+        , "  where"
+        , "    bitmap# = \"" <> bitMapToAddrLiteral (positionsToBitMap ordList) "\"#"
+        ]
+
+positionsToBitMap :: [Int] -> [Bool]
+positionsToBitMap = go 0
+
+    where
+
+    go _ [] = []
+    go i xxs@(x:xs)
+        | i < x = False : go (i + 1) xxs
+        | otherwise = True : go (i + 1) xs
+
+bitMapToAddrLiteral ::
+  -- | Values to encode
+  [Bool] ->
+  -- | String to append
+  BB.Builder ->
+  BB.Builder
+bitMapToAddrLiteral bs cs = foldr encode cs (unfoldr mkChunks bs)
+
+    where
+
+    mkChunks :: [a] -> Maybe ([a], [a])
+    mkChunks [] = Nothing
+    mkChunks xs = Just $ splitAt 8 xs
+
+    encode :: [Bool] -> BB.Builder -> BB.Builder
+    encode chunk acc = BB.char7 '\\' <> BB.intDec (toByte (padTo8 chunk)) <> acc
+
+    padTo8 :: [Bool] -> [Bool]
+    padTo8 xs
+        | length xs >= 8 = xs
+        | otherwise = xs <> replicate (8 - length xs) False
+
+    toByte :: [Bool] -> Int
+    toByte xs = sum $ map (\i -> if xs !! i then 1 `shiftL` i else 0) [0..7]
+
+genEnumBitmap ::
+  forall a. (Bounded a, Enum a, Show a) =>
+  -- | Function name
+  BB.Builder ->
+  -- | Default value
+  a ->
+  -- | List of values to encode
+  [a] ->
+  BB.Builder
+genEnumBitmap funcName def as = unlinesBB
+    [ "{-# INLINE " <> funcName <> " #-}"
+    , funcName <> " :: Char -> Int"
+    , funcName <> " c = let n = ord c in if n >= "
+               <> BB.intDec (length as)
+               <> " then "
+               <> BB.intDec (fromEnum def)
+               <> " else lookup_bitmap n"
+
+    , "{-# NOINLINE lookup_bitmap #-}"
+    , "lookup_bitmap :: Int -> Int"
+    , "lookup_bitmap n = lookupIntN bitmap# n"
+    , "  where"
+    , "    bitmap# = \"" <> enumMapToAddrLiteral as "\"#"
+    ]
+
+{-| Encode a list of values as a byte map, using their 'Enum' instance.
+
+__Note:__ 'Enum' instance must respect the following:
+
+* @fromEnum minBound >= 0x00@
+* @fromEnum maxBound <= 0xff@
+-}
+enumMapToAddrLiteral ::
+  forall a. (Bounded a, Enum a, Show a) =>
+  -- | Values to encode
+  [a] ->
+  -- | String to append
+  BB.Builder ->
+  BB.Builder
+enumMapToAddrLiteral xs cs = foldr go cs xs
+
+    where
+
+    go :: a -> BB.Builder -> BB.Builder
+    go x acc = BB.char7 '\\' <> BB.word8Dec (toWord8 x) <> acc
+
+    toWord8 :: a -> Word8
+    toWord8 a = let w = fromEnum a in if 0 <= w && w <= 0xff
+        then fromIntegral w
+        else error $ "Cannot convert to Word8: " <> show a
+
+genUnicodeVersion :: FilePath -> IO ()
+genUnicodeVersion outdir = do
+  version <- catch
+              (getEnv "UNICODE_VERSION")
+              (\(_ :: IOException) -> return "<unknown>")
+  runFold f [body version]
+  where
+    moduleName :: (IsString a) => a
+    moduleName = "GHC.Internal.Unicode.Version"
+    f = moduleFileEmitter Nothing outdir
+          (moduleName, \_ -> Fold (\_ x -> x) mempty id)
+    body :: String -> BB.Builder
+    body version = unlinesBB
+      [ "{-# LANGUAGE NoImplicitPrelude #-}"
+      , "{-# OPTIONS_HADDOCK hide #-}"
+      , ""
+      , mkModuleHeader moduleName
+      , "module " <> moduleName
+      , "(unicodeVersion)"
+      , "where"
+      , ""
+      , "import {-# SOURCE #-} GHC.Internal.Data.Version"
+      , ""
+      , "-- | Version of Unicode standard used by @base@:"
+      , "-- [" <> BB.string7 version <> "](https://www.unicode.org/versions/Unicode" <> BB.string7 version <> "/)."
+      , "--"
+      , "-- @since base-4.15.0.0"
+      , "unicodeVersion :: Version"
+      , "unicodeVersion = makeVersion [" <> mkVersion version <> "]" ]
+    mkVersion = foldMap (\c -> case c of {'.' -> BB.char7 ',' <> BB.char7 ' '; _ -> BB.char7 c})
+
+--------------------------------------------------------------------------------
+-- Fold
+--------------------------------------------------------------------------------
+
+data Fold a b = forall s. Fold
+  { _step :: s -> a -> s
+  , _initial :: s
+  , _final :: s -> b }
+
+data Pair a b = Pair !a !b
+
+teeWith :: (a -> b -> c) -> Fold x a -> Fold x b -> Fold x c
+teeWith f (Fold stepL initialL finalL) (Fold stepR initialR finalR) =
+  Fold step initial final
+  where
+    step (Pair sL sR) x = Pair (stepL sL x) (stepR sR x)
+    initial = Pair initialL initialR
+    final (Pair sL sR) = f (finalL sL) (finalR sR)
+
+distribute :: [Fold a b] -> Fold a [b]
+distribute = foldr (teeWith (:)) (Fold const () (const []))
+
+rmapFold :: (b -> c) -> Fold a b -> Fold a c
+rmapFold f (Fold step initial final) = Fold step initial (f . final)
+
+runFold :: Fold a b -> [a] -> b
+runFold (Fold step initial final) = final . foldl' step initial
+
+--------------------------------------------------------------------------------
+-- Modules generators
+--------------------------------------------------------------------------------
+
+data GeneralCategoryAcc = GeneralCategoryAcc
+  { _categories :: ![UD.GeneralCategory]
+  , _expectedChar :: !Char
+  }
+
+genGeneralCategoryModule :: BB.Builder -> Fold UD.Entry BB.Builder
+genGeneralCategoryModule moduleName = Fold step initial done
+
+    where
+
+    -- (categories, expected char)
+    initial = GeneralCategoryAcc [] '\0'
+
+    step (GeneralCategoryAcc acc p) e@(UD.Entry r d)
+      | p < r.start
+      -- Fill missing char entry with default category Cn
+      -- See: https://www.unicode.org/reports/tr44/#Default_Values_Table
+      = step (GeneralCategoryAcc (replicate (ord r.start - ord p) UD.Cn <> acc) r.start) e
+      -- Regular entry
+      | otherwise = case r of
+        C.SingleChar ch -> GeneralCategoryAcc
+          (d.generalCategory : acc)
+          (succ ch)
+        C.CharRange ch1 ch2 -> GeneralCategoryAcc
+          (replicate (ord ch2 - ord ch1 + 1) d.generalCategory <> acc)
+          (succ ch2)
+
+    done (GeneralCategoryAcc acc _) = unlinesBB
+        [ "{-# LANGUAGE NoImplicitPrelude #-}"
+        , "{-# LANGUAGE MagicHash #-}"
+        , "{-# OPTIONS_HADDOCK hide #-}"
+        , ""
+        , mkModuleHeader moduleName
+        , "module " <> moduleName
+        , "(generalCategory)"
+        , "where"
+        , ""
+        , "import GHC.Internal.Base (Char, Int, Ord(..), ord)"
+        , "import GHC.Internal.Unicode.Bits (lookupIntN)"
+        , ""
+        , genEnumBitmap "generalCategory" UD.Cn (reverse acc)
+        ]
+
+genSimpleCaseMappingModule
+    :: BB.Builder
+    -> BB.Builder
+    -> (UD.CharDetails -> Maybe Char)
+    -> Fold UD.Entry BB.Builder
+genSimpleCaseMappingModule moduleName funcName field =
+    Fold step initial done
+
+    where
+
+    genHeader =
+        [ "{-# LANGUAGE NoImplicitPrelude, LambdaCase #-}"
+        , "{-# OPTIONS_HADDOCK hide #-}"
+        , ""
+        , mkModuleHeader moduleName
+        , "module " <> moduleName
+        , "(" <> funcName <> ")"
+        , "where"
+        , ""
+        , "import GHC.Internal.Base (Char)"
+        , ""
+        ]
+    genSign =
+        [ "{-# NOINLINE " <> funcName <> " #-}"
+        , funcName <> " :: Char -> Char"
+        , funcName <> " = \\case"
+        ]
+    initial = []
+
+    step ds dc = case mkEntry dc of
+        Nothing -> ds
+        Just d  -> d : ds
+
+    after = ["  c -> c"]
+
+    done st =
+        let body = mconcat [genHeader, genSign, reverse st, after]
+        in unlinesBB body
+
+    mkEntry (UD.Entry r dc) = case r of
+      C.SingleChar ch -> field dc <&> \c -> mconcat
+        [ "  '\\x"
+        , showHexChar ch
+        , "' -> '\\x"
+        , showHexChar c
+        , "'"
+        ]
+      C.CharRange{} -> field dc $> error ("genSimpleCaseMappingModule: unexpected char range: " <> show r)
+
+    showHexChar c = BB.wordHex (fromIntegral (ord c))
+
+data PropertiesAcc = PropertiesAcc
+  { _properties :: ![BS.ShortByteString]
+  , _bitmaps :: ![BB.Builder]
+  , _currentBitmap :: ![[Int]] }
+
+genCorePropertiesModule ::
+       BB.Builder -> (BS.ShortByteString -> Bool) -> Fold P.Entry BB.Builder
+genCorePropertiesModule moduleName isProp = Fold step initial done
+    where
+    prop2FuncName x = "is" <> BB.shortByteString x
+
+    initial = PropertiesAcc [] [] []
+
+    step acc@(PropertiesAcc props bitmaps bits) P.Entry{..}
+      | not (isProp property) = acc -- property filtered out
+      | otherwise = case props of
+        prop' : _
+          | prop' == property -> PropertiesAcc props bitmaps (rangeToBits range : bits)
+          | otherwise -> PropertiesAcc
+              { _properties = property : props
+              , _bitmaps = genBitmap' prop' bits : bitmaps
+              , _currentBitmap = [rangeToBits range] }
+        _ -> PropertiesAcc [property] bitmaps [rangeToBits range]
+
+    rangeToBits = \case
+      C.SingleChar ch -> [ord ch]
+      C.CharRange ch1 ch2 -> [ord ch1 .. ord ch2]
+
+    genBitmap' prop bits = genBitmap (prop2FuncName prop) (mconcat (reverse bits))
+
+    done (PropertiesAcc props bitmaps bits) = unlinesBB (header props <> bitmaps')
+      where
+        lastProp = case props of
+          prop : _ -> prop
+          [] -> error "impossible"
+        bitmaps' = genBitmap' lastProp bits : bitmaps
+
+    header exports =
+        [ "{-# LANGUAGE NoImplicitPrelude #-}"
+        , "{-# LANGUAGE MagicHash #-}"
+        , "{-# OPTIONS_HADDOCK hide #-}"
+        , ""
+        , mkModuleHeader moduleName
+        , "module " <> moduleName
+        , "(" <> unwordsBB (intersperse "," (map prop2FuncName exports)) <> ")"
+        , "where"
+        , ""
+        , "import GHC.Internal.Base (Bool, Char, Ord(..), (&&), ord)"
+        , "import GHC.Internal.Unicode.Bits (lookupBit64)"
+        , ""
+        ]
+
+--------------------------------------------------------------------------------
+-- Generation
+--------------------------------------------------------------------------------
+
+moduleToFileName :: String -> String
+moduleToFileName = map (\x -> if x == '.' then '/' else x)
+
+dirFromFileName :: String -> String
+dirFromFileName = reverse . dropWhile (/= '/') . reverse
+
+data FileRecipe a
+    = ModuleRecipe
+      -- ^ A recipe to create a Haskell module file.
+        String
+        -- ^ Module name
+        (BB.Builder -> Fold a BB.Builder)
+        -- ^ Function that generate the module, given the module name.
+    | TestOutputRecipe
+      -- ^ A recipe to create a test output file.
+        String
+        -- ^ Test name
+        (Fold a BB.Builder)
+        -- ^ Test output generator
+
+-- ModuleRecipe is a tuple of the module name and a function that generates the
+-- module using the module name
+type ModuleRecipe a = (String, BB.Builder -> Fold a BB.Builder)
+type TestOutputRecipe a = (FilePath, Fold a BB.Builder)
+
+-- GeneratorRecipe is a list of ModuleRecipe
+type GeneratorRecipe a = [FileRecipe a]
+
+moduleFileEmitter :: Maybe FilePath -> FilePath -> ModuleRecipe a -> Fold a (IO ())
+moduleFileEmitter mfile outdir (modName, fldGen) = rmapFold action $ fldGen (BB.string7 modName)
+
+    where
+
+    pretext version = case mfile of
+      Just file -> mconcat
+        [ "-- DO NOT EDIT: This file is automatically generated by the internal tool ucd2haskell,\n"
+        , "-- with data from: https://www.unicode.org/Public/"
+        , BB.string7 version
+        , "/ucd/"
+        , BB.string7 file
+        ,".\n\n"
+        ]
+      Nothing -> "-- DO NOT EDIT: This file is automatically generated by the internal tool ucd2haskell.\n\n"
+    outfile = outdir </> moduleToFileName modName <.> ".hs"
+    outfiledir = dirFromFileName outfile
+    action c = do
+        version <-
+            catch
+                (getEnv "UNICODE_VERSION")
+                (\(_ :: IOException) -> return "<unknown>")
+        createDirectoryIfMissing True outfiledir
+        B.writeFile outfile (BL.toStrict (BB.toLazyByteString (pretext version <> c)))
+
+testOutputFileEmitter :: FilePath -> TestOutputRecipe a -> Fold a (IO ())
+testOutputFileEmitter outdir (name, fldGen) = rmapFold action fldGen
+
+    where
+
+    outfile = outdir </> "tests" </> name <.> ".stdout"
+    outfiledir = dirFromFileName outfile
+    action c
+        = createDirectoryIfMissing True outfiledir
+        *> B.writeFile outfile (BL.toStrict (BB.toLazyByteString c))
+
+runGenerator ::
+       FilePath
+    -> FilePath
+    -> (B.ByteString -> [a])
+    -> FilePath
+    -> GeneratorRecipe a
+    -> IO ()
+runGenerator indir file transformLines outdir recipes = do
+    raw <- B.readFile (indir <> file)
+    sequence_ (runFold combinedFld (transformLines raw))
+
+    where
+
+    generatedFolds = recipes <&> \case
+      ModuleRecipe     name f -> moduleFileEmitter (Just file) outdir (name, f)
+      TestOutputRecipe name f -> testOutputFileEmitter         outdir (name, f)
+    combinedFld = distribute generatedFolds
+
+genModules :: FilePath -> FilePath -> [BS.ShortByteString] -> IO ()
+genModules indir outdir props = do
+    genUnicodeVersion outdir
+
+    runGenerator
+        indir
+        "UnicodeData.txt"
+        UD.parse
+        outdir
+        [ generalCategory
+        , simpleUpperCaseMapping
+        , simpleLowerCaseMapping
+        , simpleTitleCaseMapping
+        ]
+
+    runGenerator
+        indir
+        "DerivedCoreProperties.txt"
+        P.parse
+        outdir
+        [ derivedCoreProperties ]
+
+    where
+
+    derivedCoreProperties = ModuleRecipe
+        "GHC.Internal.Unicode.Char.DerivedCoreProperties"
+        (`genCorePropertiesModule` (`elem` props))
+
+    generalCategory = ModuleRecipe
+         "GHC.Internal.Unicode.Char.UnicodeData.GeneralCategory"
+         genGeneralCategoryModule
+
+    simpleUpperCaseMapping = ModuleRecipe
+         "GHC.Internal.Unicode.Char.UnicodeData.SimpleUpperCaseMapping"
+         (\m -> genSimpleCaseMappingModule m "toSimpleUpperCase" UD.simpleUpperCaseMapping)
+
+    simpleLowerCaseMapping = ModuleRecipe
+         "GHC.Internal.Unicode.Char.UnicodeData.SimpleLowerCaseMapping"
+         (\m -> genSimpleCaseMappingModule m "toSimpleLowerCase" UD.simpleLowerCaseMapping)
+
+    simpleTitleCaseMapping = ModuleRecipe
+         "GHC.Internal.Unicode.Char.UnicodeData.SimpleTitleCaseMapping"
+         (\m -> genSimpleCaseMappingModule m "toSimpleTitleCase" UD.simpleTitleCaseMapping)


=====================================
libraries/ghc-internal/tools/ucd2haskell/ucd2haskell.cabal
=====================================
@@ -1,6 +1,6 @@
 cabal-version:       2.2
 name:                ucd2haskell
-version:             0.3.0
+version:             0.4.0
 synopsis:            Converter from Unicode character database to Haskell.
 description:
   The Haskell data structures are generated programmatically from the
@@ -10,12 +10,12 @@ description:
 license:             BSD-3-Clause
 license-file:        LICENSE
 author:              Composewell Technologies and Contributors
-maintainer:          streamly at composewell.com
-copyright:           2020 Composewell Technologies and Contributors
+maintainer:          The GHC Developers <ghc-devs at haskell.org>
+bug-reports:         https://gitlab.haskell.org/ghc/ghc/-/issues/new
 category:            Data,Text,Unicode
 stability:           Experimental
 build-type:          Simple
-tested-with:         GHC==9.2.2
+tested-with:         GHC==9.8.2
 
 extra-source-files:
     README.md
@@ -23,18 +23,14 @@ extra-source-files:
 
 common default-extensions
   default-extensions:
-      BangPatterns
       DeriveGeneric
-      MagicHash
-      RecordWildCards
-      ScopedTypeVariables
-      TupleSections
-      FlexibleContexts
-
-      -- Experimental, may lead to issues
       DeriveAnyClass
-      TemplateHaskell
-      UnboxedTuples
+      ExistentialQuantification
+      LambdaCase
+      OverloadedStrings
+      OverloadedRecordDot
+      ScopedTypeVariables
+      RecordWildCards
 
 common compile-options
   ghc-options: -Wall
@@ -42,21 +38,20 @@ common compile-options
                -fwarn-incomplete-record-updates
                -fwarn-incomplete-uni-patterns
                -fwarn-tabs
-  default-language: Haskell2010
+  default-language: GHC2021
 
 executable ucd2haskell
   import: default-extensions, compile-options
-  default-language: Haskell2010
   ghc-options: -O2
   hs-source-dirs: exe
   main-is: UCD2Haskell.hs
-  other-modules: Parser.Text
+  other-modules: UCD2Haskell.ModuleGenerators
   build-depends:
-      base             >= 4.7   && < 4.20
-    , streamly-core    >= 0.2.2 && < 0.3
-    , streamly         >= 0.10   && < 0.11
-    , split            >= 0.2.3 && < 0.3
-    , getopt-generics  >= 0.13  && < 0.14
-    , containers       >= 0.5   && < 0.7
-    , directory        >= 1.3.6 && < 1.3.8
-    , filepath         >= 1.4.2 && < 1.5
+      base                >= 4.7   && < 5
+    , bytestring          >= 0.11  && < 0.13
+    , containers          >= 0.5   && < 0.7
+    , directory           >= 1.3.6 && < 1.3.8
+    , filepath            >= 1.4.2 && < 1.5
+    , getopt-generics     >= 0.13  && < 0.14
+    , split               >= 0.2.3 && < 0.3
+    , unicode-data-parser >= 0.2.0 && < 0.4


=====================================
utils/haddock/CONTRIBUTING.md
=====================================
@@ -28,6 +28,17 @@ Then, run the following command from the top-level:
 $ ./hadrian/build -j --flavour=Quick --freeze1 _build/stage1/bin/haddock
 ```
 
+### Running the test suites
+
+Currently, this cannot be done with hadrian but has to be done with a
+`cabal-install` built from `master`.
+
+```
+cabal test -w <absolute-path-to-ghc-repo>/_build/stage1/bin/ghc
+```
+
+For more details, see https://gitlab.haskell.org/ghc/ghc/-/issues/24976.
+
 ## Working with the codebase
 
 The project provides a Makefile with rules to accompany you during development:


=====================================
utils/haddock/cabal.project
=====================================
@@ -1,5 +1,3 @@
-with-compiler: ghc-9.7
-
 packages: ./
           ./haddock-api
           ./haddock-library



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/80e1935ae702345b011fb3eac43a699dadd91338...565f6192ef575d484f0448b715f09c7c83eda45e

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/80e1935ae702345b011fb3eac43a699dadd91338...565f6192ef575d484f0448b715f09c7c83eda45e
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20240613/58385bea/attachment-0001.html>


More information about the ghc-commits mailing list