[Git][ghc/ghc][wip/unboxed-codebuffer] 7 commits: docs: release notes, user guide: add js backend

Wed Feb 15 17:18:17 UTC 2023


Josh Meredith pushed to branch wip/unboxed-codebuffer at Glasgow Haskell Compiler / GHC


Commits:
08c0822c by doyougnu at 2023-02-15T00:16:39-05:00
docs: release notes, user guide: add js backend

Follow up from #21078

- - - - -
79d8fd65 by Bryan Richter at 2023-02-15T00:17:15-05:00
Allow failure in nightly-x86_64-linux-deb10-no_tntc-validate

See #22343

- - - - -
9ca51f9e by Cheng Shao at 2023-02-15T00:17:53-05:00
rts: add the rts_clearMemory function

This patch adds the rts_clearMemory function that does its best to
zero out unused RTS memory for a wasm backend use case. See the
comment above rts_clearMemory() prototype declaration for more
detailed explanation. Closes #22920.

- - - - -
18700351 by Josh Meredith at 2023-02-15T17:16:27+00:00
CodeBuffer: change to use unboxed tuples for encoders/decoders

Updates submodules for filepath and haskeline

- - - - -
f2e7f89d by Josh Meredith at 2023-02-15T17:16:27+00:00
Lint

- - - - -
2b51e003 by Josh Meredith at 2023-02-15T17:16:27+00:00
Fix build

- - - - -
edd10397 by Josh Meredith at 2023-02-15T17:16:27+00:00
Tabs

- - - - -


24 changed files:

- .gitlab/gen_ci.hs
- .gitlab/jobs.yaml
- docs/users_guide/9.6.1-notes.rst
- docs/users_guide/codegens.rst
- libraries/base/GHC/IO/Encoding.hs
- libraries/base/GHC/IO/Encoding/Failure.hs
- libraries/base/GHC/IO/Encoding/Iconv.hs
- libraries/base/GHC/IO/Encoding/Latin1.hs
- libraries/base/GHC/IO/Encoding/Types.hs
- libraries/base/GHC/IO/Encoding/UTF16.hs
- libraries/base/GHC/IO/Encoding/UTF32.hs
- libraries/base/GHC/IO/Encoding/UTF8.hs
- libraries/base/GHC/IO/Handle/Internals.hs
- libraries/filepath
- libraries/haskeline
- rts/RtsSymbols.c
- rts/include/RtsAPI.h
- rts/sm/BlockAlloc.c
- rts/sm/BlockAlloc.h
- rts/sm/NonMoving.h
- rts/sm/NonMovingSweep.c
- rts/sm/Storage.c
- rts/sm/Storage.h
- testsuite/tests/ffi/should_run/ffi023_c.c


Changes:

=====================================
.gitlab/gen_ci.hs
=====================================
@@ -713,6 +713,10 @@ modifyJobs = fmap
 modifyValidateJobs :: (a -> a) -> JobGroup a -> JobGroup a
 modifyValidateJobs f jg = jg { v = f <$> v jg }
 
+-- | Modify just the nightly jobs in a 'JobGroup'
+modifyNightlyJobs :: (a -> a) -> JobGroup a -> JobGroup a
+modifyNightlyJobs f jg = jg { n = f <$> n jg }
+
 -- Generic helpers
 
 addJobRule :: Rule -> Job -> Job
@@ -854,7 +858,9 @@ job_groups =
      , fastCI (validateBuilds Amd64 (Linux Debian10) unreg)
      , fastCI (validateBuilds Amd64 (Linux Debian10) debug)
      , modifyValidateJobs manual tsan_jobs
-     , modifyValidateJobs manual (validateBuilds Amd64 (Linux Debian10) noTntc)
+     , -- Nightly allowed to fail: #22343
+       modifyNightlyJobs allowFailure
+        (modifyValidateJobs manual (validateBuilds Amd64 (Linux Debian10) noTntc))
      , addValidateRule LLVMBackend (validateBuilds Amd64 (Linux Debian10) llvm)
 
      , disableValidate (standardBuilds Amd64 (Linux Debian11))


=====================================
.gitlab/jobs.yaml
=====================================
@@ -978,7 +978,7 @@
       ".gitlab/ci.sh clean",
       "cat ci_timings"
     ],
-    "allow_failure": false,
+    "allow_failure": true,
     "artifacts": {
       "expire_in": "8 weeks",
       "paths": [


=====================================
docs/users_guide/9.6.1-notes.rst
=====================================
@@ -132,6 +132,15 @@ Compiler
     presented in this GHC version as a technology preview, bugs and
     missing features are expected.
 
+- The JavaScript backend has been merged. GHC is now able to be built as a
+  cross-compiler targeting the JavaScript platform. The backend should be
+  considered a technology preview. As such it is not ready for use in
+  production, is not distributed in the GHC release bindists and requires the
+  user to manually build GHC as a cross-compiler. See the JavaScript backend
+  `wiki <https://gitlab.haskell.org/ghc/ghc/-/wikis/javascript-backend>`_ page
+  on the GHC wiki for the current status, project roadmap, build instructions
+  and demos.
+
 - The :extension:`TypeInType` is now marked as deprecated. Its meaning has been included
   in :extension:`PolyKinds` and :extension:`DataKinds`.
 


=====================================
docs/users_guide/codegens.rst
=====================================
@@ -95,6 +95,36 @@ was built this way. If it has then the native code generator probably
 won't be available. You can check this information by calling
 ``ghc --info`` (see :ghc-flag:`--info`).
 
+.. _javascript-code-gen:
+
+JavaScript Code Generator
+------------------------------
+
+.. index::
+   single: JavaScript code generator
+
+This is an alternative code generator included in GHC 9.6 and above. It
+generates `ECMA-262 <https://tc39.es/ecma262/>`_ compliant JavaScript and is
+included as a technical preview. At time of writing, it is being actively
+developed but is not suitable for serious projects and production environments.
+The JavaScript backend is not distributed in the GHC bindist and requires a
+manual build. See `building the JavaScript backend
+<https://gitlab.haskell.org/ghc/ghc/-/wikis/javascript-backend/building>`_ page
+on the GHC wiki for build instructions.
+
+A JavaScript cross-compiling GHC produces an executable script, and a directory
+of the same name suffixed with ``.jsexe``. For example, compiling a file named
+``Foo.hs`` will produce an executable script ``Foo`` and a ``Foo.jsexe``
+directory. The script is a thin wrapper that calls `Node.js
+<https://nodejs.org/en/>`_ on the payload of the compiled Haskell code and can
+be run in the usual way, e.g., ``./Foo``, as long as ``node`` is in your
+environment . The actual payload is in ``<ModuleName>.jsexe/all.js``, for
+example ``Foo.jsexe/all.js``. This file is the Haskell program cross-compiled to
+JavaScript *concrete syntax* and can be wrapped in a ``<script>`` HTML tag. For
+a breakdown of the rest of the build artifacts see the `compiler output
+<https://gitlab.haskell.org/ghc/ghc/-/wikis/javascript-backend/building#compiler-output-and-build-artifacts>`_
+section in the wiki.
+
 .. _unreg:
 
 Unregisterised compilation


=====================================
libraries/base/GHC/IO/Encoding.hs
=====================================
@@ -1,5 +1,6 @@
 {-# LANGUAGE Trustworthy #-}
 {-# LANGUAGE CPP, NoImplicitPrelude #-}
+{-# LANGUAGE UnboxedTuples, MagicHash #-}
 {-# OPTIONS_GHC -funbox-strict-fields #-}
 
 -----------------------------------------------------------------------------
@@ -336,11 +337,13 @@ mkTextEncoding' cfm enc =
 
 
 latin1_encode :: CharBuffer -> Buffer Word8 -> IO (CharBuffer, Buffer Word8)
-latin1_encode input output = fmap (\(_why,input',output') -> (input',output')) $ Latin1.latin1_encode input output -- unchecked, used for char8
+latin1_encode input output = IO $ \st -> case Latin1.latin1_encode input output st of
+  (# st', _why, input', output' #) -> (# st', (input', output') #) -- unchecked, used for char8
 --latin1_encode = unsafePerformIO $ do mkTextEncoder Iconv.latin1 >>= return.encode
 
 latin1_decode :: Buffer Word8 -> CharBuffer -> IO (Buffer Word8, CharBuffer)
-latin1_decode input output = fmap (\(_why,input',output') -> (input',output')) $ Latin1.latin1_decode input output
+latin1_decode input output = IO $ \st -> case Latin1.latin1_decode input output st of
+  (# st', _why, input', output' #) -> (# st', (input',output') #)
 --latin1_decode = unsafePerformIO $ do mkTextDecoder Iconv.latin1 >>= return.encode
 
 unknownEncodingErr :: String -> IO a


=====================================
libraries/base/GHC/IO/Encoding/Failure.hs
=====================================
@@ -1,5 +1,7 @@
 {-# LANGUAGE Trustworthy #-}
 {-# LANGUAGE NoImplicitPrelude #-}
+{-# LANGUAGE UnboxedTuples #-}
+{-# LANGUAGE MagicHash #-}
 
 -----------------------------------------------------------------------------
 -- |
@@ -18,7 +20,8 @@
 module GHC.IO.Encoding.Failure (
     CodingFailureMode(..), codingFailureModeSuffix,
     isSurrogate,
-    recoverDecode, recoverEncode
+    recoverDecode, recoverEncode,
+    recoverDecode#, recoverEncode#,
   ) where
 
 import GHC.IO
@@ -142,6 +145,12 @@ unescapeRoundtripCharacterSurrogate c
     | otherwise                 = Nothing
   where x = ord c
 
+recoverDecode# :: CodingFailureMode -> Buffer Word8 -> Buffer Char
+               -> State# RealWorld -> (# State# RealWorld, Buffer Word8, Buffer Char #)
+recoverDecode# cfm input output st =
+  let (# st', (bIn, bOut) #) = unIO (recoverDecode cfm input output) st
+  in (# st', bIn, bOut #)
+
 recoverDecode :: CodingFailureMode -> Buffer Word8 -> Buffer Char
               -> IO (Buffer Word8, Buffer Char)
 recoverDecode cfm input at Buffer{  bufRaw=iraw, bufL=ir, bufR=_  }
@@ -160,6 +169,12 @@ recoverDecode cfm input at Buffer{  bufRaw=iraw, bufL=ir, bufR=_  }
       ow' <- writeCharBuf oraw ow (escapeToRoundtripCharacterSurrogate b)
       return (input { bufL=ir+1 }, output { bufR=ow' })
 
+recoverEncode# :: CodingFailureMode -> Buffer Char -> Buffer Word8
+               -> State# RealWorld -> (# State# RealWorld, Buffer Char, Buffer Word8 #)
+recoverEncode# cfm input output st =
+  let (# st', (bIn, bOut) #) = unIO (recoverEncode cfm input output) st
+  in (# st', bIn, bOut #)
+
 recoverEncode :: CodingFailureMode -> Buffer Char -> Buffer Word8
               -> IO (Buffer Char, Buffer Word8)
 recoverEncode cfm input at Buffer{  bufRaw=iraw, bufL=ir, bufR=_  }


=====================================
libraries/base/GHC/IO/Encoding/Iconv.hs
=====================================
@@ -2,6 +2,8 @@
 {-# LANGUAGE CPP
            , NoImplicitPrelude
            , NondecreasingIndentation
+           , UnboxedTuples
+           , MagicHash
   #-}
 {-# OPTIONS_HADDOCK not-home #-}
 
@@ -133,19 +135,24 @@ newIConv from to rec fn =
   withCAString to   $ \ to_str -> do
     iconvt <- throwErrnoIfMinus1 "mkTextEncoding" $ hs_iconv_open to_str from_str
     let iclose = throwErrnoIfMinus1_ "Iconv.close" $ hs_iconv_close iconvt
+        fn_iconvt ibuf obuf st = case unIO (fn iconvt ibuf obuf) st of
+          (# st', (prog, ibuf', obuf') #) -> (# st', prog, ibuf', obuf' #)
     return BufferCodec{
-                encode = fn iconvt,
-                recover = rec,
-                close  = iclose,
+                encode#  = fn_iconvt,
+                recover# = rec#,
+                close    = iclose,
                 -- iconv doesn't supply a way to save/restore the state
                 getState = return (),
                 setState = const $ return ()
                 }
+  where
+    rec# ibuf obuf st = case unIO (rec ibuf obuf) st of
+      (# st', (ibuf', obuf') #) -> (# st', ibuf', obuf' #)
 
-iconvDecode :: IConv -> DecodeBuffer
+iconvDecode :: IConv -> Buffer Word8 -> Buffer Char -> IO (CodingProgress, Buffer Word8, Buffer Char)
 iconvDecode iconv_t ibuf obuf = iconvRecode iconv_t ibuf 0 obuf char_shift
 
-iconvEncode :: IConv -> EncodeBuffer
+iconvEncode :: IConv -> Buffer Char -> Buffer Word8 -> IO (CodingProgress, Buffer Char, Buffer Word8)
 iconvEncode iconv_t ibuf obuf = iconvRecode iconv_t ibuf char_shift obuf 0
 
 iconvRecode :: IConv -> Buffer a -> Int -> Buffer b -> Int


=====================================
libraries/base/GHC/IO/Encoding/Latin1.hs
=====================================
@@ -2,6 +2,8 @@
 {-# LANGUAGE NoImplicitPrelude
            , BangPatterns
            , NondecreasingIndentation
+           , UnboxedTuples
+           , MagicHash
   #-}
 {-# OPTIONS_GHC  -funbox-strict-fields #-}
 
@@ -57,8 +59,8 @@ mkLatin1 cfm = TextEncoding { textEncodingName = "ISO-8859-1",
 latin1_DF :: CodingFailureMode -> IO (TextDecoder ())
 latin1_DF cfm =
   return (BufferCodec {
-             encode   = latin1_decode,
-             recover  = recoverDecode cfm,
+             encode#  = latin1_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -67,8 +69,8 @@ latin1_DF cfm =
 latin1_EF :: CodingFailureMode -> IO (TextEncoder ())
 latin1_EF cfm =
   return (BufferCodec {
-             encode   = latin1_encode,
-             recover  = recoverEncode cfm,
+             encode#  = latin1_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -86,8 +88,8 @@ mkLatin1_checked cfm = TextEncoding { textEncodingName = "ISO-8859-1",
 latin1_checked_EF :: CodingFailureMode -> IO (TextEncoder ())
 latin1_checked_EF cfm =
   return (BufferCodec {
-             encode   = latin1_checked_encode,
-             recover  = recoverEncode cfm,
+             encode#  = latin1_checked_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -109,8 +111,8 @@ mkAscii cfm = TextEncoding { textEncodingName = "ASCII",
 ascii_DF :: CodingFailureMode -> IO (TextDecoder ())
 ascii_DF cfm =
   return (BufferCodec {
-             encode   = ascii_decode,
-             recover  = recoverDecode cfm,
+             encode#  = ascii_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -119,8 +121,8 @@ ascii_DF cfm =
 ascii_EF :: CodingFailureMode -> IO (TextEncoder ())
 ascii_EF cfm =
   return (BufferCodec {
-             encode   = ascii_encode,
-             recover  = recoverEncode cfm,
+             encode#  = ascii_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -138,65 +140,78 @@ latin1_decode :: DecodeBuffer
 latin1_decode 
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let 
-       loop !ir !ow
-         | ow >= os = done OutputUnderflow ir ow
-         | ir >= iw = done InputUnderflow ir ow
+       loop :: Int -> Int -> DecodingBuffer
+       loop !ir !ow st0
+         | ow >= os = done OutputUnderflow ir ow st0
+         | ir >= iw = done InputUnderflow ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
-              ow' <- writeCharBuf oraw ow (unsafeChr (fromIntegral c0))
-              loop (ir+1) ow'
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw ir) st0
+                  !(# st2, ow' #) = unIO (writeCharBuf oraw ow (unsafeChr (fromIntegral c0))) st1
+              loop (ir+1) ow' st2
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' =
+         let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+             !ro = output{ bufR=ow }
+         in  (# st', why, ri, ro #)
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 ascii_decode :: DecodeBuffer
 ascii_decode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-       loop !ir !ow
-         | ow >= os = done OutputUnderflow ir ow
-         | ir >= iw = done InputUnderflow ir ow
+       loop :: Int -> Int -> DecodingBuffer
+       loop !ir !ow st0
+         | ow >= os = done OutputUnderflow ir ow st0
+         | ir >= iw = done InputUnderflow ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
-              if c0 > 0x7f then invalid else do
-              ow' <- writeCharBuf oraw ow (unsafeChr (fromIntegral c0))
-              loop (ir+1) ow'
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw ir) st0
+              if c0 > 0x7f then invalid st1 else do
+                let !(# st2, ow' #) = unIO (writeCharBuf oraw ow (unsafeChr (fromIntegral c0))) st1
+                loop (ir+1) ow' st2
          where
-           invalid = done InvalidSequence ir ow
+           invalid :: DecodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' = 
+         let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+             !ro = output{ bufR=ow }
+         in  (# st', why, ri, ro #)
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 latin1_encode :: EncodeBuffer
 latin1_encode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ow >= os = done OutputUnderflow ir ow
-        | ir >= iw = done InputUnderflow ir ow
+      {-# NOINLINE done #-}
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' = 
+        let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+            !ro = output{ bufR=ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ow >= os = done OutputUnderflow ir ow st0
+        | ir >= iw = done InputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
-           writeWord8Buf oraw ow (fromIntegral (ord c))
-           loop ir' (ow+1)
+           let !(# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
+               !(# st2, () #) = unIO (writeWord8Buf oraw ow (fromIntegral (ord c))) st1
+           loop ir' (ow+1) st2
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 latin1_checked_encode :: EncodeBuffer
 latin1_checked_encode input output
@@ -210,21 +225,26 @@ single_byte_checked_encode :: Int -> EncodeBuffer
 single_byte_checked_encode max_legal_char
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ow >= os = done OutputUnderflow ir ow
-        | ir >= iw = done InputUnderflow ir ow
+      {-# NOINLINE done #-}
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' = 
+        let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+            !ro = output{ bufR=ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ow >= os = done OutputUnderflow ir ow st0
+        | ir >= iw = done InputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
-           if ord c > max_legal_char then invalid else do
-           writeWord8Buf oraw ow (fromIntegral (ord c))
-           loop ir' (ow+1)
+           let !(# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
+           if ord c > max_legal_char then invalid st1 else do
+             let (# st2, () #) = unIO (writeWord8Buf oraw ow (fromIntegral (ord c))) st1
+             loop ir' (ow+1) st2
         where
-           invalid = done InvalidSequence ir ow
+          invalid :: EncodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 {-# INLINE single_byte_checked_encode #-}


=====================================
libraries/base/GHC/IO/Encoding/Types.hs
=====================================
@@ -1,5 +1,6 @@
 {-# LANGUAGE Trustworthy #-}
 {-# LANGUAGE NoImplicitPrelude, ExistentialQuantification #-}
+{-# LANGUAGE MagicHash, UnboxedTuples #-}
 {-# OPTIONS_GHC -funbox-strict-fields #-}
 
 -----------------------------------------------------------------------------
@@ -21,7 +22,9 @@ module GHC.IO.Encoding.Types (
     TextEncoding(..),
     TextEncoder, TextDecoder,
     CodeBuffer, EncodeBuffer, DecodeBuffer,
-    CodingProgress(..)
+    EncodingBuffer, DecodingBuffer,
+    CodingProgress(..),
+    recover, encode,
   ) where
 
 import GHC.Base
@@ -34,7 +37,7 @@ import GHC.IO.Buffer
 -- Text encoders/decoders
 
 data BufferCodec from to state = BufferCodec {
-  encode :: CodeBuffer from to,
+  encode# :: CodeBuffer# from to,
    -- ^ The @encode@ function translates elements of the buffer @from@
    -- to the buffer @to at .  It should translate as many elements as possible
    -- given the sizes of the buffers, including translating zero elements
@@ -50,7 +53,7 @@ data BufferCodec from to state = BufferCodec {
    -- library in order to report translation errors at the point they
    -- actually occur, rather than when the buffer is translated.
   
-  recover :: Buffer from -> Buffer to -> IO (Buffer from, Buffer to),
+  recover# :: Buffer from -> Buffer to -> State# RealWorld -> (# State# RealWorld, Buffer from, Buffer to #),
    -- ^ The @recover@ function is used to continue decoding
    -- in the presence of invalid or unrepresentable sequences. This includes
    -- both those detected by @encode@ returning @InvalidSequence@ and those
@@ -92,9 +95,24 @@ data BufferCodec from to state = BufferCodec {
    -- call to 'getState'.
  }
 
-type CodeBuffer from to = Buffer from -> Buffer to -> IO (CodingProgress, Buffer from, Buffer to)
-type DecodeBuffer = CodeBuffer Word8 Char
-type EncodeBuffer = CodeBuffer Char Word8
+{-# INLINE encode #-}
+encode :: BufferCodec from to state -> CodeBuffer from to
+encode codec from to = IO $ \s -> case encode# codec from to s of
+  (# s', progress, from', to' #) -> (# s', (progress, from', to') #)
+
+{-# INLINE recover #-}
+recover :: BufferCodec from to state -> Buffer from -> Buffer to -> IO (Buffer from, Buffer to)
+recover codec from to = IO $ \s -> case recover# codec from to s of
+  (# s', from', to' #) -> (# s', (from', to') #)
+
+
+type CodeBuffer    from to = Buffer from -> Buffer to -> IO (CodingProgress, Buffer from, Buffer to)
+type CodingBuffer# from to = State# RealWorld -> (# State# RealWorld, CodingProgress, Buffer from, Buffer to #)
+type CodeBuffer#   from to = Buffer from -> Buffer to -> CodingBuffer# from to
+type DecodingBuffer = CodingBuffer# Word8 Char
+type DecodeBuffer   = CodeBuffer#   Word8 Char
+type EncodingBuffer = CodingBuffer# Char Word8
+type EncodeBuffer   = CodeBuffer#   Char Word8
 
 type TextDecoder state = BufferCodec Word8 CharBufElem state
 type TextEncoder state = BufferCodec CharBufElem Word8 state


=====================================
libraries/base/GHC/IO/Encoding/UTF16.hs
=====================================
@@ -3,6 +3,7 @@
            , BangPatterns
            , NondecreasingIndentation
            , MagicHash
+           , UnboxedTuples
   #-}
 {-# OPTIONS_GHC  -funbox-strict-fields #-}
 
@@ -65,8 +66,8 @@ utf16_DF :: CodingFailureMode -> IO (TextDecoder (Maybe DecodeBuffer))
 utf16_DF cfm = do
   seen_bom <- newIORef Nothing
   return (BufferCodec {
-             encode   = utf16_decode seen_bom,
-             recover  = recoverDecode cfm,
+             encode#  = utf16_decode seen_bom,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = readIORef seen_bom,
              setState = writeIORef seen_bom
@@ -76,8 +77,8 @@ utf16_EF :: CodingFailureMode -> IO (TextEncoder Bool)
 utf16_EF cfm = do
   done_bom <- newIORef False
   return (BufferCodec {
-             encode   = utf16_encode done_bom,
-             recover  = recoverEncode cfm,
+             encode#  = utf16_encode done_bom,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = readIORef done_bom,
              setState = writeIORef done_bom
@@ -86,39 +87,41 @@ utf16_EF cfm = do
 utf16_encode :: IORef Bool -> EncodeBuffer
 utf16_encode done_bom input
   output at Buffer{ bufRaw=oraw, bufL=_, bufR=ow, bufSize=os }
+  st0
  = do
-  b <- readIORef done_bom
-  if b then utf16_native_encode input output
+  let !(# st1, b #) = unIO (readIORef done_bom) st0
+  if b then utf16_native_encode input output st1
        else if os - ow < 2
-               then return (OutputUnderflow,input,output)
+               then (# st1,OutputUnderflow,input,output #)
                else do
-                    writeIORef done_bom True
-                    writeWord8Buf oraw ow     bom1
-                    writeWord8Buf oraw (ow+1) bom2
-                    utf16_native_encode input output{ bufR = ow+2 }
+               let !(# st2, () #) = unIO (writeIORef done_bom True) st1
+                   !(# st3, () #) = unIO (writeWord8Buf oraw ow     bom1) st2
+                   !(# st4, () #) = unIO (writeWord8Buf oraw (ow+1) bom2) st3
+               utf16_native_encode input output{ bufR = ow+2 } st4
 
 utf16_decode :: IORef (Maybe DecodeBuffer) -> DecodeBuffer
 utf16_decode seen_bom
   input at Buffer{  bufRaw=iraw, bufL=ir, bufR=iw,  bufSize=_  }
   output
+  st0
  = do
-   mb <- readIORef seen_bom
+   let !(# st1, mb #) = unIO (readIORef seen_bom) st0
    case mb of
-     Just decode -> decode input output
+     Just decode -> decode input output st1
      Nothing ->
-       if iw - ir < 2 then return (InputUnderflow,input,output) else do
-       c0 <- readWord8Buf iraw ir
-       c1 <- readWord8Buf iraw (ir+1)
+       if iw - ir < 2 then (# st1,InputUnderflow,input,output #) else do
+       let !(# st2, c0 #) = unIO (readWord8Buf iraw  ir   ) st1
+           !(# st3, c1 #) = unIO (readWord8Buf iraw (ir+1)) st2
        case () of
-        _ | c0 == bomB && c1 == bomL -> do
-               writeIORef seen_bom (Just utf16be_decode)
-               utf16be_decode input{ bufL= ir+2 } output
-          | c0 == bomL && c1 == bomB -> do
-               writeIORef seen_bom (Just utf16le_decode)
-               utf16le_decode input{ bufL= ir+2 } output
-          | otherwise -> do
-               writeIORef seen_bom (Just utf16_native_decode)
-               utf16_native_decode input output
+        _ | c0 == bomB && c1 == bomL ->
+               let !(# st4, () #) = unIO (writeIORef seen_bom (Just utf16be_decode)) st3
+               in utf16be_decode input{ bufL= ir+2 } output st4
+          | c0 == bomL && c1 == bomB ->
+               let !(# st4, () #) = unIO (writeIORef seen_bom (Just utf16le_decode)) st3
+               in utf16le_decode input{ bufL= ir+2 } output st4
+          | otherwise ->
+               let !(# st4, () #) = unIO (writeIORef seen_bom (Just utf16_native_decode)) st3
+               in utf16_native_decode input output st4
 
 
 bomB, bomL, bom1, bom2 :: Word8
@@ -150,8 +153,8 @@ mkUTF16be cfm = TextEncoding { textEncodingName = "UTF-16BE",
 utf16be_DF :: CodingFailureMode -> IO (TextDecoder ())
 utf16be_DF cfm =
   return (BufferCodec {
-             encode   = utf16be_decode,
-             recover  = recoverDecode cfm,
+             encode#  = utf16be_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -160,8 +163,8 @@ utf16be_DF cfm =
 utf16be_EF :: CodingFailureMode -> IO (TextEncoder ())
 utf16be_EF cfm =
   return (BufferCodec {
-             encode   = utf16be_encode,
-             recover  = recoverEncode cfm,
+             encode#  = utf16be_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -179,8 +182,8 @@ mkUTF16le cfm = TextEncoding { textEncodingName = "UTF16-LE",
 utf16le_DF :: CodingFailureMode -> IO (TextDecoder ())
 utf16le_DF cfm =
   return (BufferCodec {
-             encode   = utf16le_decode,
-             recover  = recoverDecode cfm,
+             encode#  = utf16le_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -189,8 +192,8 @@ utf16le_DF cfm =
 utf16le_EF :: CodingFailureMode -> IO (TextEncoder ())
 utf16le_EF cfm =
   return (BufferCodec {
-             encode   = utf16le_encode,
-             recover  = recoverEncode cfm,
+             encode#  = utf16le_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -201,91 +204,104 @@ utf16be_decode :: DecodeBuffer
 utf16be_decode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-       loop !ir !ow
-         | ow >= os     = done OutputUnderflow ir ow
-         | ir >= iw     = done InputUnderflow ir ow
-         | ir + 1 == iw = done InputUnderflow ir ow
+       loop !ir !ow st0
+         | ow >= os     = done OutputUnderflow ir ow st0
+         | ir >= iw     = done InputUnderflow ir ow st0
+         | ir + 1 == iw = done InputUnderflow ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
-              c1 <- readWord8Buf iraw (ir+1)
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw  ir   ) st0
+                  !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
               let x1 = fromIntegral c0 `shiftL` 8 + fromIntegral c1
               if validate1 x1
-                 then do ow' <- writeCharBuf oraw ow (unsafeChr (fromIntegral x1))
-                         loop (ir+2) ow'
-                 else if iw - ir < 4 then done InputUnderflow ir ow else do
-                      c2 <- readWord8Buf iraw (ir+2)
-                      c3 <- readWord8Buf iraw (ir+3)
-                      let x2 = fromIntegral c2 `shiftL` 8 + fromIntegral c3
-                      if not (validate2 x1 x2) then invalid else do
-                      ow' <- writeCharBuf oraw ow (chr2 x1 x2)
-                      loop (ir+4) ow'
+                 then let !(# st3, ow' #) = unIO (writeCharBuf oraw ow (unsafeChr (fromIntegral x1))) st2
+                      in loop (ir+2) ow' st3
+                 else if iw - ir < 4 then done InputUnderflow ir ow st2 else do
+                      let !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
+                          !(# st4, c3 #) = unIO (readWord8Buf iraw (ir+3)) st3
+                          x2 = fromIntegral c2 `shiftL` 8 + fromIntegral c3
+                      if not (validate2 x1 x2) then invalid st4 else do
+                      let !(# st5, ow' #) = unIO (writeCharBuf oraw ow (chr2 x1 x2)) st4
+                      loop (ir+4) ow' st5
          where
-           invalid = done InvalidSequence ir ow
+           invalid :: DecodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' =
+         let !ri = if ir == iw then input { bufL = 0, bufR = 0 } else input { bufL = ir }
+             !ro = output{ bufR = ow }
+         in  (# st', why, ri, ro #)
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 utf16le_decode :: DecodeBuffer
 utf16le_decode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-       loop !ir !ow
-         | ow >= os     = done OutputUnderflow ir ow
-         | ir >= iw     = done InputUnderflow ir ow
-         | ir + 1 == iw = done InputUnderflow ir ow
+       loop :: Int -> Int -> DecodingBuffer
+       loop !ir !ow st0
+         | ow >= os     = done OutputUnderflow ir ow st0
+         | ir >= iw     = done InputUnderflow ir ow st0
+         | ir + 1 == iw = done InputUnderflow ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
-              c1 <- readWord8Buf iraw (ir+1)
-              let x1 = fromIntegral c1 `shiftL` 8 + fromIntegral c0
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw  ir   ) st0
+                  !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                  x1 = fromIntegral c1 `shiftL` 8 + fromIntegral c0
               if validate1 x1
-                 then do ow' <- writeCharBuf oraw ow (unsafeChr (fromIntegral x1))
-                         loop (ir+2) ow'
-                 else if iw - ir < 4 then done InputUnderflow ir ow else do
-                      c2 <- readWord8Buf iraw (ir+2)
-                      c3 <- readWord8Buf iraw (ir+3)
-                      let x2 = fromIntegral c3 `shiftL` 8 + fromIntegral c2
-                      if not (validate2 x1 x2) then invalid else do
-                      ow' <- writeCharBuf oraw ow (chr2 x1 x2)
-                      loop (ir+4) ow'
+                 then let !(# st3, ow' #) = unIO (writeCharBuf oraw ow (unsafeChr (fromIntegral x1))) st2
+                      in loop (ir+2) ow' st3
+                 else if iw - ir < 4 then done InputUnderflow ir ow st2 else do
+                      let !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
+                          !(# st4, c3 #) = unIO (readWord8Buf iraw (ir+3)) st3
+                          x2 = fromIntegral c3 `shiftL` 8 + fromIntegral c2
+                      if not (validate2 x1 x2) then invalid st4 else do
+                      let !(# st5, ow' #) = unIO (writeCharBuf oraw ow (chr2 x1 x2)) st4
+                      loop (ir+4) ow' st5
          where
-           invalid = done InvalidSequence ir ow
+           invalid :: DecodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' =
+         let !ri = if ir == iw then input{ bufL = 0, bufR = 0 } else input{ bufL = ir }
+             !ro = output{ bufR = ow }
+         in  (# st', why, ri, ro #)
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 utf16be_encode :: EncodeBuffer
 utf16be_encode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ir >= iw     =  done InputUnderflow ir ow
-        | os - ow < 2  =  done OutputUnderflow ir ow
+      {-# NOINLINE done #-}
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' =
+        let !ri = if ir == iw then input{ bufL = 0, bufR = 0 } else input{ bufL=ir }
+            !ro = output{ bufR=ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ir >= iw     =  done InputUnderflow ir ow st0
+        | os - ow < 2  =  done OutputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
+           let !(# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
            case ord c of
-             x | x < 0x10000 -> if isSurrogate c then done InvalidSequence ir ow else do
-                    writeWord8Buf oraw ow     (fromIntegral (x `shiftR` 8))
-                    writeWord8Buf oraw (ow+1) (fromIntegral x)
-                    loop ir' (ow+2)
+             x | x < 0x10000 -> if isSurrogate c then done InvalidSequence ir ow st1 else do
+                    let !(# st2, () #) = unIO (writeWord8Buf oraw ow     (fromIntegral (x `shiftR` 8))) st1
+                        !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) (fromIntegral x))              st2
+                    loop ir' (ow+2) st3
                | otherwise -> do
-                    if os - ow < 4 then done OutputUnderflow ir ow else do
+                    if os - ow < 4 then done OutputUnderflow ir ow st1 else do
                     let
                          n1 = x - 0x10000
                          c1 = fromIntegral (n1 `shiftR` 18 + 0xD8)
@@ -294,35 +310,39 @@ utf16be_encode
                          c3 = fromIntegral (n2 `shiftR` 8 + 0xDC)
                          c4 = fromIntegral n2
                     --
-                    writeWord8Buf oraw ow     c1
-                    writeWord8Buf oraw (ow+1) c2
-                    writeWord8Buf oraw (ow+2) c3
-                    writeWord8Buf oraw (ow+3) c4
-                    loop ir' (ow+4)
+                         !(# st2, () #) = unIO (writeWord8Buf oraw ow     c1) st1
+                         !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c2) st2
+                         !(# st4, () #) = unIO (writeWord8Buf oraw (ow+2) c3) st3
+                         !(# st5, () #) = unIO (writeWord8Buf oraw (ow+3) c4) st4
+                    loop ir' (ow+4) st5
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 utf16le_encode :: EncodeBuffer
 utf16le_encode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ir >= iw     =  done InputUnderflow ir ow
-        | os - ow < 2  =  done OutputUnderflow ir ow
+      {-# NOINLINE done #-}
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' =
+        let !ri = if ir == iw then input{ bufL = 0, bufR = 0 } else input{ bufL = ir }
+            !ro = output{ bufR = ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ir >= iw     =  done InputUnderflow ir ow st0
+        | os - ow < 2  =  done OutputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
+           let !(# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
            case ord c of
-             x | x < 0x10000 -> if isSurrogate c then done InvalidSequence ir ow else do
-                    writeWord8Buf oraw ow     (fromIntegral x)
-                    writeWord8Buf oraw (ow+1) (fromIntegral (x `shiftR` 8))
-                    loop ir' (ow+2)
+             x | x < 0x10000 -> if isSurrogate c then done InvalidSequence ir ow st1 else do
+                    let !(# st2, () #) = unIO (writeWord8Buf oraw ow     (fromIntegral x)) st1
+                        !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) (fromIntegral (x `shiftR` 8))) st2
+                    loop ir' (ow+2) st3
                | otherwise ->
-                    if os - ow < 4 then done OutputUnderflow ir ow else do
+                    if os - ow < 4 then done OutputUnderflow ir ow st1 else do
                     let
                          n1 = x - 0x10000
                          c1 = fromIntegral (n1 `shiftR` 18 + 0xD8)
@@ -331,13 +351,13 @@ utf16le_encode
                          c3 = fromIntegral (n2 `shiftR` 8 + 0xDC)
                          c4 = fromIntegral n2
                     --
-                    writeWord8Buf oraw ow     c2
-                    writeWord8Buf oraw (ow+1) c1
-                    writeWord8Buf oraw (ow+2) c4
-                    writeWord8Buf oraw (ow+3) c3
-                    loop ir' (ow+4)
+                         !(# st2, () #) = unIO (writeWord8Buf oraw ow     c2) st1
+                         !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c1) st2
+                         !(# st4, () #) = unIO (writeWord8Buf oraw (ow+2) c4) st3
+                         !(# st5, () #) = unIO (writeWord8Buf oraw (ow+3) c3) st4
+                    loop ir' (ow+4) st5
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 chr2 :: Word16 -> Word16 -> Char
 chr2 (W16# a#) (W16# b#) = C# (chr# (upper# +# lower# +# 0x10000#))


=====================================
libraries/base/GHC/IO/Encoding/UTF32.hs
=====================================
@@ -3,6 +3,7 @@
            , BangPatterns
            , NondecreasingIndentation
            , MagicHash
+           , UnboxedTuples
   #-}
 {-# OPTIONS_GHC  -funbox-strict-fields #-}
 
@@ -65,8 +66,8 @@ utf32_DF :: CodingFailureMode -> IO (TextDecoder (Maybe DecodeBuffer))
 utf32_DF cfm = do
   seen_bom <- newIORef Nothing
   return (BufferCodec {
-             encode   = utf32_decode seen_bom,
-             recover  = recoverDecode cfm,
+             encode#  = utf32_decode seen_bom,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = readIORef seen_bom,
              setState = writeIORef seen_bom
@@ -76,8 +77,8 @@ utf32_EF :: CodingFailureMode -> IO (TextEncoder Bool)
 utf32_EF cfm = do
   done_bom <- newIORef False
   return (BufferCodec {
-             encode   = utf32_encode done_bom,
-             recover  = recoverEncode cfm,
+             encode#  = utf32_encode done_bom,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = readIORef done_bom,
              setState = writeIORef done_bom
@@ -86,43 +87,45 @@ utf32_EF cfm = do
 utf32_encode :: IORef Bool -> EncodeBuffer
 utf32_encode done_bom input
   output at Buffer{ bufRaw=oraw, bufL=_, bufR=ow, bufSize=os }
+  st0
  = do
-  b <- readIORef done_bom
-  if b then utf32_native_encode input output
+  let !(# st1, b #) = unIO (readIORef done_bom) st0
+  if b then utf32_native_encode input output st1
        else if os - ow < 4
-               then return (OutputUnderflow, input,output)
+               then (# st1,OutputUnderflow,input,output #)
                else do
-                    writeIORef done_bom True
-                    writeWord8Buf oraw ow     bom0
-                    writeWord8Buf oraw (ow+1) bom1
-                    writeWord8Buf oraw (ow+2) bom2
-                    writeWord8Buf oraw (ow+3) bom3
-                    utf32_native_encode input output{ bufR = ow+4 }
+               let !(# st2, () #) = unIO (writeIORef done_bom True) st1
+                   !(# st3, () #) = unIO (writeWord8Buf oraw ow     bom0) st2
+                   !(# st4, () #) = unIO (writeWord8Buf oraw (ow+1) bom1) st3
+                   !(# st5, () #) = unIO (writeWord8Buf oraw (ow+2) bom2) st4
+                   !(# st6, () #) = unIO (writeWord8Buf oraw (ow+3) bom3) st5
+               utf32_native_encode input output{ bufR = ow+4 } st6
 
 utf32_decode :: IORef (Maybe DecodeBuffer) -> DecodeBuffer
 utf32_decode seen_bom
   input at Buffer{  bufRaw=iraw, bufL=ir, bufR=iw,  bufSize=_  }
   output
+  st0
  = do
-   mb <- readIORef seen_bom
+   let !(# st1, mb #) = unIO (readIORef seen_bom) st0
    case mb of
-     Just decode -> decode input output
+     Just decode -> decode input output st1
      Nothing ->
-       if iw - ir < 4 then return (InputUnderflow, input,output) else do
-       c0 <- readWord8Buf iraw ir
-       c1 <- readWord8Buf iraw (ir+1)
-       c2 <- readWord8Buf iraw (ir+2)
-       c3 <- readWord8Buf iraw (ir+3)
+       if iw - ir < 4 then (# st1,InputUnderflow,input,output #) else do
+       let !(# st2, c0 #) = unIO (readWord8Buf iraw  ir   ) st1
+           !(# st3, c1 #) = unIO (readWord8Buf iraw (ir+1)) st2
+           !(# st4, c2 #) = unIO (readWord8Buf iraw (ir+2)) st3
+           !(# st5, c3 #) = unIO (readWord8Buf iraw (ir+3)) st4
        case () of
-        _ | c0 == bom0 && c1 == bom1 && c2 == bom2 && c3 == bom3 -> do
-               writeIORef seen_bom (Just utf32be_decode)
-               utf32be_decode input{ bufL= ir+4 } output
-        _ | c0 == bom3 && c1 == bom2 && c2 == bom1 && c3 == bom0 -> do
-               writeIORef seen_bom (Just utf32le_decode)
-               utf32le_decode input{ bufL= ir+4 } output
-          | otherwise -> do
-               writeIORef seen_bom (Just utf32_native_decode)
-               utf32_native_decode input output
+        _ | c0 == bom0 && c1 == bom1 && c2 == bom2 && c3 == bom3 ->
+               let !(# st6, () #) = unIO (writeIORef seen_bom (Just utf32be_decode)) st5
+               in utf32be_decode input{ bufL= ir+4 } output st6
+        _ | c0 == bom3 && c1 == bom2 && c2 == bom1 && c3 == bom0 ->
+               let !(# st6, () #) = unIO (writeIORef seen_bom (Just utf32le_decode)) st5
+               in utf32le_decode input{ bufL= ir+4 } output st6
+          | otherwise ->
+               let !(# st6, () #) = unIO (writeIORef seen_bom (Just utf32_native_decode)) st5
+               in utf32_native_decode input output st6
 
 
 bom0, bom1, bom2, bom3 :: Word8
@@ -153,8 +156,8 @@ mkUTF32be cfm = TextEncoding { textEncodingName = "UTF-32BE",
 utf32be_DF :: CodingFailureMode -> IO (TextDecoder ())
 utf32be_DF cfm =
   return (BufferCodec {
-             encode   = utf32be_decode,
-             recover  = recoverDecode cfm,
+             encode#  = utf32be_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -163,8 +166,8 @@ utf32be_DF cfm =
 utf32be_EF :: CodingFailureMode -> IO (TextEncoder ())
 utf32be_EF cfm =
   return (BufferCodec {
-             encode   = utf32be_encode,
-             recover  = recoverEncode cfm,
+             encode#  = utf32be_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -183,8 +186,8 @@ mkUTF32le cfm = TextEncoding { textEncodingName = "UTF-32LE",
 utf32le_DF :: CodingFailureMode -> IO (TextDecoder ())
 utf32le_DF cfm =
   return (BufferCodec {
-             encode   = utf32le_decode,
-             recover  = recoverDecode cfm,
+             encode#  = utf32le_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -193,8 +196,8 @@ utf32le_DF cfm =
 utf32le_EF :: CodingFailureMode -> IO (TextEncoder ())
 utf32le_EF cfm =
   return (BufferCodec {
-             encode   = utf32le_encode,
-             recover  = recoverEncode cfm,
+             encode#  = utf32le_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -205,105 +208,122 @@ utf32be_decode :: DecodeBuffer
 utf32be_decode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-       loop !ir !ow
-         | ow >= os    = done OutputUnderflow ir ow
-         | iw - ir < 4 = done InputUnderflow  ir ow
+       loop :: Int -> Int -> DecodingBuffer
+       loop !ir !ow st0
+         | ow >= os    = done OutputUnderflow ir ow st0
+         | iw - ir < 4 = done InputUnderflow  ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
-              c1 <- readWord8Buf iraw (ir+1)
-              c2 <- readWord8Buf iraw (ir+2)
-              c3 <- readWord8Buf iraw (ir+3)
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw ir    ) st0
+                  !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                  !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
+                  !(# st4, c3 #) = unIO (readWord8Buf iraw (ir+3)) st3
               let x1 = chr4 c0 c1 c2 c3
-              if not (validate x1) then invalid else do
-              ow' <- writeCharBuf oraw ow x1
-              loop (ir+4) ow'
+              if not (validate x1) then invalid st4 else do
+              let !(# st5, ow' #) = unIO (writeCharBuf oraw ow x1) st4
+              loop (ir+4) ow' st5
          where
-           invalid = done InvalidSequence ir ow
+           invalid :: DecodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' =
+         let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+             !ro = output{ bufR=ow }
+         in  (# st', why, ri, ro #)
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 utf32le_decode :: DecodeBuffer
 utf32le_decode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-       loop !ir !ow
-         | ow >= os    = done OutputUnderflow ir ow
-         | iw - ir < 4 = done InputUnderflow  ir ow
+       loop :: Int -> Int -> DecodingBuffer
+       loop !ir !ow st0
+         | ow >= os    = done OutputUnderflow ir ow st0
+         | iw - ir < 4 = done InputUnderflow  ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
-              c1 <- readWord8Buf iraw (ir+1)
-              c2 <- readWord8Buf iraw (ir+2)
-              c3 <- readWord8Buf iraw (ir+3)
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw ir    ) st0
+                  !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                  !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
+                  !(# st4, c3 #) = unIO (readWord8Buf iraw (ir+3)) st3
               let x1 = chr4 c3 c2 c1 c0
-              if not (validate x1) then invalid else do
-              ow' <- writeCharBuf oraw ow x1
-              loop (ir+4) ow'
+              if not (validate x1) then invalid st4 else do
+              let (# st5, ow' #) = unIO (writeCharBuf oraw ow x1) st4
+              loop (ir+4) ow' st5
          where
-           invalid = done InvalidSequence ir ow
+           invalid :: DecodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' =
+         let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+             !ro = output{ bufR=ow }
+         in  (# st', why, ri, ro #)
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 utf32be_encode :: EncodeBuffer
 utf32be_encode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ir >= iw    = done InputUnderflow  ir ow
-        | os - ow < 4 = done OutputUnderflow ir ow
+      {-# NOINLINE done #-}
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' =
+        let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+            !ro = output{ bufR=ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ir >= iw    = done InputUnderflow  ir ow st0
+        | os - ow < 4 = done OutputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
-           if isSurrogate c then done InvalidSequence ir ow else do
+           let !(# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
+           if isSurrogate c then done InvalidSequence ir ow st1 else do
              let (c0,c1,c2,c3) = ord4 c
-             writeWord8Buf oraw ow     c0
-             writeWord8Buf oraw (ow+1) c1
-             writeWord8Buf oraw (ow+2) c2
-             writeWord8Buf oraw (ow+3) c3
-             loop ir' (ow+4)
+                 !(# st2, () #) = unIO (writeWord8Buf oraw ow     c0) st1
+                 !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c1) st2
+                 !(# st4, () #) = unIO (writeWord8Buf oraw (ow+2) c2) st3
+                 !(# st5, () #) = unIO (writeWord8Buf oraw (ow+3) c3) st4
+             loop ir' (ow+4) st5
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 utf32le_encode :: EncodeBuffer
 utf32le_encode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ir >= iw    = done InputUnderflow  ir ow
-        | os - ow < 4 = done OutputUnderflow ir ow
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' =
+        let !ri = if ir == iw then input{ bufL=0, bufR=0 } else input{ bufL=ir }
+            !ro = output{ bufR=ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ir >= iw    = done InputUnderflow  ir ow st0
+        | os - ow < 4 = done OutputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
-           if isSurrogate c then done InvalidSequence ir ow else do
+           let !(# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
+           if isSurrogate c then done InvalidSequence ir ow st1 else do
              let (c0,c1,c2,c3) = ord4 c
-             writeWord8Buf oraw ow     c3
-             writeWord8Buf oraw (ow+1) c2
-             writeWord8Buf oraw (ow+2) c1
-             writeWord8Buf oraw (ow+3) c0
-             loop ir' (ow+4)
+                 !(# st2, () #) = unIO (writeWord8Buf oraw ow     c3) st1
+                 !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c2) st2
+                 !(# st4, () #) = unIO (writeWord8Buf oraw (ow+2) c1) st3
+                 !(# st5, () #) = unIO (writeWord8Buf oraw (ow+3) c0) st4
+             loop ir' (ow+4) st5
     in
-    loop ir0 ow0
+    loop ir0 ow0 st
 
 chr4 :: Word8 -> Word8 -> Word8 -> Word8 -> Char
 chr4 (W8# x1#) (W8# x2#) (W8# x3#) (W8# x4#) =


=====================================
libraries/base/GHC/IO/Encoding/UTF8.hs
=====================================
@@ -3,6 +3,7 @@
            , BangPatterns
            , NondecreasingIndentation
            , MagicHash
+           , UnboxedTuples
   #-}
 {-# OPTIONS_GHC -funbox-strict-fields #-}
 
@@ -57,8 +58,8 @@ mkUTF8 cfm = TextEncoding { textEncodingName = "UTF-8",
 utf8_DF :: CodingFailureMode -> IO (TextDecoder ())
 utf8_DF cfm =
   return (BufferCodec {
-             encode   = utf8_decode,
-             recover  = recoverDecode cfm,
+             encode#  = utf8_decode,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -67,8 +68,8 @@ utf8_DF cfm =
 utf8_EF :: CodingFailureMode -> IO (TextEncoder ())
 utf8_EF cfm =
   return (BufferCodec {
-             encode   = utf8_encode,
-             recover  = recoverEncode cfm,
+             encode#  = utf8_encode,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = return (),
              setState = const $ return ()
@@ -86,8 +87,8 @@ utf8_bom_DF :: CodingFailureMode -> IO (TextDecoder Bool)
 utf8_bom_DF cfm = do
    ref <- newIORef True
    return (BufferCodec {
-             encode   = utf8_bom_decode ref,
-             recover  = recoverDecode cfm,
+             encode#  = utf8_bom_decode ref,
+             recover# = recoverDecode# cfm,
              close    = return (),
              getState = readIORef ref,
              setState = writeIORef ref
@@ -97,8 +98,8 @@ utf8_bom_EF :: CodingFailureMode -> IO (TextEncoder Bool)
 utf8_bom_EF cfm = do
    ref <- newIORef True
    return (BufferCodec {
-             encode   = utf8_bom_encode ref,
-             recover  = recoverEncode cfm,
+             encode#  = utf8_bom_encode ref,
+             recover# = recoverEncode# cfm,
              close    = return (),
              getState = readIORef ref,
              setState = writeIORef ref
@@ -108,39 +109,41 @@ utf8_bom_decode :: IORef Bool -> DecodeBuffer
 utf8_bom_decode ref
   input at Buffer{  bufRaw=iraw, bufL=ir, bufR=iw,  bufSize=_  }
   output
+  st0
  = do
-   first <- readIORef ref
+   let (# st1, first #) = unIO (readIORef ref) st0
    if not first
-      then utf8_decode input output
+      then utf8_decode input output st1
       else do
-       let no_bom = do writeIORef ref False; utf8_decode input output
-       if iw - ir < 1 then return (InputUnderflow,input,output) else do
-       c0 <- readWord8Buf iraw ir
+       let no_bom = let !(# st', () #) = unIO (writeIORef ref False) st1 in utf8_decode input output st'
+       if iw - ir < 1 then (# st1,InputUnderflow,input,output #) else do
+       let !(# st2, c0 #) = unIO (readWord8Buf iraw ir) st1
        if (c0 /= bom0) then no_bom else do
-       if iw - ir < 2 then return (InputUnderflow,input,output) else do
-       c1 <- readWord8Buf iraw (ir+1)
+       if iw - ir < 2 then (# st2,InputUnderflow,input,output #) else do
+       let !(# st3, c1 #) = unIO (readWord8Buf iraw (ir+1)) st2
        if (c1 /= bom1) then no_bom else do
-       if iw - ir < 3 then return (InputUnderflow,input,output) else do
-       c2 <- readWord8Buf iraw (ir+2)
+       if iw - ir < 3 then (# st3,InputUnderflow,input,output #) else do
+       let !(# st4, c2 #) = unIO (readWord8Buf iraw (ir+2)) st3
        if (c2 /= bom2) then no_bom else do
        -- found a BOM, ignore it and carry on
-       writeIORef ref False
-       utf8_decode input{ bufL = ir + 3 } output
+       let !(# st5, () #) = unIO (writeIORef ref False) st4
+       utf8_decode input{ bufL = ir + 3 } output st5
 
 utf8_bom_encode :: IORef Bool -> EncodeBuffer
 utf8_bom_encode ref input
   output at Buffer{ bufRaw=oraw, bufL=_, bufR=ow, bufSize=os }
+  st0
  = do
-  b <- readIORef ref
-  if not b then utf8_encode input output
+  let !(# st1, b #) = unIO (readIORef ref) st0
+  if not b then utf8_encode input output st1
            else if os - ow < 3
-                  then return (OutputUnderflow,input,output)
+                  then (# st1,OutputUnderflow,input,output #)
                   else do
-                    writeIORef ref False
-                    writeWord8Buf oraw ow     bom0
-                    writeWord8Buf oraw (ow+1) bom1
-                    writeWord8Buf oraw (ow+2) bom2
-                    utf8_encode input output{ bufR = ow+3 }
+                    let !(# st2, () #) = unIO (writeIORef ref False)           st1
+                        !(# st3, () #) = unIO (writeWord8Buf oraw ow     bom0) st2
+                        !(# st4, () #) = unIO (writeWord8Buf oraw (ow+1) bom1) st3
+                        !(# st5, () #) = unIO (writeWord8Buf oraw (ow+2) bom2) st4
+                    utf8_encode input output{ bufR = ow+3 } st5
 
 bom0, bom1, bom2 :: Word8
 bom0 = 0xef
@@ -151,111 +154,120 @@ utf8_decode :: DecodeBuffer
 utf8_decode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-       loop !ir !ow
-         | ow >= os = done OutputUnderflow ir ow
-         | ir >= iw = done InputUnderflow ir ow
+       loop :: Int -> Int -> DecodingBuffer
+       loop !ir !ow st0
+         | ow >= os = done OutputUnderflow ir ow st0
+         | ir >= iw = done InputUnderflow ir ow st0
          | otherwise = do
-              c0 <- readWord8Buf iraw ir
+              let !(# st1, c0 #) = unIO (readWord8Buf iraw ir) st0
               case c0 of
                 _ | c0 <= 0x7f -> do
-                           ow' <- writeCharBuf oraw ow (unsafeChr (fromIntegral c0))
-                           loop (ir+1) ow'
-                  | c0 >= 0xc0 && c0 <= 0xc1 -> invalid -- Overlong forms
+                           let !(# st2, ow' #) = unIO (writeCharBuf oraw ow (unsafeChr (fromIntegral c0))) st1
+                           loop (ir+1) ow' st2
+                  | c0 >= 0xc0 && c0 <= 0xc1 -> invalid st1 -- Overlong forms
                   | c0 >= 0xc2 && c0 <= 0xdf ->
-                           if iw - ir < 2 then done InputUnderflow ir ow else do
-                           c1 <- readWord8Buf iraw (ir+1)
-                           if (c1 < 0x80 || c1 >= 0xc0) then invalid else do
-                           ow' <- writeCharBuf oraw ow (chr2 c0 c1)
-                           loop (ir+2) ow'
+                           if iw - ir < 2 then done InputUnderflow ir ow st1 else do
+                           let !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                           if (c1 < 0x80 || c1 >= 0xc0) then invalid st2 else do
+                           let !(# st3, ow' #) = unIO (writeCharBuf oraw ow (chr2 c0 c1)) st2
+                           loop (ir+2) ow' st3
                   | c0 >= 0xe0 && c0 <= 0xef ->
                       case iw - ir of
-                        1 -> done InputUnderflow ir ow
+                        1 -> done InputUnderflow ir ow st1
                         2 -> do -- check for an error even when we don't have
                                 -- the full sequence yet (#3341)
-                           c1 <- readWord8Buf iraw (ir+1)
+                           let !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
                            if not (validate3 c0 c1 0x80)
-                              then invalid else done InputUnderflow ir ow
+                              then invalid st2 else done InputUnderflow ir ow st2
                         _ -> do
-                           c1 <- readWord8Buf iraw (ir+1)
-                           c2 <- readWord8Buf iraw (ir+2)
-                           if not (validate3 c0 c1 c2) then invalid else do
-                           ow' <- writeCharBuf oraw ow (chr3 c0 c1 c2)
-                           loop (ir+3) ow'
+                           let !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                           let !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
+                           if not (validate3 c0 c1 c2) then invalid st3 else do
+                           let !(# st4, ow' #) = unIO (writeCharBuf oraw ow (chr3 c0 c1 c2)) st3
+                           loop (ir+3) ow' st4
                   | c0 >= 0xf0 ->
                       case iw - ir of
-                        1 -> done InputUnderflow ir ow
+                        1 -> done InputUnderflow ir ow st1
                         2 -> do -- check for an error even when we don't have
                                 -- the full sequence yet (#3341)
-                           c1 <- readWord8Buf iraw (ir+1)
+                           let (# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
                            if not (validate4 c0 c1 0x80 0x80)
-                              then invalid else done InputUnderflow ir ow
+                              then invalid st2 else done InputUnderflow ir ow st2
                         3 -> do
-                           c1 <- readWord8Buf iraw (ir+1)
-                           c2 <- readWord8Buf iraw (ir+2)
+                           let !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                               !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
                            if not (validate4 c0 c1 c2 0x80)
-                              then invalid else done InputUnderflow ir ow
+                              then invalid st3 else done InputUnderflow ir ow st3
                         _ -> do
-                           c1 <- readWord8Buf iraw (ir+1)
-                           c2 <- readWord8Buf iraw (ir+2)
-                           c3 <- readWord8Buf iraw (ir+3)
-                           if not (validate4 c0 c1 c2 c3) then invalid else do
-                           ow' <- writeCharBuf oraw ow (chr4 c0 c1 c2 c3)
-                           loop (ir+4) ow'
+                           let !(# st2, c1 #) = unIO (readWord8Buf iraw (ir+1)) st1
+                               !(# st3, c2 #) = unIO (readWord8Buf iraw (ir+2)) st2
+                               !(# st4, c3 #) = unIO (readWord8Buf iraw (ir+3)) st3
+                           if not (validate4 c0 c1 c2 c3) then invalid st4 else do
+                           let !(# st5, ow' #) = unIO (writeCharBuf oraw ow (chr4 c0 c1 c2 c3)) st4
+                           loop (ir+4) ow' st5
                   | otherwise ->
-                           invalid
+                           invalid st1
          where
-           invalid = done InvalidSequence ir ow
+           invalid :: DecodingBuffer
+           invalid st' = done InvalidSequence ir ow st'
 
        -- lambda-lifted, to avoid thunks being built in the inner-loop:
-       done why !ir !ow = return (why,
-                                  if ir == iw then input{ bufL=0, bufR=0 }
-                                              else input{ bufL=ir },
-                                  output{ bufR=ow })
+       {-# NOINLINE done #-}
+       done :: CodingProgress -> Int -> Int -> DecodingBuffer
+       done why !ir !ow st' =
+         let !ri = if ir == iw then input{ bufL = 0, bufR = 0} else input{ bufL = ir }
+             !ro = output { bufR = ow }
+         in (# st', why, ri, ro #)
    in
-   loop ir0 ow0
+   loop ir0 ow0 st
 
 utf8_encode :: EncodeBuffer
 utf8_encode
   input at Buffer{  bufRaw=iraw, bufL=ir0, bufR=iw,  bufSize=_  }
   output at Buffer{ bufRaw=oraw, bufL=_,   bufR=ow0, bufSize=os }
+  st
  = let
-      done why !ir !ow = return (why,
-                                 if ir == iw then input{ bufL=0, bufR=0 }
-                                             else input{ bufL=ir },
-                                 output{ bufR=ow })
-      loop !ir !ow
-        | ow >= os = done OutputUnderflow ir ow
-        | ir >= iw = done InputUnderflow ir ow
+      {-# NOINLINE done #-}
+      done :: CodingProgress -> Int -> Int -> EncodingBuffer
+      done why !ir !ow st' =
+        let !ri = if ir == iw then input{ bufL = 0, bufR = 0 } else input{ bufL = ir }
+            !ro = output{ bufR = ow }
+        in  (# st', why, ri, ro #)
+      loop :: Int -> Int -> EncodingBuffer
+      loop !ir !ow st0
+        | ow >= os = done OutputUnderflow ir ow st0
+        | ir >= iw = done InputUnderflow ir ow st0
         | otherwise = do
-           (c,ir') <- readCharBuf iraw ir
+           let (# st1, (c,ir') #) = unIO (readCharBuf iraw ir) st0
            case ord c of
              x | x <= 0x7F   -> do
-                    writeWord8Buf oraw ow (fromIntegral x)
-                    loop ir' (ow+1)
+                     let !(# st2, () #) = unIO (writeWord8Buf oraw ow (fromIntegral x)) st1
+                     loop ir' (ow+1) st2
                | x <= 0x07FF ->
-                    if os - ow < 2 then done OutputUnderflow ir ow else do
+                    if os - ow < 2 then done OutputUnderflow ir ow st1 else do
                     let (c1,c2) = ord2 c
-                    writeWord8Buf oraw ow     c1
-                    writeWord8Buf oraw (ow+1) c2
-                    loop ir' (ow+2)
-               | x <= 0xFFFF -> if isSurrogate c then done InvalidSequence ir ow else do
-                    if os - ow < 3 then done OutputUnderflow ir ow else do
+                        !(# st2, () #) = unIO (writeWord8Buf oraw ow     c1) st1
+                        !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c2) st2
+                    loop ir' (ow+2) st3
+               | x <= 0xFFFF -> if isSurrogate c then done InvalidSequence ir ow st1 else do
+                    if os - ow < 3 then done OutputUnderflow ir ow st1 else do
                     let (c1,c2,c3) = ord3 c
-                    writeWord8Buf oraw ow     c1
-                    writeWord8Buf oraw (ow+1) c2
-                    writeWord8Buf oraw (ow+2) c3
-                    loop ir' (ow+3)
+                        !(# st2, () #) = unIO (writeWord8Buf oraw ow     c1) st1
+                        !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c2) st2
+                        !(# st4, () #) = unIO (writeWord8Buf oraw (ow+2) c3) st3
+                    loop ir' (ow+3) st4
                | otherwise -> do
-                    if os - ow < 4 then done OutputUnderflow ir ow else do
+                    if os - ow < 4 then done OutputUnderflow ir ow st1 else do
                     let (c1,c2,c3,c4) = ord4 c
-                    writeWord8Buf oraw ow     c1
-                    writeWord8Buf oraw (ow+1) c2
-                    writeWord8Buf oraw (ow+2) c3
-                    writeWord8Buf oraw (ow+3) c4
-                    loop ir' (ow+4)
+                        !(# st2, () #) = unIO (writeWord8Buf oraw ow     c1) st1
+                        !(# st3, () #) = unIO (writeWord8Buf oraw (ow+1) c2) st2
+                        !(# st4, () #) = unIO (writeWord8Buf oraw (ow+2) c3) st3
+                        !(# st5, () #) = unIO (writeWord8Buf oraw (ow+3) c4) st4
+                    loop ir' (ow+4) st5
    in
-   loop ir0 ow0
+   loop ir0 ow0 st
 
 -- -----------------------------------------------------------------------------
 -- UTF-8 primitives, lifted from Data.Text.Fusion.Utf8


=====================================
libraries/base/GHC/IO/Handle/Internals.hs
=====================================
@@ -59,7 +59,7 @@ module GHC.IO.Handle.Internals (
 import GHC.IO
 import GHC.IO.IOMode
 import GHC.IO.Encoding as Encoding
-import GHC.IO.Encoding.Types (CodeBuffer)
+import GHC.IO.Encoding.Types (CodeBuffer, encode, recover)
 import GHC.IO.Handle.Types
 import GHC.IO.Buffer
 import GHC.IO.BufferedIO (BufferedIO)


=====================================
libraries/filepath
=====================================
@@ -1 +1 @@
-Subproject commit 7139cd3383a2aae440a57b5604a8182d9a983715
+Subproject commit fefa911af3a67d913ef2f28cdf79a5c6b6633552


=====================================
libraries/haskeline
=====================================
@@ -1 +1 @@
-Subproject commit ad40faf532ca86ae6d0839a299234db2ce4fc424
+Subproject commit a3876935fb3a5c5969992b5abe58b84d63a68cab


=====================================
rts/RtsSymbols.c
=====================================
@@ -925,6 +925,7 @@ extern char **environ;
       SymI_HasProto(newArena)                                           \
       SymI_HasProto(arenaAlloc)                                         \
       SymI_HasProto(arenaFree)                                          \
+      SymI_HasProto(rts_clearMemory)                                    \
       RTS_USER_SIGNALS_SYMBOLS                                          \
       RTS_INTCHAR_SYMBOLS
 


=====================================
rts/include/RtsAPI.h
=====================================
@@ -599,6 +599,51 @@ extern StgWord base_GHCziTopHandler_runNonIO_closure[];
 
 /* ------------------------------------------------------------------------ */
 
+// This is a public RTS API function that does its best to zero out
+// unused RTS memory. rts_clearMemory() takes the storage manager
+// lock. It's only safe to call rts_clearMemory() when all mutators
+// have stopped and either minor/major garbage collection has just
+// been run.
+//
+// rts_clearMemory() works for all RTS ways on all platforms, though
+// the main intended use case is the pre-initialization of a
+// wasm32-wasi reactor module (#22920). A reactor module is like
+// shared library on other platforms, with foreign exported Haskell
+// functions as entrypoints. At run-time, the user calls hs_init_ghc()
+// to initialize the RTS, after that they can invoke Haskell
+// computation by calling the exported Haskell functions, persisting
+// the memory state across these invocations.
+//
+// Besides hs_init_ghc(), the user may want to invoke some Haskell
+// function to initialize some global state in the user code, this
+// global state is used by subsequent invocations. Now, it's possible
+// to run hs_init_ghc() & custom init logic in Haskell, then snapshot
+// the entire memory into a new wasm module! And the user can call the
+// new wasm module's exports directly, thus eliminating the
+// initialization overhead at run-time entirely.
+//
+// There's one problem though. After the custom init logic runs, the
+// RTS memory contains a lot of garbage data in various places. These
+// garbage data will be snapshotted into the new wasm module, causing
+// a significant size bloat. Therefore, we need an RTS API function
+// that zeros out unused RTS memory.
+//
+// At the end of the day, the custom init function will be a small C
+// function that first calls hs_init_ghc(), then calls a foreign
+// exported Haskell function to initialize whatever global state the
+// other Haskell functions need, followed by a hs_perform_gc() call to
+// do a major GC, and finally an rts_clearMemory() call to zero out
+// the unused RTS memory.
+//
+// Why add rts_clearMemory(), where there's the -DZ RTS flag that
+// zeros freed memory on GC? The -DZ flag actually fills freed memory
+// with a garbage byte like 0xAA, and the flag only works in debug
+// RTS. Why not add a new RTS flag that zeros freed memory on the go?
+// Because it only makes sense to do the zeroing once before
+// snapshotting the memory, but there's no point to pay for the
+// zeroing overhead at the new module's run-time.
+void rts_clearMemory(void);
+
 #if defined(__cplusplus)
 }
 #endif


=====================================
rts/sm/BlockAlloc.c
=====================================
@@ -1395,3 +1395,17 @@ reportUnmarkedBlocks (void)
 }
 
 #endif
+
+void clear_free_list(void) {
+    for (uint32_t node = 0; node < n_numa_nodes; ++node) {
+        for (bdescr *bd = free_mblock_list[node]; bd != NULL; bd = bd->link) {
+            clear_blocks(bd);
+        }
+
+        for (int ln = 0; ln < NUM_FREE_LISTS; ++ln) {
+            for (bdescr *bd = free_list[node][ln]; bd != NULL; bd = bd->link) {
+                clear_blocks(bd);
+            }
+        }
+    }
+}


=====================================
rts/sm/BlockAlloc.h
=====================================
@@ -32,4 +32,6 @@ void reportUnmarkedBlocks (void);
 extern W_ n_alloc_blocks;   // currently allocated blocks
 extern W_ hw_alloc_blocks;  // high-water allocated blocks
 
+RTS_PRIVATE void clear_free_list(void);
+
 #include "EndPrivate.h"


=====================================
rts/sm/NonMoving.h
=====================================
@@ -356,6 +356,10 @@ void print_thread_list(StgTSO* tso);
 
 #endif
 
+RTS_PRIVATE void clear_segment(struct NonmovingSegment*);
+
+RTS_PRIVATE void clear_segment_free_blocks(struct NonmovingSegment*);
+
 #include "EndPrivate.h"
 
 #endif // CMINUSMINUS


=====================================
rts/sm/NonMovingSweep.c
=====================================
@@ -106,14 +106,16 @@ void nonmovingGcCafs()
     debug_caf_list_snapshot = (StgIndStatic*)END_OF_CAF_LIST;
 }
 
-static void
+#endif
+
+void
 clear_segment(struct NonmovingSegment* seg)
 {
     size_t end = ((size_t)seg) + NONMOVING_SEGMENT_SIZE;
     memset(&seg->bitmap, 0, end - (size_t)&seg->bitmap);
 }
 
-static void
+void
 clear_segment_free_blocks(struct NonmovingSegment* seg)
 {
     unsigned int block_size = nonmovingSegmentBlockSize(seg);
@@ -125,8 +127,6 @@ clear_segment_free_blocks(struct NonmovingSegment* seg)
     }
 }
 
-#endif
-
 GNUC_ATTR_HOT void nonmovingSweep(void)
 {
     while (nonmovingHeap.sweep_list) {


=====================================
rts/sm/Storage.c
=====================================
@@ -1924,3 +1924,46 @@ The compacting collector does nothing to improve megablock
 level fragmentation. The role of the compacting GC is to remove object level
 fragmentation and to use less memory when collecting. - see #19248
 */
+
+void rts_clearMemory(void) {
+    ACQUIRE_SM_LOCK;
+
+    clear_free_list();
+
+    for (uint32_t i = 0; i < n_nurseries; ++i) {
+        for (bdescr *bd = nurseries[i].blocks; bd; bd = bd->link) {
+            clear_blocks(bd);
+        }
+    }
+
+    for (unsigned int i = 0; i < getNumCapabilities(); ++i) {
+        for (bdescr *bd = getCapability(i)->pinned_object_empty; bd; bd = bd->link) {
+            clear_blocks(bd);
+        }
+
+        for (bdescr *bd = gc_threads[i]->free_blocks; bd; bd = bd->link) {
+            clear_blocks(bd);
+        }
+    }
+
+    if (RtsFlags.GcFlags.useNonmoving)
+    {
+        for (struct NonmovingSegment *seg = nonmovingHeap.free; seg; seg = seg->link) {
+            clear_segment(seg);
+        }
+
+        for (int i = 0; i < NONMOVING_ALLOCA_CNT; ++i) {
+            struct NonmovingAllocator *alloc = nonmovingHeap.allocators[i];
+
+            for (struct NonmovingSegment *seg = alloc->active; seg; seg = seg->link) {
+                clear_segment_free_blocks(seg);
+            }
+
+            for (unsigned int j = 0; j < getNumCapabilities(); ++j) {
+                clear_segment_free_blocks(alloc->current[j]);
+            }
+        }
+    }
+
+    RELEASE_SM_LOCK;
+}


=====================================
rts/sm/Storage.h
=====================================
@@ -206,4 +206,8 @@ extern StgIndStatic * dyn_caf_list;
 extern StgIndStatic * debug_caf_list;
 extern StgIndStatic * revertible_caf_list;
 
+STATIC_INLINE void clear_blocks(bdescr *bd) {
+   memset(bd->start, 0, BLOCK_SIZE * bd->blocks);
+}
+
 #include "EndPrivate.h"


=====================================
testsuite/tests/ffi/should_run/ffi023_c.c
=====================================
@@ -5,5 +5,6 @@
 HsInt out (HsInt x)
 {
     performMajorGC();
+    rts_clearMemory();
     return incall(x);
 }



View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/1523be1cc721f1ee435f0640cf910c64e91a63f7...edd103973ad9fac0b4e3d3a36e9829384f060a14

-- 
View it on GitLab: https://gitlab.haskell.org/ghc/ghc/-/compare/1523be1cc721f1ee435f0640cf910c64e91a63f7...edd103973ad9fac0b4e3d3a36e9829384f060a14
You're receiving this email because of your account on gitlab.haskell.org.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/ghc-commits/attachments/20230215/6528e274/attachment-0001.html>