[Haskell-cafe] Grapheme length?

Viktor Dukhovni ietf-dane at dukhovni.org
Sat Feb 20 08:57:05 UTC 2021


On Sat, Feb 20, 2021 at 06:12:58AM -0200, Viktor Dukhovni wrote:

> Fortunately, these are also supported:
> 
>   https://hackage.haskell.org/package/text-icu-0.7.0.1/docs/Data-Text-ICU-Break.html#g:2

A complete example (the NFC normalisation may be overkill):

    {-# LANGUAGE BangPatterns #-}
    module Main (main) where

    import qualified Data.Text as T
    import qualified Data.Text.Lazy as LT
    import qualified Data.Text.Lazy.Builder as LT
    import qualified Data.Text.Lazy.IO as LT
    import qualified Data.Text.Lazy.Builder.Int as LT
    import Data.Text.ICU.Break
    import Data.Text.ICU.Normalize
    import Data.Text.ICU.Types (LocaleName(..))
    import System.Environment

    main :: IO ()
    main = do
        brkIter <- breakCharacter Current ""
        getArgs >>= mapM_ (go brkIter . normalize NFC . T.pack)
      where
        go :: BreakIterator () -> T.Text -> IO ()
        go b t = do
            setText b t
            len <- count b 0
            LT.putStrLn $ LT.toLazyText $
                LT.fromText t <>
                LT.fromString " has grapheme length: " <>
                LT.decimal len
          where
            count :: Int -> IO Int
            count !acc = next b >>= maybe (pure acc) (const $ count $ acc + 1)

-- 
	Viktor.


More information about the Haskell-Cafe mailing list