[GHC] #12665: Make Read instances faster, and make them fail fast
GHC
ghc-devs at haskell.org
Fri Oct 7 23:48:37 UTC 2016
#12665: Make Read instances faster, and make them fail fast
-------------------------------------+-------------------------------------
Reporter: dfeuer | Owner: dfeuer
Type: feature request | Status: new
Priority: high | Milestone: 8.2.1
Component: Core Libraries | Version: 8.0.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Runtime | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by dfeuer):
Replying to [comment:13 int-e]:
> One can also store the bases in the stack itself, see `valInteger''` in
the (updated) http://int-e.eu/~bf3/haskell/FromDigits.hs file. This causes
some extra allocations, so the code becomes slower, but almost
imperceptibly so.
I wrote up something essentially equivalent to `Numeric.readDec` based on
your latest code and the shape of the lexer:
{{{#!hs
decNat :: ReadP Natural
decNat = valNat 10 19 $ \c ->
let diff = charDiff c '0'
in if diff < 10
then Just diff
else Nothing
valNat :: Word -> Word -> (Char -> Maybe Word) -> ReadP Natural
valNat base chunkSize terp = do
s <- ReadP.look
valNat' base chunkSize terp s
{-# INLINE valNat #-}
data Stack = SNil | SSkip Natural !Stack | SCons Natural !Natural !Stack
{-# INLINE valNat' #-}
valNat' :: Word -> Word -> (Char -> Maybe Word) -> [Char] -> ReadP Natural
valNat' base chunkSize terp cs0@(c0 : _)
| Just _ <- terp c0 = goChunks SNil 0 0 cs0
where
b1 :: Natural
b1 = fromIntegral base * fromIntegral (base^(chunkSize-1))
goChunks :: Stack -> Word -> Word -> [Char] -> ReadP Natural
goChunks !s !l !chunk !ds
| l == chunkSize = goChunks (step s (fromIntegral chunk)) 0 0 ds
goChunks !s !l !chunk (c:cs)
| Just d <- terp c
= ReadP.get *> goChunks s (l+1) (chunk*base+d) cs
goChunks !s !l !chunk _ =
pure $ fromStack (fromIntegral chunk) (fromIntegral (base^l)) s
step :: Stack -> Natural -> Stack
step SNil d = SCons b1 d SNil
step (SSkip b s) d = SCons b d s
step (SCons b d' s) d = SSkip b (step' b s (d + d'*b))
step' :: Natural -> Stack -> Natural -> Stack
step' b SNil d = SCons (b*b) d SNil
step' _ (SSkip b s) d = SCons b d s
step' _ (SCons b d' s) d = SSkip b (step' b s (d + d' * b))
fromStack :: Natural -> Natural -> Stack -> Natural
fromStack d' _ SNil = d'
fromStack d' b' (SSkip b s) = fromStack d' b' s
fromStack d' b' (SCons b d s) = fromStack (d*b'+d') (b*b') s
valNat' _ _ _ _ = ReadP.pfail
-- We could write a parser more precisely imitating GHC's current
-- `Read` instance by using `valNat` and then performing a look-ahead
-- to check for illegal termination sequences involving `.` or `e`,
-- but I'm really not convinced that we should.
}}}
This seems to be very fast indeed when used for base 10 with a chunk size
of 19 (on a 64-bit system). Oddly, `Numeric.readDec` is much, much slower
than `read`, which looks likely to be a `RULES` issue. That makes it a bit
hard to compare the algorithms fairly. I haven't yet wired your simpler
version up in quite this fashion, and I'm not actually sure if it can be
wired up so, but it would be worth checking.
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/12665#comment:14>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list