[GHC] #15646: ghci takes super long time to find the type of large fractional number
GHC
ghc-devs at haskell.org
Sun Sep 23 09:44:38 UTC 2018
#15646: ghci takes super long time to find the type of large fractional number
-------------------------------------+-------------------------------------
Reporter: Johannkokos | Owner:
| JulianLeviston
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: GHCi | Version: 8.4.3
Resolution: | Keywords: newcomer
Operating System: Unknown/Multiple | Architecture:
Type of failure: Compile-time | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):
So, to summarize the problem here:
- Semantics of a floating point literal (which is in our case in `XeY`
form) is defined as `fromRational (n % d)`.
- So we need to find and `n` and `d` such that `fromRational (n % d)`
gives us our float.
- As per Haskell 2010 chapter 6.4.1 (note that section 2.5 only gives the
syntax, not the type) `fromRational` has type `Fractional a => Rational ->
a` so `fromRational (n % d)` has type `Fractional a => a`
- So `XeY` form always has type `Fractional a => a` !
- Currently we tokenize an `XeY` we call `readRational__` on the string
`"XeY"`, which returns this: `(n%1)*(10^^(k-d))`. `n`, `k` and `d` are
parsed from the input so this is very cheap (in our example `n` is `X`,
`k` is `Y`, `d` is `0`).
- However, evaluating `(10^^_) :: Rational` part of the expression
takes long time and uses a lot of memory. Indeed, as comment:10 says, if
you have 1000000000 as the exponent then representation of this number
takes GBs of memory so this is expected.
- comment:12 asks why rendering `Infinity` takes this long: that's because
we generate the `Infinity` value during desugaring (after parsing), and
for this we have to first generate the `Rational` value for the literal
first (which is what is taking all the time and using all the memory). You
can observe this if you run ghci with `-fdump-parsed -fdump-ds` and
evaluate `1e100000 :: Float`.
Perhaps it makes sense to distinguish these two problems:
- Type checking is too slow
- Generating `Infinity :: Float` is too slow
(a) in comment:10 fixes (1) but not (2).
(c) in comment:10 fixes (1), and it may also fix (2) if we could somehow
use this new constructor fields to return `Infinity` without calculating
`10^^e :: Integer` first (as far as I understand the new consturctor will
look like: `ExpRational n e -- stands for (n%1)*(10^^e)`).
However, (c) means a change in a public type, and we'd need to evaluate
performance changes etc. caused by turning a product type to a sum type
(more optimizations apply to product types than sum types! at least until
we improve demand analysis for sum types and use unboxed sums in WW).
Also, I'm not sure if adding one more constructor to a library type just
for this is a good idea. We'd need to think about how/whether to expose
this to users (perhaps we do want to expose it in `Data.Ratio` somehow).
This means long discussions and bikeshedding etc.
What to do? I think delaying evaluation `readRational` until it's needed
by making some code lazier would work. Looking at the lexer, we generate a
`ITrational` for this literal like this:
{{{
tok_float str = ITrational $! readFractionalLit str
readFractionalLit :: String -> FractionalLit
readFractionalLit str = ((FL $! (SourceText str)) $! is_neg) $!
readRational str
where is_neg = case str of ('-':_) -> True
}}}
Notice how we're very deliberately strict here. My guess is that simply
removing strictness around `readRational` (e.g. the `$!` in
`readFractionalLit`) may fix this. If it doesn't then some other code down
the line needs to be made lazier too. However I'm not sure if this causes
other problems (the fact that the thunk for `readRational str` will keep
the `str` alive is not a problem because the `str` only holds the string
for the literal). We'll have to evaluate this change in strictness
somehow.
If we follow this route then we'd also need to add a test to make sure
typing this expression remains fast (strictness properties are hard to
maintain ...).
Here's another idea that is like (c) but does not need changes in a public
type: duplicate the `Ratio` type so that only the version used by GHC has
the special constructor. The code gen and `Data.Ratio` will still use the
current type. Now we can use the special constructor in lexing to avoid
evaluating a huge `Integer`, and only when desugaring we do some kind of
evaluation (this is where we can perhaps be smart and generate `Infinity`
efficiently, or in the worst case we end up with the current situation,
but with fast type checking).
None of these solutions strike me as particularly satisfying though...
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/15646#comment:13>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
More information about the ghc-tickets
mailing list