<div dir="auto">Hello, <div dir="auto"><br><div dir="auto">I also think that the instances for the bounded types are pretty unfortunate, but the change might have unintended consequences. I am not particularly opposed to it, though. </div><div dir="auto"><br></div><div dir="auto">One thing to consider, though, is that it might be more productive to change the other parsing libraries (parsec, etc.). For example, I almost never use ReadP for actual parsing that requires validation: it is slow and has no error reporting.  I've only really used it for quick and dirty serialization in combination with Show, and there the problem is less likely to happen.</div><div dir="auto"><br></div><div dir="auto">Cheers, </div><div dir="auto">Iavor</div></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Jul 21, 2025, 11:54 AM Andreas Klebinger via ghc-devs &lt;<a href="mailto:ghc-devs@haskell.org">ghc-devs@haskell.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">For base, introducing a new function `readBoundedNum :: (Bounded a, Num a) => String -> a`<br>

or similar seems very reasonable to me.<br>

Changing "read" to throw an exception or similar after decades less so.<br>
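
To make the idea concrete, here is a minimal sketch of an overflow-checked reader. The name `readBoundedMaybe`, the `Integral` constraint and the `Maybe` result are assumptions for illustration only; this is neither the signature suggested above nor an existing base API. It parses via Integer with `readMaybe` and rejects anything outside [minBound, maxBound]:<br>
<pre>
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Word (Word8)
import Text.Read (readMaybe)

-- Hypothetical helper (not part of base): read the text as an unbounded
-- Integer first, then accept it only if it fits the target type.
readBoundedMaybe :: forall a. (Bounded a, Integral a) => String -> Maybe a
readBoundedMaybe s =
  case readMaybe s :: Maybe Integer of
    Just n
      | n >= toInteger (minBound :: a)
      , toInteger (maxBound :: a) >= n -> Just (fromInteger n)
    _ -> Nothing

-- 298 does not fit in a Word8, so this is Nothing rather than a wrapped value.
example :: Maybe Word8
example = readBoundedMaybe "298"
</pre>
Going through Integer keeps the sketch short, but it is not meant to be fast; a real implementation would probably check the bound digit by digit.<br>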

<br>

<br>

On 20/07/2025 22:08, Viktor Dukhovni wrote:<br>

> On Sun, Jul 20, 2025 at 09:12:20PM +0200, Stefan Klinger wrote:<br>

><br>

>> I'd like to bring to your attention a discussion that I have started<br>

>> over at Haskell-cafe [1].  I was complaining about the silent overflow<br>

>> of parsers for bounded integers:<br>

>><br>

>>      > read "298" :: Word8<br>

>>      42<br>

> FWIW, there haven't AFAIK been any complaints about ByteString's readInt,<br>

> readWord, readInteger, readNatural and various sized variants having<br>

> overflow checks.  But these have always been more like `reads` than<br>

> `read`, returning `Maybe (a, ByteString)`, so perhaps somewhat more<br>

> oriented towards detecting unexpected excess input, as well as for<br>

> some time now range overflow.  So there's some precedent for overflow<br>

> checking, but...<br>

><br>

> It is also fair to point out that once an Int or other bounded integral<br>

> type is read, arithmetic with that type (addition, subtraction and<br>

> multiplication) silently overflows.  And so silent overflow in `read`<br>

> is not inconsistent with the type's semantics.<br>

><br>

> If converting strings to numbers is in support of string-oriented<br>

> network protocols (e.g. the SIZE ESMTP extension), then one really<br>

> should make an effort to avoid silent overflow, but in that context the<br>

> various ByteString read methods are already available.<br>

><br>

> That said, if various middleware libraries hide overflows, because under<br>

> the covers they're using `read`, that could be a problem, so we do want<br>

> the ecosystem at large to make sensible choices about when silent<br>

> overflow may or may not be appropriate.  Perhaps that means having<br>

> both wrapping and overflow-checked implementations available, and<br>

> clear docs with each about its behaviour and the corresponding<br>

> alternative.<br>

><br>

>> I find this unsatisfying, and I have demonstrated a solution [2] that<br>

>> seems correct and performant.<br>

> A few quick observations about [2]:<br>

><br>

>      - It disallows explicit leading "+" (just like "read", but perhaps<br>

>        that should be tolerated).<br>

><br>

>      - It disallows multiple leading zeros, perhaps these should be<br>

>        tolerated.<br>

><br>

>      - It disallows "-0", perhaps these should be tolerated, as well<br>

>        as "-0000", "-000001", ...  (With lazy ByteStrings, which might<br>

>        never terminate, there is a generous, but sensible limit on<br>

>        the number of leading zeros allowed).<br>

><br>

>      - One way to avoid difficulties with handling negative minBound is<br>

>        to parse signed values via the corresponding unsigned type, which<br>

>        can accommodate `-minBound` as a positive value, and then negate<br>

>        the final result.  This makes it possible to share the low-level<br>

>        digit-by-digit code between the positive and negative cases.<br>

><br>
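
The last point can be sketched roughly as follows, fixed to Int8 and Word8 to keep it short. The names `magnitude` and `readInt8` are made up for this illustration and are not taken from [2] or from bytestring. The digits are accumulated in the unsigned type against a limit of 127 or 128 depending on the sign, and the sign is applied only at the end:<br>
<pre>
import Control.Monad (foldM)
import Data.Char (digitToInt, isDigit)
import Data.Int (Int8)
import Data.Word (Word8)

-- Accumulate decimal digits in a Word8, failing as soon as the value
-- would exceed `limit`. Shared by the positive and the negative case.
magnitude :: Word8 -> String -> Maybe Word8
magnitude limit ds
  | null ds || not (all isDigit ds) = Nothing
  | otherwise = foldM step 0 ds
  where
    step acc c
      -- Equivalent to acc * 10 + d > limit, written so it cannot wrap.
      | acc > (limit - d) `div` 10 = Nothing
      | otherwise = Just (acc * 10 + d)
      where d = fromIntegral (digitToInt c)

-- Hypothetical overflow-checked reader for Int8.
readInt8 :: String -> Maybe Int8
readInt8 ('-' : ds) = fmap negateMag (magnitude 128 ds)
  where
    -- 128 converts to -128 in Int8, and negating -128 is still -128,
    -- so minBound comes out right without a special case.
    negateMag m = negate (fromIntegral m)
readInt8 ds = fmap fromIntegral (magnitude 127 ds)
</pre>
As written, this accepts "-0" and any number of leading zeros, and rejects a leading "+"; those choices are easy to change and are not the point of the sketch.<br>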

> If parsing of Integer and Natural is also in scope, I would expect that<br>

> it avoids doing multi-precision arithmetic for each digit, parsing<br>

> groups of digits into ~Word-sized blocks, and merging the blocks<br>

> hierarchically with only a logarithmic number of MP multiplies.<br>

><br>
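
A rough sketch of that block-wise scheme, assuming a 64-bit Int and a non-negative decimal string. The names `Block`, `chunks`, `mergePass` and `parseDecimal` are invented for this example. Each 18-digit chunk is parsed with machine arithmetic only, and adjacent blocks are then merged pairwise, so the number of merge passes is logarithmic in the number of blocks:<br>
<pre>
import Data.Char (digitToInt, isDigit)
import Data.List (foldl')

-- A block is (value, number of decimal digits it covers).
type Block = (Integer, Int)

-- Assumption: Int is 64 bits. 18 decimal digits always fit,
-- since 10^18 is below 2^63.
blockDigits :: Int
blockDigits = 18

-- Parse one short chunk without any multi-precision arithmetic.
chunkValue :: String -> Int
chunkValue = foldl' (\acc c -> acc * 10 + digitToInt c) 0

-- Cut the digit string into blocks of at most blockDigits digits.
chunks :: String -> [Block]
chunks [] = []
chunks ds = (toInteger (chunkValue c), length c) : chunks rest
  where (c, rest) = splitAt blockDigits ds

-- Merge adjacent blocks pairwise; each pass halves the number of blocks.
mergePass :: [Block] -> [Block]
mergePass ((h, wh) : (l, wl) : rest) =
  (h * 10 ^ wl + l, wh + wl) : mergePass rest
mergePass bs = bs

-- Hypothetical entry point: digits only, no sign handling.
parseDecimal :: String -> Maybe Integer
parseDecimal ds
  | null ds || not (all isDigit ds) = Nothing
  | otherwise = Just (go (chunks ds))
  where
    go [] = 0
    go [(v, _)] = v
    go bs = go (mergePass bs)
</pre>
This recomputes the power of ten for every merge; a tuned version would presumably compute it once per pass and reuse it.<br>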


</blockquote></div>