From anthony.d.clayden at gmail.com  Mon Jun 14 10:47:39 2021
From: anthony.d.clayden at gmail.com (Anthony Clayden)
Date: Mon, 14 Jun 2021 22:47:39 +1200
Subject: smelly code in input.c

(I can't say for sure I have a bug, because I don't grok the code well
enough to figure out how to make something go wrong, but ...)

In /src/input.c line 1714
https://github.com/FranklinChen/hugs98-plus-Sep2006/blob/master/src/input.c#L1714
,

>    if (c0=='.' && isIn(c0,(SMALL|LARGE|SYMBOL))) {

It looks wrong to be testing `c0` twice, that test will always come out
False. (Or if '.' counts as a SYMBOL, then always True.) I guess the second
test should be lookahead `isIn(c1, ...)`. That follows the code pattern
nearby above line 1688, and especially 1698.

I think it'll mean the compiler won't handle multi-qualified names like
`Mod1.Sub2.Subsub3.Foo`. Whereas `Prelude.True` (just a single qualifier)
is ok.

Can anyone confirm my suspicion and/or suggest a definitive test?

(Reason for asking: I'm trying to persuade Hugs to differentiate
tight-binding dot as an operator vs space-surrounded dot as function
composition. In particular so I can write `record.label` as field access.
I'd also like to write `record.#label` as a TRex field access.)

AntC

From iavor.diatchki at gmail.com  Mon Jun 14 15:13:24 2021
From: iavor.diatchki at gmail.com (Iavor Diatchki)
Date: Mon, 14 Jun 2021 08:13:24 -0700
Subject: smelly code in input.c

Hi,

I don't know that code either, but looking at the comments and the
surrounding code, my guess is that the 2nd `c0` should be `c1`,
and it is checking for something like `.` followed by either a lower case
or upper case or symbol operator.

-Iavor

On Mon, Jun 14, 2021 at 3:48 AM Anthony Clayden wrote:

> (I can't say for sure I have a bug, because I don't grok the code well
> enough to figure out how to make something go wrong, but ...)
>
> In /src/input.c line 1714
> https://github.com/FranklinChen/hugs98-plus-Sep2006/blob/master/src/input.c#L1714
> ,
>
> >    if (c0=='.' && isIn(c0,(SMALL|LARGE|SYMBOL))) {
>
> It looks wrong to be testing `c0` twice, that test will always come out
> False. (Or if '.' counts as a SYMBOL, then always True.) I guess the second
> test should be lookahead `isIn(c1, ...)`. That follows the code pattern
> nearby above line 1688, and especially 1698.
>
> I think it'll mean the compiler won't handle multi-qualified names like
> `Mod1.Sub2.Subsub3.Foo`. Whereas `Prelude.True` (just a single qualifier)
> is ok.
>
> Can anyone confirm my suspicion and/or suggest a definitive test?
>
> (Reason for asking: I'm trying to persuade Hugs to differentiate
> tight-binding dot as an operator vs space-surrounded dot as function
> composition. In particular so I can write `record.label` as field access.
> I'd also like to write `record.#label` as a TRex field access.)
>
> AntC

From anthony.d.clayden at gmail.com  Tue Jun 15 07:33:20 2021
From: anthony.d.clayden at gmail.com (Anthony Clayden)
Date: Tue, 15 Jun 2021 19:33:20 +1200
Subject: smelly code in input.c

Thanks Iavor, nothing ventured nothing gained.

I went ahead and made that change.
I can't find any difference in behaviour for well-formed code.
I can't find any difference in whether code is accepted/rejected.
But there's a difference in error messages for ill-formed code,
specifically ill-formed qualified vars/constrs:

* standard-issue Hugs reports
      Undefined qualified variable "Mod1.Sub2.Subsub3. "
  -- that is, shows trailing spaces and newline after the trailing dot.
* modified Hugs reports
      Syntax error in expression (unexpected `;', possibly due to bad layout)
  -- or unexpected `}'
  -- these are pseudo- semicolon/closing brace, not actually in the file
  -- that is, it's 'munched' the trailing spaces and newline, then bumped
     into start of next statement
  -- so layout-control inserts the pseudo-'s

Then perhaps that smelly code is deliberate, to give a more helpful
rejection message? hmm hmm

I still can't see why/how the parsing works for well-formed qualified
vars/constrs. But best to let sleeping dogs lie.

AntC

On Tue, 15 Jun 2021 at 03:13, Iavor Diatchki wrote:

> I don't know that code either, but looking at the comments and the
> surrounding code, my guess is that the 2nd `c0` should be `c1`,
> and it is checking for something like `.` followed by either a lower case
> or upper case or symbol operator.
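
For readers following the thread, the two variants of the line 1714 condition can be compared in isolation. The sketch below is not the Hugs source: `classOf`, the flag values, and the set of operator characters are stand-ins invented for illustration, and treating '.' as a SYMBOL is an assumption. Under that assumption the condition as written holds for any character after the dot, while the lookahead variant rejects a dot followed by whitespace, which would be consistent with the change in error messages reported above.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical character-class flags; the real Hugs values may differ. */
enum { SMALL = 1, LARGE = 2, SYMBOL = 4 };

/* Stand-in classifier, NOT the Hugs character table.
 * Assumption: '.' is classified as a SYMBOL character. */
static int classOf(int c) {
    if (islower(c)) return SMALL;
    if (isupper(c)) return LARGE;
    if (c != '\0' && strchr(".!#$%&*+/<=>?@\\^|-~:", c)) return SYMBOL;
    return 0;
}

/* Stand-in for isIn: does c belong to any of the given classes? */
static bool isIn(int c, int classes) {
    return (classOf(c) & classes) != 0;
}

/* The condition as written at line 1714: both tests inspect c0,
 * so the lookahead character c1 has no influence on the result. */
static bool dotTestAsWritten(int c0, int c1) {
    (void)c1; /* deliberately unused */
    return c0 == '.' && isIn(c0, SMALL | LARGE | SYMBOL);
}

/* The suggested variant: the second test inspects the lookahead c1. */
static bool dotTestWithLookahead(int c0, int c1) {
    return c0 == '.' && isIn(c1, SMALL | LARGE | SYMBOL);
}
```

With these stand-ins, `dotTestAsWritten('.', ' ')` is true but `dotTestWithLookahead('.', ' ')` is false, and both agree on `('.', 'F')`. If '.' were not a SYMBOL, the first function would instead be false for every input, the other case raised in the original message.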