From gitlab at gitlab.haskell.org Thu Apr 4 22:38:54 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Thu, 04 Apr 2019 18:38:54 -0400 Subject: [Git][ghc/ghc][wip/D5082] 125 commits: Update Trac ticket URLs to point to GitLab Message-ID: <5ca6877e4c4fd_62b3dd442401616768@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 610ec224 by Ryan Scott at 2019-03-15T14:17:54Z Update Trac ticket URLs to point to GitLab This moves all URL references to Trac tickets to their corresponding GitLab counterparts. - - - - - 97032ed9 by Simon Peyton Jones at 2019-03-15T14:24:01Z Report better suggestion for GADT data constructor This addresses issue #16427. An easy fix. - - - - - 83e09d3c by Peter Trommler at 2019-03-15T14:30:08Z PPC NCG: Use liveness information in CmmCall We make liveness information for global registers available on `JMP` and `BCTR`, which were the last instructions missing. With complete liveness information we do not need to reserve global registers in `freeReg` anymore. Moreover we assign R9 and R10 to callee saves registers. Cleanup by removing `Reg_Su`, which was unused, from `freeReg` and removing unused register definitions. The calculation of the number of floating point registers is too conservative. Just follow X86 and specify the constants directly. Overall on PowerPC this results in 0.3 % smaller code size in nofib while runtime is slightly better in some tests. - - - - - 57201beb by Simon Peyton Jones at 2019-03-15T14:36:14Z Add flavours link - - - - - 4927117c by Simon Peyton Jones at 2019-03-16T12:08:25Z Improve error recovery in the typechecker Issue #16418 showed that we were carrying on too eagerly after a bogus type signature was identified (a bad telescope in fact), leading to a subsequent crash. This led me in to a maze of twisty little passages in the typechecker's error recovery, and I ended up doing some refactoring in TcRnMonad. Some specfifics * TcRnMonad.try_m is now called attemptM. * I switched the order of the result pair in tryTc, to make it consistent with other similar functions. * The actual exception used in the Tc monad is irrelevant so, to avoid polluting type signatures, I made tcTryM, a simple wrapper around tryM, and used it. The more important changes are in * TcSimplify.captureTopConstraints, where we should have been calling simplifyTop rather than reportUnsolved, so that levity defaulting takes place properly. * TcUnify.emitResidualTvConstraint, where we need to set the correct status for a new implication constraint. (Previously we ended up with an Insoluble constraint wrapped in an Unsolved implication, which meant that insolubleWC gave the wrong answer. - - - - - 600a1ac3 by Simon Peyton Jones at 2019-03-16T12:08:25Z Add location to the extra-constraints wildcard The extra-constraints wildcard had lost its location (issue #16431). Happily this is easy to fix. Lots of error improvements. - - - - - 1c1b63d6 by Ben Gamari at 2019-03-16T23:13:36Z compiler: Disable atomic renaming on Windows As discussed in #16450, this feature regresses CI on Windows, causing non-deterministic failures due to missing files. - - - - - 6764da43 by Ben Gamari at 2019-03-16T23:16:56Z gitlab-ci: Explicitly set bindist tarball name - - - - - ad79ccd9 by Ben Gamari at 2019-03-16T23:17:46Z gitlab-ci: Generate documentation tarball - - - - - 3f2291e4 by Ben Gamari at 2019-03-16T23:17:46Z gitlab-ci: Generate source tarballs - - - - - cb61371e by Ben Gamari at 2019-03-17T09:05:10Z ghc-heap: Introduce closureSize This function allows the user to compute the (non-transitive) size of a heap object in words. The "closure" in the name is admittedly confusing but we are stuck with this nomenclature at this point. - - - - - c01d5af3 by Michael Sloan at 2019-03-18T02:23:19Z Extract out use of UnboxedTuples from GHCi.Leak See #13101 + #15454 for motivation. This change reduces the number of modules that need to be compiled to object code when loading GHC into GHCi. - - - - - 6113d0d4 by Radosław Rowicki at 2019-03-18T02:29:25Z Update bug tracker link to point to gitlab instead of deprecated trac - - - - - b8326897 by Ben Gamari at 2019-03-18T03:16:12Z gitlab-ci: Always build fedora27 This ends up being much easier to use than Debian 9 under NixOS. - - - - - acf2129d by Ben Gamari at 2019-03-18T03:17:36Z gitlab-ci: Implement head.hackage jobs - - - - - 71648c35 by Ben Gamari at 2019-03-20T03:04:18Z gitlab-ci: Implement support for i386/Windows bindists - - - - - d94ca74f by Tamar Christina at 2019-03-20T03:10:23Z err: clean up error handler - - - - - 398f2cbc by Ben Gamari at 2019-03-20T03:16:32Z Bump Cabal submodule to 3.0 Metric Increase: haddock.Cabal - - - - - 89a201e8 by Takenobu Tani at 2019-03-20T03:22:36Z users-guide: Update Wiki URLs to point to GitLab The user's guide uses the `ghc-wiki` macro, and substitution rules are complicated. So I manually edited `.rst` files without sed. I changed `Commentary/Latedmd` only to a different page. It is more appropriate as an example. [ci skip] - - - - - 98ff1a56 by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Replace nOfThem by replicate - - - - - 6a47414f by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Fix typos - - - - - 1e26e60d by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Simplify monadic code - - - - - c045bd7c by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Remove deprecated reinitializeGlobals - - - - - 6d19ad72 by Ben Gamari at 2019-03-20T03:34:49Z gitlab-ci: Bump docker images To install lndir and un-break the source distribution job. - - - - - c7a84a60 by Matthew Pickering at 2019-03-20T03:34:50Z Update .gitlab-ci.yml - - - - - db136237 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T16219 and cabal09 as broken on Windows See #16386. - - - - - 7cd8e330 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Fix expected output on Windows for various ghci tests Broke as -Wimplicit-kind-vars no longer exists. Specifically ghci024, ghci057 and T9293. - - - - - 23b639fd by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T5836 as broken on Windows See #16387. - - - - - a1bda08d by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T15904 as broken on Windows It seems to look for some sort of manifest file. See #16388. - - - - - b7f5d552 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T16190 as broken on Windows There seems to be some filepath funniness due to TH embedding going on here. See #16389. - - - - - a0c31f78 by Ben Gamari at 2019-03-20T22:41:32Z testsuite/plugins: Add multi_cpu_race modifier on Windows A few tests previously failed with various failure modes. For instance, `plugin-recomp-change` fails with: ``` Wrong exit code for plugin-recomp-change()(expected 0 , actual 2 ) Stderr ( plugin-recomp-change ): Simple Plugin Passes Queried Got options: Simple Plugin Pass Run C://GitLabRunner//builds//8fc0e283//0//ghc//ghc//inplace//mingw//bin/ld.exe: cannot find -lHSplugin-recompilation-0.1-CPeObcGoBuvHdwBnpK9jQq collect2.exe: error: ld returned 1 exit status `gcc.exe' failed in phase `Linker'. (Exit code: 1) make[2]: *** [Makefile:112: plugin-recomp-change] Error 1 *** unexpected failure for plugin-recomp-change(normal) ``` It's unclear whether the ghc-pkg concurrency issue mentioned in all.T is the culprit but the set of tests that fail overlaps strongly with the set of tests that lack the `multi_cpu_race` modifier. Let's see if adding it fixes them. - - - - - 88a6e9a4 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T10672 as broken This test, which is only run on Windows, seems to be reliably timing out. See #16390. - - - - - f4d3aaaf by Ben Gamari at 2019-03-20T22:41:32Z testsuite/plugins: Increase compile timeout on Windows I think the linker is routinely eating through the timeout, leading to many spurious failures. - - - - - ae382245 by Ben Gamari at 2019-03-20T22:41:32Z rts/RtsSymbols: Drop __mingw_vsnwprintf As described in #16387, this is already defined by mingw and consequently defining it in the RTS as well leads to multiple definition errors from the RTS linker at runtime. - - - - - f79f93e4 by Ben Gamari at 2019-03-20T22:41:32Z Don't mark cabal09 as broken It doesn't fail reliably. - - - - - d98cb763 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Don't mark T5836 as broken I believe removing __mingw_vsnwprintf from RtsSymbols fixed #16387. - - - - - 8c1a2743 by Ben Gamari at 2019-03-20T22:41:32Z Try again - - - - - 3394a7cd by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Display observed exit code on failure due to bad exit code - - - - - 36818759 by Artem Pyanykh at 2019-03-20T23:52:39Z Adjust section placement and relocation logic for Mach-O 1. Place each section on a separate page to ensure required alignment (wastes lots ot space, needs to be improved). 2. Unwire relocation logic from macho sections (the most fiddly part is adjusting internal relocations). Other todos: 0. Add a test for section alignment. 1. Investigate 32bit relocations! 2. Fix memory leak in ZEROPAGE section allocation. 3. Fix creating redundant jump islands for GOT. 4. Investigate more compact section placement. - - - - - 78c61acf by Artem Pyanykh at 2019-03-20T23:52:39Z Use segments for section layout - - - - - 7bbfb789 by Artem Pyanykh at 2019-03-20T23:52:39Z Address some todos and fixmes - - - - - 3cdcc0b5 by Artem Pyanykh at 2019-03-20T23:52:39Z Add a linker test re: section alignment - - - - - cb745c84 by Artem Pyanykh at 2019-03-20T23:52:39Z Add missing levels to SegmentProt enum - - - - - d950f11e by Artem Pyanykh at 2019-03-20T23:52:39Z Directly test section alignment, fix internal reloc probing length - - - - - 3fb10fcf by Artem Pyanykh at 2019-03-20T23:52:39Z Gracefully handle error condition in Mach-O relocateSection - - - - - dc713c71 by Ben Gamari at 2019-03-20T23:58:49Z ci: Move validate-x86_64-linux-deb9 to full-build stage The `build` stage is meant to be a minimal smoke test to weed out broken commits. The `validate-x86_64-linux-deb9` build will generally catch a subset of issues caught by `validate-x86_64-linux-deb9-debug` so only the latter should be in `build`. - - - - - 505c5ab2 by Ben Gamari at 2019-03-20T23:58:49Z ci: Add some descriptions of the stages - - - - - 646e3dc2 by Sebastian Graf at 2019-03-21T00:04:49Z Add a bench flavour to Hadrian - - - - - 8d18a873 by Ryan Scott at 2019-03-21T00:10:57Z Reject nested predicates in impredicativity checking When GHC attempts to unify a metavariable with a type containing foralls, it will be rejected as an occurrence of impredicativity. GHC was /not/ extending the same treatment to predicate types, such as in the following (erroneous) example from #11514: ```haskell foo :: forall a. (Show a => a -> a) -> () foo = undefined ``` This will attempt to instantiate `undefined` at `(Show a => a -> a) -> ()`, which is impredicative. This patch catches impredicativity arising from predicates in this fashion. Since GHC is pickier about impredicative instantiations, some test cases needed to be updated to be updated so as not to fall afoul of the new validity check. (There were a surprising number of impredicative uses of `undefined`!) Moreover, the `T14828` test case now has slightly less informative types shown with `:print`. This is due to a a much deeper issue with the GHCi debugger (see #14828). Fixes #11514. - - - - - 7b213b8d by Ömer Sinan Ağacan at 2019-03-21T00:17:05Z Print test suite results ("unexpected failures" etc.) in sorted order Fixes #16425 - - - - - f199a843 by Simon Jakobi at 2019-03-21T00:23:15Z Check.hs: Fix a few typos - - - - - 07d44ed1 by Ben Gamari at 2019-03-21T00:29:20Z base: Depend upon shlwapi on Windows As noted in #16466, `System.Environment.getExecutablePath` depends upon `PathFileExistsW` which is defined by `shlwapi`. Fixes #16466. - - - - - 1382d09e by Ryan Scott at 2019-03-21T00:35:28Z Remove unused XArrApp and XArrForm extension points !301 removed the `HsArrApp` and `HsArrForm` constructors, which renders the corresponding extension points `XArrApp` and `XArrForm` useless. This patch finally rips them out. - - - - - 3423664b by Peter Trommler at 2019-03-21T00:41:35Z Fix specification of load_load_barrier [skip-ci] - - - - - 84c77a67 by Alexandre Esteves at 2019-03-21T21:43:03Z Fix typo [skip ci] - - - - - 7092b2de by Matthew Pickering at 2019-03-22T03:38:58Z Only run check-makefiles.py linter in testsuite dir - - - - - 322239de by Matthew Pickering at 2019-03-22T03:38:58Z Run linters on merge requests It seems that it has failed to execute at all since it was implemented. We now run the linters on merge requests. - - - - - 8f8d532c by Ben Gamari at 2019-03-22T03:45:03Z gitlab-ci: Do full `perf` build when building Windows releases - - - - - 2ef72d3f by Ben Gamari at 2019-03-22T03:45:03Z gitlab-ci: Pass --target explicitly to configure on Windows Otherwise configure fails in the 32-bit case with ``` This GHC (c:/GitLabRunner/builds/8fc0e283/0/ghc/ghc/toolchain/bin/ghc) does not generate code for the build platform GHC target platform : x86_64-unknown-mingw32 Desired build platform : i386-unknown-mingw32 ``` - - - - - 8b14f536 by Ben Gamari at 2019-03-22T03:51:08Z Bump cabal submodule Due to https://github.com/haskell/cabal/issues/5953. - - - - - dbe4557f by Matthew Pickering at 2019-03-22T14:02:32Z CI: Allow failure in packaging step This depends on the windows build which is still allowed to fail. If that job fails then the packaging job will also fail. - - - - - 366f1c68 by Ben Gamari at 2019-03-22T14:08:38Z gitlab: Deploy documentation snapshot via GitLab Pages - - - - - d608d543 by Tamar Christina at 2019-03-22T14:14:45Z Force LF line ending for md5sum [skip-ci] - - - - - cd07086a by Ben Gamari at 2019-03-22T14:34:51Z gitlab-ci: Fix linters - - - - - ab51bee4 by Herbert Valerio Riedel at 2019-03-22T14:34:51Z base: Remove `Monad(fail)` method and reexport `MonadFail(fail)` instead As per https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail Coauthored-by: Ben Gamari <ben at well-typed.com> - - - - - 266b49ca by Ben Gamari at 2019-03-22T22:33:20Z gitlab-ci: Clean up linter I'm not sure why these steps were done but they seem counterproductive and unnecessary. - - - - - 44b08ede by Ben Gamari at 2019-03-22T22:38:11Z gitlab-ci: Fix YAML syntax - - - - - 971f4530 by Ben Gamari at 2019-03-22T22:49:34Z gitlab-ci: Compute merge base against remote tracking branch Previously we would use the local branch with the name `$CI_MERGE_REQUEST_TARGET_BRANCH_NAME` to compute the merge base when linting. However, this branch isn't necessarily up-to-date. We should rather use `origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME`. - - - - - 8d01b572 by Ben Gamari at 2019-03-23T16:37:56Z gitlab-ci: Explicitly fetch target branch `git fetch`, which we used previously, doesn't update the remote tracking branches. - - - - - cd85f8a7 by Ben Gamari at 2019-03-24T12:46:13Z gitlab-ci: Allow linters to fail for now They are broken and I don't have time to fix them at the moment. - - - - - d763b2e7 by Haskell-mouse at 2019-03-25T18:02:22Z User's Guide: extensions compatibility Adds the mention that extensions "AllowAmbiguousTypes" and "RankNTypes" are not always compatible with each other. Specifies the conditions and causes of failing in resolving of ambiguity. - - - - - 200d65ef by Matthew Pickering at 2019-03-25T18:02:25Z Check hadrian/ghci.sh script output to determine pass/fail ghci always exits with exit code 0 so you have to check the output to see if the modules loaded succesfully. - - - - - 8e07368f by Matthew Pickering at 2019-03-25T18:02:27Z Refactor ./hadrian/ghci.sh for better error messages By separating these two lines, if the first command fails then `ghci` is not loaded. Before it would still load ghci but display lots of errors about not being able to find modules. - - - - - 3769e3a8 by Takenobu Tani at 2019-03-25T18:02:29Z Update Wiki URLs to point to GitLab This moves all URL references to Trac Wiki to their corresponding GitLab counterparts. This substitution is classified as follows: 1. Automated substitution using sed with Ben's mapping rule [1] Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy... New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy... 2. Manual substitution for URLs containing `#` index Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz 3. Manual substitution for strings starting with `Commentary` Old: Commentary/XxxYyy... New: commentary/xxx-yyy... See also !539 [1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json - - - - - b9da2868 by Ryan Scott at 2019-03-25T18:02:33Z Correct duplicate 4.12.0.0 entry in base's changelog See #16490. [ci skip] - - - - - ab41c1b4 by Andrey Mokhov at 2019-03-27T11:20:03Z Hadrian: Bump Shake to 0.17.6 The new release of Shake comes with these relevant features: * use symlinks for --shared * add --compact for a Bazel/Buck style output - - - - - 646f2e79 by Andrey Mokhov at 2019-03-27T11:20:03Z Hadrian: trace the execution of expensive Cabal calls We use Cabal to parse, configure, register and copy packages, which are expensive operations that are currently not visible to Shake's profiling infrastructure. By using `traced` we tell Shake to add these IO actions to the profiling report, helping us to identify performance bottlenecks. We use short tracing keys, as recommended in Shake docs: the name of the current target is already available in the rest of the profiling information. - - - - - fb12f53c by Alp Mestanogullari at 2019-03-27T11:20:05Z Hadrian: introduce an easy way for users to build with -split-sections Any user can now trivially build any number of Haskell packages with `-split-sections` by using `splitSections`/`splitSectionsIf` on any existing or new flavour: -- build all packages but the ghc library with -split-sections splitSections :: Flavour -> Flavour -- build all packages that satisfy the given predicate -- with --split-sections splitSectionsIf :: (Package -> Bool) -> Flavour -> Flavour See the new section in `doc/user-settings.md`. - - - - - 3dec527a by David Eichmann at 2019-03-27T11:20:09Z Hadrian: don't use -zorigin on darwin. - - - - - 5730f863 by Ömer Sinan Ağacan at 2019-03-27T11:20:10Z Minor refactoring in copy array primops: - `emitCopySmallArray` now checks size before generating code and doesn't generate any code when size is 0. `emitCopyArray` already does this so this makes small/large array cases the same in argument checking. - In both `emitCopySmallArray` and `emitCopyArray` read the `dflags` after checking the argument. - - - - - 4acdb769 by Chaitanya Koparkar at 2019-03-27T11:20:11Z Fix a few broken Trac links [skip ci] This patch only attempts to fix links that don't automatically re-direct to the correct URL. - - - - - 97ad5cfb by Artem Pelenitsyn at 2019-03-29T18:18:12Z Add some tips to the Troubleshooting section of README - - - - - 8a20bfc2 by Michael Peyton Jones at 2019-03-29T18:18:14Z Visibility: handle multiple units with the same name Fixes #16228. The included test case is adapted from the reproduction in the issue, and fails without this patch. ------ We compute an initial visilibity mapping for units based on what is present in the package databases. To seed this, we compute a set of all the package configs to add visibilities for. However, this set was keyed off the unit's *package name*. This is correct, since we compare packages across databases by version. However, we would only ever consider a single, most-preferable unit from the database in which it was found. The effect of this was that only one of the libraries in a Cabal package would be added to this initial set. This would cause attempts to use modules from the omitted libraries to fail, claiming that the package was hidden (even though `ghc-pkg` would correctly show it as visible). A solution is to do the selection of the most preferable packages separately, and then be sure to consider exposing all units in the same package in the same package db. We can do this by picking a most-preferable unit for each package name, and then considering exposing all units that are equi-preferable with that unit. ------ Why wasn't this bug apparent to all people trying to use sub-libraries in Cabal? The answer is that Cabal explicitly passes `-package` and `-package-id` flags for all the packages it wants to use, rather than relying on the state of the package database. So this bug only really affects people who are trying to use package databases produced by Cabal outside of Cabal itself. One particular example of this is the way that the Nixpkgs Haskell infrastructure provides wrapped GHCs: typically these are equipped with a package database containing all the needed package dependencies, and the user is not expected to pass `-package` flags explicitly. - - - - - 754b5455 by Artem Pelenitsyn at 2019-03-29T18:18:20Z docs: make nfib compute the Fibonacci sequence [skipci] - - - - - 1a567133 by Ben Gamari at 2019-03-29T18:18:20Z ci: Check that changelogs don't contain "TBA" This ensures that the release dates in the library changelogs are properly set. - - - - - 6e15ca54 by Ben Gamari at 2019-03-29T18:18:22Z Bump transformers to 0.5.6.2 See #16199. - - - - - 6f7115df by Ben Gamari at 2019-03-30T11:42:38Z ci: Ensure index.html is preserved in documentation tarball - - - - - 33173a51 by Alexandre at 2019-04-01T07:32:28Z Add support for bitreverse primop This commit includes the necessary changes in code and documentation to support a primop that reverses a word's bits. It also includes a test. - - - - - a3971b4e by Alexandre at 2019-04-01T07:32:28Z Bump ghc-prim's version where needed - - - - - 061276ea by Michael Sloan at 2019-04-01T07:32:30Z Remove unnecessary uses of UnboxedTuples pragma (see #13101 / #15454) Also removes a couple unnecessary MagicHash pragmas - - - - - e468c613 by David Eichmann at 2019-04-01T07:32:34Z Support Shake's --lint-fsatrace feature. Using this feature requires fsatrace (e.g. https://github.com/jacereda/fsatrace). Simply use the `--lint-fsatrace` option when running hadrian. Shake version >= 0.17.7 is required to support linting out of tree build dirs. - - - - - 1e9e4197 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add merge request template for backports for 8.8 - - - - - 55650d14 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add some simply issue templates - - - - - 27b99ed8 by Takenobu Tani at 2019-04-01T07:32:36Z Clean up URLs to point to GitLab This moves URL references to old Trac to their corresponding GitLab counterparts. This patch does not update the submodule library, such as libraries/Cabal. See also !539, !606, !618 [ci skip] - - - - - 18d1555d by Adam Sandberg Eriksson at 2019-04-01T07:32:38Z configure: document the use of the LD variable - - - - - 10352efa by Ben Gamari at 2019-04-01T22:22:34Z gitlab: Add feature request MR template - - - - - 1e52054b by Ben Gamari at 2019-04-01T23:16:21Z gitlab: Move feature request template to issue_templates Whoops. - - - - - e5c21ca9 by Ben Gamari at 2019-04-01T23:16:25Z gitlab: Mention ~"user facing" label - - - - - 39282422 by Ryan Scott at 2019-04-02T00:01:38Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - 7cf5ba3d by Michal Terepeta at 2019-04-02T00:07:49Z Improve performance of newSmallArray# This: - Hoists part of the condition outside of the initialization loop in `stg_newSmallArrayzh`. - Annotates one of the unlikely branches as unlikely, also in `stg_newSmallArrayzh`. - Adds a couple of annotations to `allocateMightFail` indicating which branches are likely to be taken. Together this gives about 5% improvement. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - dd9c82ef by David Eichmann at 2019-04-02T00:13:55Z Hadrian: correct deps for ghc builder. Previously, when needing ghc as a builder, the ghcDeps (Files the GHC binary depends on) for the current stage were needed. This is incorrect as the previous stage's ghc is used for building. This commit fixes the issue, needing the previous stage's ghcDeps. - - - - - 345306d3 by Alexandre Baldé at 2019-04-02T16:34:30Z Fix formatting issue in ghc-prim's changelog [skip ci] - - - - - f54b5124 by David Eichmann at 2019-04-02T16:40:39Z Hadrian: traceAllow deep dependencies when compilling haskell object files. - - - - - d132b30a by David Eichmann at 2019-04-02T16:40:39Z Hadrian: lint ignore autom4te and ghc-pkg cache files. - - - - - bf734195 by Simon Marlow at 2019-04-02T16:46:46Z Add myself to libraries/ghci - - - - - 5a75ccd0 by klebinger.andreas at gmx.at at 2019-04-03T04:34:57Z Fix faulty substitutions in StgCse (#11532). `substBndr` should rename bindings which shadow existing ids. However while it was renaming the bindings it was not adding proper substitutions for renamed bindings. Instead of adding a substitution of the form `old -> new` for renamed bindings it mistakenly added `old -> old` if no replacement had taken place while adding none if `old` had been renamed. As a byproduct this should improve performance, as we no longer add useless substitutions for unshadowed bindings. - - - - - 2ec749b5 by Nathan Collins at 2019-04-03T04:41:05Z users-guide: Fix typo - - - - - ea192a09 by Andrew Martin at 2019-04-03T04:41:05Z base: Add documentation that liftA2 used to not be a typeclass method - - - - - 733f1b52 by Frank Steffahn at 2019-04-03T04:41:05Z users-guide: Typo in Users Guide, Glasgow Exts - - - - - 3364def0 by Ben Gamari at 2019-04-03T04:41:05Z integer-gmp: Write friendlier documentation for Integer - - - - - dd3a3d08 by Ben Gamari at 2019-04-03T04:41:05Z integer-simple: Add documentation for Integer type - - - - - 722fdddf by Chris Martin at 2019-04-03T04:41:05Z Correct two misspellings of "separately" - - - - - bf6dbe3d by Chris Martin at 2019-04-03T04:41:05Z Inline the definition of 'ap' in the Monad laws The law as it is currently written is meaningless, because nowhere have we defined the implementation of 'ap'. The reader of the Control.Monad documentation is provided with only a type signature, > ap :: Monad m => m (a -> b) -> m a -> m b an informal description, > In many situations, the liftM operations can be replaced by uses of > ap, which promotes function application. and a relationship between 'ap' and the 'liftM' functions > return f `ap` x1 `ap` ... `ap` xn > is equivalent to > liftMn f x1 x2 ... xn Without knowing how 'ap' is defined, a law involving 'ap' cannot provide any guidance for how to write a lawful Monad instance, nor can we conclude anything from the law. I suspect that a reader equipped with the understanding that 'ap' was defined prior to the invention of the Applicative class could deduce that 'ap' must be defined in terms of (>>=), but nowhere as far as I can tell have we written this down explicitly for readers without the benefit of historical context. If the law is meant to express a relationship among (<*>), (>>=), and 'return', it seems that it is better off making this statement directly, sidestepping 'ap' altogether. - - - - - 7b090b53 by Ben Gamari at 2019-04-03T07:57:40Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. - - - - - 4626cf21 by Sebastian Graf at 2019-04-03T08:03:47Z Fix Uncovered set of literal patterns Issues #16289 and #15713 are proof that the pattern match checker did an unsound job of estimating the value set abstraction corresponding to the uncovered set. The reason is that the fix from #11303 introducing `NLit` was incomplete: The `LitCon` case desugared to `Var` rather than `LitVar`, which would have done the necessary case splitting analogous to the `ConVar` case. This patch rectifies that by introducing the fresh unification variable in `LitCon` in value abstraction position rather than pattern postition, recording a constraint equating it to the constructor expression rather than the literal. Fixes #16289 and #15713. - - - - - 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - b4076930 by John Ericson at 2019-04-04T17:02:27Z settings.in: Reformat We're about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. - - - - - 5e5e3ea7 by John Ericson at 2019-04-04T17:02:27Z Generate settings by make/hadrian instead of configure This allows it to eventually become stage-specific - - - - - 855ce3d2 by John Ericson at 2019-04-04T17:02:27Z Remove settings.in It is no longer needed - - - - - feafe679 by Joachim Breitner at 2019-04-04T22:35:16Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - ce8bf6b9 by John Ericson at 2019-04-04T22:36:24Z Remove cGhcEnableTablesNextToCode Get "Tables next to code" from the settings file instead. - - - - - 30 changed files: - .circleci/config.yml - + .gitattributes - .gitignore - .gitlab-ci.yml - + .gitlab/issue_templates/bug.md - + .gitlab/issue_templates/feature_request.md - .gitlab/linters/check-makefiles.py - + .gitlab/merge_request_templates/backport-for-8.8.md - .gitlab/merge_request_templates/merge-request.md - + .gitlab/start-head.hackage.sh - .gitlab/win32-init.sh - .mailmap - ANNOUNCE - CODEOWNERS - HACKING.md - INSTALL.md - MAKEHELP.md - Makefile - README.md - aclocal.m4 - bindisttest/Makefile - bindisttest/ghc.mk - boot - compiler/Makefile - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/Lexeme.hs - compiler/basicTypes/MkId.hs - compiler/basicTypes/Module.hs - compiler/basicTypes/NameCache.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/76642e9a71d1cb6e11c5a04ed043309341071796...ce8bf6b948d8b6d4f06a760ce15f20ef17ffd23e -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/76642e9a71d1cb6e11c5a04ed043309341071796...ce8bf6b948d8b6d4f06a760ce15f20ef17ffd23e You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 20:13:26 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 16:13:26 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Fix #16518 with some more kind-splitting smarts Message-ID: <5ca66566638f6_62b33fa2c61251f81587397@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 9 changed files: - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/types/Type.hs - + testsuite/tests/deriving/should_compile/T16518.hs - testsuite/tests/deriving/should_compile/all.T - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== compiler/hsSyn/HsUtils.hs ===================================== @@ -106,7 +106,7 @@ import TcEvidence import RdrName import Var import TyCoRep -import Type ( tyConArgFlags ) +import Type ( appTyArgFlags, splitAppTys, tyConArgFlags ) import TysWiredIn ( unitTy ) import TcType import DataCon @@ -665,7 +665,6 @@ typeToLHsType ty , hst_xforall = noExt , hst_body = go tau }) go (TyVarTy tv) = nlHsTyVar (getRdrName tv) - go (AppTy t1 t2) = nlHsAppTy (go t1) (go t2) go (LitTy (NumTyLit n)) = noLoc $ HsTyLit NoExt (HsNumTy NoSourceText n) go (LitTy (StrTyLit s)) @@ -674,27 +673,35 @@ typeToLHsType ty | tyConAppNeedsKindSig True tc (length args) -- We must produce an explicit kind signature here to make certain -- programs kind-check. See Note [Kind signatures in typeToLHsType]. - = nlHsParTy $ noLoc $ HsKindSig NoExt lhs_ty (go (tcTypeKind ty)) - | otherwise = lhs_ty + = nlHsParTy $ noLoc $ HsKindSig NoExt ty' (go (tcTypeKind ty)) + | otherwise = ty' where - arg_flags :: [ArgFlag] - arg_flags = tyConArgFlags tc args - - lhs_ty :: LHsType GhcPs - lhs_ty = foldl' (\f (arg, flag) -> - let arg' = go arg in - case flag of - Inferred -> f - Specified -> f `nlHsAppKindTy` arg' - Required -> f `nlHsAppTy` arg') - (nlHsTyVar (getRdrName tc)) - (zip args arg_flags) + ty' :: LHsType GhcPs + ty' = go_app (nlHsTyVar (getRdrName tc)) args (tyConArgFlags tc args) + go ty@(AppTy {}) = go_app (go head) args (appTyArgFlags head args) + where + head :: Type + args :: [Type] + (head, args) = splitAppTys ty go (CastTy ty _) = go ty go (CoercionTy co) = pprPanic "toLHsSigWcType" (ppr co) -- Source-language types have _invisible_ kind arguments, -- so we must remove them here (#8563) + go_app :: LHsType GhcPs -- The type being applied + -> [Type] -- The argument types + -> [ArgFlag] -- The argument types' visibilities + -> LHsType GhcPs + go_app head args arg_flags = + foldl' (\f (arg, flag) -> + let arg' = go arg in + case flag of + Inferred -> f + Specified -> f `nlHsAppKindTy` arg' + Required -> f `nlHsAppTy` arg') + head (zip args arg_flags) + go_tv :: TyVar -> LHsTyVarBndr GhcPs go_tv tv = noLoc $ KindedTyVar noExt (noLoc (getRdrName tv)) (go (tyVarKind tv)) ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== compiler/types/Type.hs ===================================== @@ -1698,6 +1698,21 @@ fun_kind_arg_flags = go emptyTCvSubst subst' = extendTvSubst subst tv arg_ty go subst (TyVarTy tv) arg_tys | Just ki <- lookupTyVar subst tv = go subst ki arg_tys + -- This FunTy case is important to handle kinds with nested foralls, such + -- as this kind (inspired by #16518): + -- + -- forall {k1} k2. k1 -> k2 -> forall k3. k3 -> Type + -- + -- Here, we want to get the following ArgFlags: + -- + -- [Inferred, Specified, Required, Required, Specified, Required] + -- forall {k1}. forall k2. k1 -> k2 -> forall k3. k3 -> Type + go subst (FunTy{ft_af = af, ft_res = res_ki}) (_:arg_tys) + = argf : go subst res_ki arg_tys + where + argf = case af of + VisArg -> Required + InvisArg -> Inferred go _ _ arg_tys = map (const Required) arg_tys -- something is ill-kinded. But this can happen -- when printing errors. Assume everything is Required. ===================================== testsuite/tests/deriving/should_compile/T16518.hs ===================================== @@ -0,0 +1,36 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# LANGUAGE InstanceSigs #-} +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE RankNTypes #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE TypeOperators #-} +module T16518 where + +import Data.Coerce +import Data.Kind +import Data.Type.Equality + +----- + +class HTestEquality1 (f :: forall k. k -> Type) where + hTestEquality1 :: forall k1 k2 (a :: k1) (b :: k2). + f a -> f b -> Maybe (a :~~: b) +newtype T1 :: (forall k. k -> Type) -> (forall k. k -> Type) where + MkT1 :: forall (f :: forall k. k -> Type) k (a :: k). f a -> T1 f a + +deriving instance forall (f :: forall k. k -> Type). + HTestEquality1 f => HTestEquality1 (T1 f) + +----- + +class HTestEquality2 (f :: forall k -> k -> Type) where + hTestEquality2 :: forall k1 k2 (a :: k1) (b :: k2). + f k1 a -> f k2 b -> Maybe (a :~~: b) +newtype T2 :: (forall k -> k -> Type) -> (forall k -> k -> Type) where + MkT2 :: forall (f :: forall k -> k -> Type) k (a :: k). f k a -> T2 f k a + +deriving instance forall (f :: forall k -> k -> Type). + HTestEquality2 f => HTestEquality2 (T2 f) ===================================== testsuite/tests/deriving/should_compile/all.T ===================================== @@ -115,3 +115,4 @@ test('T15290d', normal, compile, ['']) test('T15398', normal, compile, ['']) test('T15637', normal, compile, ['']) test('T16179', normal, compile, ['']) +test('T16518', normal, compile, ['']) ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -294,3 +294,4 @@ test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) test('T14828', normal, ghci_script, ['T14828.script']) test('T16376', normal, ghci_script, ['T16376.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 19:16:34 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 15:16:34 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Fix #16518 with some more kind-splitting smarts Message-ID: <5ca65812f01b6_62b33fa2c48e2eb0158416@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 9 changed files: - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/types/Type.hs - + testsuite/tests/deriving/should_compile/T16518.hs - testsuite/tests/deriving/should_compile/all.T - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== compiler/hsSyn/HsUtils.hs ===================================== @@ -106,7 +106,7 @@ import TcEvidence import RdrName import Var import TyCoRep -import Type ( tyConArgFlags ) +import Type ( appTyArgFlags, splitAppTys, tyConArgFlags ) import TysWiredIn ( unitTy ) import TcType import DataCon @@ -665,7 +665,6 @@ typeToLHsType ty , hst_xforall = noExt , hst_body = go tau }) go (TyVarTy tv) = nlHsTyVar (getRdrName tv) - go (AppTy t1 t2) = nlHsAppTy (go t1) (go t2) go (LitTy (NumTyLit n)) = noLoc $ HsTyLit NoExt (HsNumTy NoSourceText n) go (LitTy (StrTyLit s)) @@ -674,27 +673,35 @@ typeToLHsType ty | tyConAppNeedsKindSig True tc (length args) -- We must produce an explicit kind signature here to make certain -- programs kind-check. See Note [Kind signatures in typeToLHsType]. - = nlHsParTy $ noLoc $ HsKindSig NoExt lhs_ty (go (tcTypeKind ty)) - | otherwise = lhs_ty + = nlHsParTy $ noLoc $ HsKindSig NoExt ty' (go (tcTypeKind ty)) + | otherwise = ty' where - arg_flags :: [ArgFlag] - arg_flags = tyConArgFlags tc args - - lhs_ty :: LHsType GhcPs - lhs_ty = foldl' (\f (arg, flag) -> - let arg' = go arg in - case flag of - Inferred -> f - Specified -> f `nlHsAppKindTy` arg' - Required -> f `nlHsAppTy` arg') - (nlHsTyVar (getRdrName tc)) - (zip args arg_flags) + ty' :: LHsType GhcPs + ty' = go_app (nlHsTyVar (getRdrName tc)) args (tyConArgFlags tc args) + go ty@(AppTy {}) = go_app (go head) args (appTyArgFlags head args) + where + head :: Type + args :: [Type] + (head, args) = splitAppTys ty go (CastTy ty _) = go ty go (CoercionTy co) = pprPanic "toLHsSigWcType" (ppr co) -- Source-language types have _invisible_ kind arguments, -- so we must remove them here (#8563) + go_app :: LHsType GhcPs -- The type being applied + -> [Type] -- The argument types + -> [ArgFlag] -- The argument types' visibilities + -> LHsType GhcPs + go_app head args arg_flags = + foldl' (\f (arg, flag) -> + let arg' = go arg in + case flag of + Inferred -> f + Specified -> f `nlHsAppKindTy` arg' + Required -> f `nlHsAppTy` arg') + head (zip args arg_flags) + go_tv :: TyVar -> LHsTyVarBndr GhcPs go_tv tv = noLoc $ KindedTyVar noExt (noLoc (getRdrName tv)) (go (tyVarKind tv)) ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== compiler/types/Type.hs ===================================== @@ -1698,6 +1698,21 @@ fun_kind_arg_flags = go emptyTCvSubst subst' = extendTvSubst subst tv arg_ty go subst (TyVarTy tv) arg_tys | Just ki <- lookupTyVar subst tv = go subst ki arg_tys + -- This FunTy case is important to handle kinds with nested foralls, such + -- as this kind (inspired by #16518): + -- + -- forall {k1} k2. k1 -> k2 -> forall k3. k3 -> Type + -- + -- Here, we want to get the following ArgFlags: + -- + -- [Inferred, Specified, Required, Required, Specified, Required] + -- forall {k1}. forall k2. k1 -> k2 -> forall k3. k3 -> Type + go subst (FunTy{ft_af = af, ft_res = res_ki}) (_:arg_tys) + = argf : go subst res_ki arg_tys + where + argf = case af of + VisArg -> Required + InvisArg -> Inferred go _ _ arg_tys = map (const Required) arg_tys -- something is ill-kinded. But this can happen -- when printing errors. Assume everything is Required. ===================================== testsuite/tests/deriving/should_compile/T16518.hs ===================================== @@ -0,0 +1,36 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# LANGUAGE InstanceSigs #-} +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE RankNTypes #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE TypeOperators #-} +module T16518 where + +import Data.Coerce +import Data.Kind +import Data.Type.Equality + +----- + +class HTestEquality1 (f :: forall k. k -> Type) where + hTestEquality1 :: forall k1 k2 (a :: k1) (b :: k2). + f a -> f b -> Maybe (a :~~: b) +newtype T1 :: (forall k. k -> Type) -> (forall k. k -> Type) where + MkT1 :: forall (f :: forall k. k -> Type) k (a :: k). f a -> T1 f a + +deriving instance forall (f :: forall k. k -> Type). + HTestEquality1 f => HTestEquality1 (T1 f) + +----- + +class HTestEquality2 (f :: forall k -> k -> Type) where + hTestEquality2 :: forall k1 k2 (a :: k1) (b :: k2). + f k1 a -> f k2 b -> Maybe (a :~~: b) +newtype T2 :: (forall k -> k -> Type) -> (forall k -> k -> Type) where + MkT2 :: forall (f :: forall k -> k -> Type) k (a :: k). f k a -> T2 f k a + +deriving instance forall (f :: forall k -> k -> Type). + HTestEquality2 f => HTestEquality2 (T2 f) ===================================== testsuite/tests/deriving/should_compile/all.T ===================================== @@ -115,3 +115,4 @@ test('T15290d', normal, compile, ['']) test('T15398', normal, compile, ['']) test('T15637', normal, compile, ['']) test('T16179', normal, compile, ['']) +test('T16518', normal, compile, ['']) ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -294,3 +294,4 @@ test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) test('T14828', normal, ghci_script, ['T14828.script']) test('T16376', normal, ghci_script, ['T16376.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 19:17:40 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 15:17:40 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Fix #16518 with some more kind-splitting smarts Message-ID: <5ca65854d17cb_62b33fa2c48e2eb015843c0@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 9 changed files: - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/types/Type.hs - + testsuite/tests/deriving/should_compile/T16518.hs - testsuite/tests/deriving/should_compile/all.T - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== compiler/hsSyn/HsUtils.hs ===================================== @@ -106,7 +106,7 @@ import TcEvidence import RdrName import Var import TyCoRep -import Type ( tyConArgFlags ) +import Type ( appTyArgFlags, splitAppTys, tyConArgFlags ) import TysWiredIn ( unitTy ) import TcType import DataCon @@ -665,7 +665,6 @@ typeToLHsType ty , hst_xforall = noExt , hst_body = go tau }) go (TyVarTy tv) = nlHsTyVar (getRdrName tv) - go (AppTy t1 t2) = nlHsAppTy (go t1) (go t2) go (LitTy (NumTyLit n)) = noLoc $ HsTyLit NoExt (HsNumTy NoSourceText n) go (LitTy (StrTyLit s)) @@ -674,27 +673,35 @@ typeToLHsType ty | tyConAppNeedsKindSig True tc (length args) -- We must produce an explicit kind signature here to make certain -- programs kind-check. See Note [Kind signatures in typeToLHsType]. - = nlHsParTy $ noLoc $ HsKindSig NoExt lhs_ty (go (tcTypeKind ty)) - | otherwise = lhs_ty + = nlHsParTy $ noLoc $ HsKindSig NoExt ty' (go (tcTypeKind ty)) + | otherwise = ty' where - arg_flags :: [ArgFlag] - arg_flags = tyConArgFlags tc args - - lhs_ty :: LHsType GhcPs - lhs_ty = foldl' (\f (arg, flag) -> - let arg' = go arg in - case flag of - Inferred -> f - Specified -> f `nlHsAppKindTy` arg' - Required -> f `nlHsAppTy` arg') - (nlHsTyVar (getRdrName tc)) - (zip args arg_flags) + ty' :: LHsType GhcPs + ty' = go_app (nlHsTyVar (getRdrName tc)) args (tyConArgFlags tc args) + go ty@(AppTy {}) = go_app (go head) args (appTyArgFlags head args) + where + head :: Type + args :: [Type] + (head, args) = splitAppTys ty go (CastTy ty _) = go ty go (CoercionTy co) = pprPanic "toLHsSigWcType" (ppr co) -- Source-language types have _invisible_ kind arguments, -- so we must remove them here (#8563) + go_app :: LHsType GhcPs -- The type being applied + -> [Type] -- The argument types + -> [ArgFlag] -- The argument types' visibilities + -> LHsType GhcPs + go_app head args arg_flags = + foldl' (\f (arg, flag) -> + let arg' = go arg in + case flag of + Inferred -> f + Specified -> f `nlHsAppKindTy` arg' + Required -> f `nlHsAppTy` arg') + head (zip args arg_flags) + go_tv :: TyVar -> LHsTyVarBndr GhcPs go_tv tv = noLoc $ KindedTyVar noExt (noLoc (getRdrName tv)) (go (tyVarKind tv)) ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== compiler/types/Type.hs ===================================== @@ -1698,6 +1698,21 @@ fun_kind_arg_flags = go emptyTCvSubst subst' = extendTvSubst subst tv arg_ty go subst (TyVarTy tv) arg_tys | Just ki <- lookupTyVar subst tv = go subst ki arg_tys + -- This FunTy case is important to handle kinds with nested foralls, such + -- as this kind (inspired by #16518): + -- + -- forall {k1} k2. k1 -> k2 -> forall k3. k3 -> Type + -- + -- Here, we want to get the following ArgFlags: + -- + -- [Inferred, Specified, Required, Required, Specified, Required] + -- forall {k1}. forall k2. k1 -> k2 -> forall k3. k3 -> Type + go subst (FunTy{ft_af = af, ft_res = res_ki}) (_:arg_tys) + = argf : go subst res_ki arg_tys + where + argf = case af of + VisArg -> Required + InvisArg -> Inferred go _ _ arg_tys = map (const Required) arg_tys -- something is ill-kinded. But this can happen -- when printing errors. Assume everything is Required. ===================================== testsuite/tests/deriving/should_compile/T16518.hs ===================================== @@ -0,0 +1,36 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# LANGUAGE InstanceSigs #-} +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE RankNTypes #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE TypeOperators #-} +module T16518 where + +import Data.Coerce +import Data.Kind +import Data.Type.Equality + +----- + +class HTestEquality1 (f :: forall k. k -> Type) where + hTestEquality1 :: forall k1 k2 (a :: k1) (b :: k2). + f a -> f b -> Maybe (a :~~: b) +newtype T1 :: (forall k. k -> Type) -> (forall k. k -> Type) where + MkT1 :: forall (f :: forall k. k -> Type) k (a :: k). f a -> T1 f a + +deriving instance forall (f :: forall k. k -> Type). + HTestEquality1 f => HTestEquality1 (T1 f) + +----- + +class HTestEquality2 (f :: forall k -> k -> Type) where + hTestEquality2 :: forall k1 k2 (a :: k1) (b :: k2). + f k1 a -> f k2 b -> Maybe (a :~~: b) +newtype T2 :: (forall k -> k -> Type) -> (forall k -> k -> Type) where + MkT2 :: forall (f :: forall k -> k -> Type) k (a :: k). f k a -> T2 f k a + +deriving instance forall (f :: forall k -> k -> Type). + HTestEquality2 f => HTestEquality2 (T2 f) ===================================== testsuite/tests/deriving/should_compile/all.T ===================================== @@ -115,3 +115,4 @@ test('T15290d', normal, compile, ['']) test('T15398', normal, compile, ['']) test('T15637', normal, compile, ['']) test('T16179', normal, compile, ['']) +test('T16518', normal, compile, ['']) ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -294,3 +294,4 @@ test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) test('T14828', normal, ghci_script, ['T14828.script']) test('T16376', normal, ghci_script, ['T16376.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 19:15:14 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 15:15:14 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Fix #16518 with some more kind-splitting smarts Message-ID: <5ca657c22cf67_62b33fa2ec9147f81583813@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 9 changed files: - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/types/Type.hs - + testsuite/tests/deriving/should_compile/T16518.hs - testsuite/tests/deriving/should_compile/all.T - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== compiler/hsSyn/HsUtils.hs ===================================== @@ -106,7 +106,7 @@ import TcEvidence import RdrName import Var import TyCoRep -import Type ( tyConArgFlags ) +import Type ( appTyArgFlags, splitAppTys, tyConArgFlags ) import TysWiredIn ( unitTy ) import TcType import DataCon @@ -665,7 +665,6 @@ typeToLHsType ty , hst_xforall = noExt , hst_body = go tau }) go (TyVarTy tv) = nlHsTyVar (getRdrName tv) - go (AppTy t1 t2) = nlHsAppTy (go t1) (go t2) go (LitTy (NumTyLit n)) = noLoc $ HsTyLit NoExt (HsNumTy NoSourceText n) go (LitTy (StrTyLit s)) @@ -674,27 +673,35 @@ typeToLHsType ty | tyConAppNeedsKindSig True tc (length args) -- We must produce an explicit kind signature here to make certain -- programs kind-check. See Note [Kind signatures in typeToLHsType]. - = nlHsParTy $ noLoc $ HsKindSig NoExt lhs_ty (go (tcTypeKind ty)) - | otherwise = lhs_ty + = nlHsParTy $ noLoc $ HsKindSig NoExt ty' (go (tcTypeKind ty)) + | otherwise = ty' where - arg_flags :: [ArgFlag] - arg_flags = tyConArgFlags tc args - - lhs_ty :: LHsType GhcPs - lhs_ty = foldl' (\f (arg, flag) -> - let arg' = go arg in - case flag of - Inferred -> f - Specified -> f `nlHsAppKindTy` arg' - Required -> f `nlHsAppTy` arg') - (nlHsTyVar (getRdrName tc)) - (zip args arg_flags) + ty' :: LHsType GhcPs + ty' = go_app (nlHsTyVar (getRdrName tc)) args (tyConArgFlags tc args) + go ty@(AppTy {}) = go_app (go head) args (appTyArgFlags head args) + where + head :: Type + args :: [Type] + (head, args) = splitAppTys ty go (CastTy ty _) = go ty go (CoercionTy co) = pprPanic "toLHsSigWcType" (ppr co) -- Source-language types have _invisible_ kind arguments, -- so we must remove them here (#8563) + go_app :: LHsType GhcPs -- The type being applied + -> [Type] -- The argument types + -> [ArgFlag] -- The argument types' visibilities + -> LHsType GhcPs + go_app head args arg_flags = + foldl' (\f (arg, flag) -> + let arg' = go arg in + case flag of + Inferred -> f + Specified -> f `nlHsAppKindTy` arg' + Required -> f `nlHsAppTy` arg') + head (zip args arg_flags) + go_tv :: TyVar -> LHsTyVarBndr GhcPs go_tv tv = noLoc $ KindedTyVar noExt (noLoc (getRdrName tv)) (go (tyVarKind tv)) ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== compiler/types/Type.hs ===================================== @@ -1698,6 +1698,21 @@ fun_kind_arg_flags = go emptyTCvSubst subst' = extendTvSubst subst tv arg_ty go subst (TyVarTy tv) arg_tys | Just ki <- lookupTyVar subst tv = go subst ki arg_tys + -- This FunTy case is important to handle kinds with nested foralls, such + -- as this kind (inspired by #16518): + -- + -- forall {k1} k2. k1 -> k2 -> forall k3. k3 -> Type + -- + -- Here, we want to get the following ArgFlags: + -- + -- [Inferred, Specified, Required, Required, Specified, Required] + -- forall {k1}. forall k2. k1 -> k2 -> forall k3. k3 -> Type + go subst (FunTy{ft_af = af, ft_res = res_ki}) (_:arg_tys) + = argf : go subst res_ki arg_tys + where + argf = case af of + VisArg -> Required + InvisArg -> Inferred go _ _ arg_tys = map (const Required) arg_tys -- something is ill-kinded. But this can happen -- when printing errors. Assume everything is Required. ===================================== testsuite/tests/deriving/should_compile/T16518.hs ===================================== @@ -0,0 +1,36 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# LANGUAGE InstanceSigs #-} +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE RankNTypes #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE TypeOperators #-} +module T16518 where + +import Data.Coerce +import Data.Kind +import Data.Type.Equality + +----- + +class HTestEquality1 (f :: forall k. k -> Type) where + hTestEquality1 :: forall k1 k2 (a :: k1) (b :: k2). + f a -> f b -> Maybe (a :~~: b) +newtype T1 :: (forall k. k -> Type) -> (forall k. k -> Type) where + MkT1 :: forall (f :: forall k. k -> Type) k (a :: k). f a -> T1 f a + +deriving instance forall (f :: forall k. k -> Type). + HTestEquality1 f => HTestEquality1 (T1 f) + +----- + +class HTestEquality2 (f :: forall k -> k -> Type) where + hTestEquality2 :: forall k1 k2 (a :: k1) (b :: k2). + f k1 a -> f k2 b -> Maybe (a :~~: b) +newtype T2 :: (forall k -> k -> Type) -> (forall k -> k -> Type) where + MkT2 :: forall (f :: forall k -> k -> Type) k (a :: k). f k a -> T2 f k a + +deriving instance forall (f :: forall k -> k -> Type). + HTestEquality2 f => HTestEquality2 (T2 f) ===================================== testsuite/tests/deriving/should_compile/all.T ===================================== @@ -115,3 +115,4 @@ test('T15290d', normal, compile, ['']) test('T15398', normal, compile, ['']) test('T15637', normal, compile, ['']) test('T16179', normal, compile, ['']) +test('T16518', normal, compile, ['']) ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -294,3 +294,4 @@ test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) test('T14828', normal, ghci_script, ['T14828.script']) test('T16376', normal, ghci_script, ['T16376.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 19:21:16 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 15:21:16 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Fix #16518 with some more kind-splitting smarts Message-ID: <5ca6592c86099_62b33fa2ee0d5dc4158522@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 9 changed files: - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/types/Type.hs - + testsuite/tests/deriving/should_compile/T16518.hs - testsuite/tests/deriving/should_compile/all.T - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== compiler/hsSyn/HsUtils.hs ===================================== @@ -106,7 +106,7 @@ import TcEvidence import RdrName import Var import TyCoRep -import Type ( tyConArgFlags ) +import Type ( appTyArgFlags, splitAppTys, tyConArgFlags ) import TysWiredIn ( unitTy ) import TcType import DataCon @@ -665,7 +665,6 @@ typeToLHsType ty , hst_xforall = noExt , hst_body = go tau }) go (TyVarTy tv) = nlHsTyVar (getRdrName tv) - go (AppTy t1 t2) = nlHsAppTy (go t1) (go t2) go (LitTy (NumTyLit n)) = noLoc $ HsTyLit NoExt (HsNumTy NoSourceText n) go (LitTy (StrTyLit s)) @@ -674,27 +673,35 @@ typeToLHsType ty | tyConAppNeedsKindSig True tc (length args) -- We must produce an explicit kind signature here to make certain -- programs kind-check. See Note [Kind signatures in typeToLHsType]. - = nlHsParTy $ noLoc $ HsKindSig NoExt lhs_ty (go (tcTypeKind ty)) - | otherwise = lhs_ty + = nlHsParTy $ noLoc $ HsKindSig NoExt ty' (go (tcTypeKind ty)) + | otherwise = ty' where - arg_flags :: [ArgFlag] - arg_flags = tyConArgFlags tc args - - lhs_ty :: LHsType GhcPs - lhs_ty = foldl' (\f (arg, flag) -> - let arg' = go arg in - case flag of - Inferred -> f - Specified -> f `nlHsAppKindTy` arg' - Required -> f `nlHsAppTy` arg') - (nlHsTyVar (getRdrName tc)) - (zip args arg_flags) + ty' :: LHsType GhcPs + ty' = go_app (nlHsTyVar (getRdrName tc)) args (tyConArgFlags tc args) + go ty@(AppTy {}) = go_app (go head) args (appTyArgFlags head args) + where + head :: Type + args :: [Type] + (head, args) = splitAppTys ty go (CastTy ty _) = go ty go (CoercionTy co) = pprPanic "toLHsSigWcType" (ppr co) -- Source-language types have _invisible_ kind arguments, -- so we must remove them here (#8563) + go_app :: LHsType GhcPs -- The type being applied + -> [Type] -- The argument types + -> [ArgFlag] -- The argument types' visibilities + -> LHsType GhcPs + go_app head args arg_flags = + foldl' (\f (arg, flag) -> + let arg' = go arg in + case flag of + Inferred -> f + Specified -> f `nlHsAppKindTy` arg' + Required -> f `nlHsAppTy` arg') + head (zip args arg_flags) + go_tv :: TyVar -> LHsTyVarBndr GhcPs go_tv tv = noLoc $ KindedTyVar noExt (noLoc (getRdrName tv)) (go (tyVarKind tv)) ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== compiler/types/Type.hs ===================================== @@ -1698,6 +1698,21 @@ fun_kind_arg_flags = go emptyTCvSubst subst' = extendTvSubst subst tv arg_ty go subst (TyVarTy tv) arg_tys | Just ki <- lookupTyVar subst tv = go subst ki arg_tys + -- This FunTy case is important to handle kinds with nested foralls, such + -- as this kind (inspired by #16518): + -- + -- forall {k1} k2. k1 -> k2 -> forall k3. k3 -> Type + -- + -- Here, we want to get the following ArgFlags: + -- + -- [Inferred, Specified, Required, Required, Specified, Required] + -- forall {k1}. forall k2. k1 -> k2 -> forall k3. k3 -> Type + go subst (FunTy{ft_af = af, ft_res = res_ki}) (_:arg_tys) + = argf : go subst res_ki arg_tys + where + argf = case af of + VisArg -> Required + InvisArg -> Inferred go _ _ arg_tys = map (const Required) arg_tys -- something is ill-kinded. But this can happen -- when printing errors. Assume everything is Required. ===================================== testsuite/tests/deriving/should_compile/T16518.hs ===================================== @@ -0,0 +1,36 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE GeneralizedNewtypeDeriving #-} +{-# LANGUAGE InstanceSigs #-} +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE RankNTypes #-} +{-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE TypeApplications #-} +{-# LANGUAGE TypeOperators #-} +module T16518 where + +import Data.Coerce +import Data.Kind +import Data.Type.Equality + +----- + +class HTestEquality1 (f :: forall k. k -> Type) where + hTestEquality1 :: forall k1 k2 (a :: k1) (b :: k2). + f a -> f b -> Maybe (a :~~: b) +newtype T1 :: (forall k. k -> Type) -> (forall k. k -> Type) where + MkT1 :: forall (f :: forall k. k -> Type) k (a :: k). f a -> T1 f a + +deriving instance forall (f :: forall k. k -> Type). + HTestEquality1 f => HTestEquality1 (T1 f) + +----- + +class HTestEquality2 (f :: forall k -> k -> Type) where + hTestEquality2 :: forall k1 k2 (a :: k1) (b :: k2). + f k1 a -> f k2 b -> Maybe (a :~~: b) +newtype T2 :: (forall k -> k -> Type) -> (forall k -> k -> Type) where + MkT2 :: forall (f :: forall k -> k -> Type) k (a :: k). f k a -> T2 f k a + +deriving instance forall (f :: forall k -> k -> Type). + HTestEquality2 f => HTestEquality2 (T2 f) ===================================== testsuite/tests/deriving/should_compile/all.T ===================================== @@ -115,3 +115,4 @@ test('T15290d', normal, compile, ['']) test('T15398', normal, compile, ['']) test('T15637', normal, compile, ['']) test('T16179', normal, compile, ['']) +test('T16518', normal, compile, ['']) ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -294,3 +294,4 @@ test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) test('T14828', normal, ghci_script, ['T14828.script']) test('T16376', normal, ghci_script, ['T16376.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/75abaaead796415cf2b5da610f4b3ee75b9d7759...51fd357119b357c52e990ccce9059c423cc49406 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 20:35:23 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 16:35:23 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] gitlab-ci: Backport from master Message-ID: <5ca66a8b4a717_62b33fa2c4ea8fac159558b@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: 1e1331a7 by Ben Gamari at 2019-04-04T20:34:19Z gitlab-ci: Backport from master - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -1,17 +1,31 @@ variables: GIT_SSL_NO_VERIFY: "1" + # Commit of ghc/ci-images repository from which to pull Docker images + DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" stages: - - lint - - build - - full-build - - cleanup # See Note [Cleanup on Windows] + - lint # Source linting + - build # A quick smoke-test to weed out broken commits + - full-build # Build all the things + - cleanup # See Note [Cleanup on Windows] + - packaging # Source distribution, etc. + - hackage # head.hackage testing + - deploy # push documentation + +.only-default: &only-default + only: + - master + - /ghc-[0-9]+\.[0-9]+/ + - merge_requests + - tags ############################################################ # Runner Tags @@ -30,44 +44,106 @@ stages: ############################################################ ghc-linters: + allow_failure: true stage: lint - image: ghcci/linters:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" script: - - | - if [ -n "$CI_MERGE_REQUEST_ID" ]; then - base="$(git merge-base $CI_MERGE_REQUEST_BRANCH_NAME HEAD)" - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA - fi + - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME + - base="$(git merge-base FETCH_HEAD $CI_COMMIT_SHA)" + - "echo Merge base $base" + # - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) + - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) + - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + dependencies: [] + tags: + - lint + only: + refs: + - merge_requests + +# We allow the submodule checker to fail when run on merge requests (to +# accomodate, e.g., haddock changes not yet upstream) but not on `master` or +# Marge jobs. +.lint-submods: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" + script: + - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) + dependencies: [] + tags: + - lint + +lint-submods: + extends: .lint-submods + only: + refs: + - master + - /ghc-[0-9]+\.[0-9]+/ + - wip/marge_bot_batch_merge_job + +lint-submods-mr: + extends: .lint-submods + allow_failure: true + only: + refs: + - merge_requests + +.lint-changelogs: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" dependencies: [] tags: - lint + script: + - | + grep TBA libraries/*/changelog.md && ( + echo "Error: Found \"TBA\"s in changelogs." + exit 1 + ) + +lint-changelogs: + extends: .lint-changelogs + allow_failure: true + only: + refs: + - /ghc-[0-9]+\.[0-9]+/ + +lint-release-changelogs: + extends: .lint-changelogs + only: + - tags + ############################################################ # Validation via Pipelines (hadrian) ############################################################ .validate-hadrian: + <<: *only-default allow_failure: true script: + - cabal update - git clean -xdf && git submodule foreach git clean -xdf - bash .circleci/prepare-system.sh - if [[ -d ./cabal-cache ]]; then cp -R ./.cabal-cache ~/.cabal-cache; fi - ./boot - ./configure $CONFIGURE_ARGS - - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` + - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --docs=no-sphinx binary-dist + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz cache: key: hadrian paths: - cabal-cache + artifacts: + when: always + paths: + - ghc.tar.xz validate-x86_64-linux-deb8-hadrian: extends: .validate-hadrian stage: build - image: ghcci/x86_64-linux-deb8:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" before_script: # workaround for docker permissions - sudo chown ghc:ghc -R . @@ -75,14 +151,17 @@ validate-x86_64-linux-deb8-hadrian: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" tags: - x86_64-linux + ############################################################ # Validation via Pipelines (make) ############################################################ .validate: + <<: *only-default variables: TEST_TYPE: test before_script: @@ -92,22 +171,25 @@ validate-x86_64-linux-deb8-hadrian: - ./configure $CONFIGURE_ARGS - | THREADS=`mk/detect-cpu-count.sh` - make V=0 -j$THREADS + make V=0 -j$THREADS WERROR=-Werror - | make binary-dist TAR_COMP_OPTS="-1" - mv ghc-*.tar.xz ghc.tar.xz - | THREADS=`mk/detect-cpu-count.sh` - make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml + make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml METRICS_FILE=$METRICS_FILE dependencies: [] artifacts: reports: junit: junit.xml expire_in: 2 week paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +################################# +# x86_64-darwin +################################# + validate-x86_64-darwin: extends: .validate stage: full-build @@ -115,17 +197,20 @@ validate-x86_64-darwin: - x86_64-darwin variables: GHC_VERSION: 8.6.3 + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-apple-darwin.tar.xz" MACOSX_DEPLOYMENT_TARGET: "10.7" # Only Sierra and onwards supports clock_gettime. See #12858 ac_cv_func_clock_gettime: "no" LANG: "en_US.UTF-8" CONFIGURE_ARGS: --with-intree-gmp + TEST_ENV: "x86_64-darwin" before_script: - git clean -xdf && git submodule foreach git clean -xdf - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/darwin-init.sh - PATH="`pwd`/toolchain/bin:$PATH" @@ -150,6 +235,12 @@ validate-x86_64-darwin: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" + # Build hyperlinked sources for documentation when building releases + - | + if [[ -n "$CI_COMMIT_TAG" ]]; then + echo "EXTRA_HADDOCK_OPTS += --hyperlinked-source --quickjump" >> mk/build.mk + fi - bash .circleci/prepare-system.sh # workaround for docker permissions @@ -162,22 +253,31 @@ validate-x86_64-darwin: - cabal-cache - toolchain -validate-aarch64-linux-deb9: +################################# +# aarch64-linux-deb9 +################################# + +.build-aarch64-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/aarch64-linux-deb9:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/aarch64-linux-deb9:$DOCKER_REV" allow_failure: true - artifacts: - when: always - expire_in: 2 week + variables: + TEST_ENV: "aarch64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-aarch64-linux-deb9.tar.xz" cache: key: linux-aarch64-deb9 tags: - aarch64-linux +validate-aarch64-linux-deb9: + extends: .build-aarch64-linux-deb9 + artifacts: + when: always + expire_in: 2 week + nightly-aarch64-linux-deb9: - extends: validate-aarch64-linux-deb9 - stage: full-build + extends: .build-aarch64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -186,22 +286,28 @@ nightly-aarch64-linux-deb9: variables: - $NIGHTLY -validate-i386-linux-deb9: +################################# +# i386-linux-deb9 +################################# + +.build-i386-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + image: "registry.gitlab.haskell.org/ghc/ci-images/i386-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "i386-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-i386-deb9-linux.tar.xz" + cache: + key: linux-i386-deb9 + +validate-i386-linux-deb9: + extends: .build-i386-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-i386-deb9 nightly-i386-linux-deb9: - extends: .validate-linux - stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + extends: .build-i386-linux-deb9 variables: TEST_TYPE: slowtest artifacts: @@ -210,22 +316,29 @@ nightly-i386-linux-deb9: only: variables: - $NIGHTLY + +################################# +# x86_64-linux-deb9 +################################# + +.build-x86_64-linux-deb9: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux.tar.xz" cache: - key: linux-i386-deb9 + key: linux-x86_64-deb9 validate-x86_64-linux-deb9: - extends: .validate-linux - stage: build - image: ghcci/x86_64-linux-deb9:0.2 + extends: .build-x86_64-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-x86_64-deb9 nightly-x86_64-linux-deb9: - extends: validate-x86_64-linux-deb9 - stage: build + extends: .build-x86_64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -234,70 +347,93 @@ nightly-x86_64-linux-deb9: variables: - $NIGHTLY +# N.B. Has DEBUG assertions enabled in stage2 +validate-x86_64-linux-deb9-debug: + extends: .build-x86_64-linux-deb9 + stage: build + variables: + BUILD_FLAVOUR: validate + TEST_ENV: "x86_64-linux-deb9-debug" + validate-x86_64-linux-deb9-llvm: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build - allow_failure: true - image: ghcci/x86_64-linux-deb9:0.2 variables: BUILD_FLAVOUR: perf-llvm - cache: - key: linux-x86_64-deb9 - -validate-x86_64-linux-deb8: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-deb8:0.1 - cache: - key: linux-x86_64-deb8 - artifacts: - when: always - expire_in: 2 week - -validate-x86_64-linux-fedora27: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-fedora27:0.1 - cache: - key: linux-x86_64-fedora27 - artifacts: - when: always - expire_in: 2 week + TEST_ENV: "x86_64-linux-deb9-llvm" validate-x86_64-linux-deb9-integer-simple: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build variables: INTEGER_LIBRARY: integer-simple - image: ghcci/x86_64-linux-deb9:0.2 - cache: - key: linux-x86_64-deb9 + TEST_ENV: "x86_64-linux-deb9-integer-simple" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux-integer-simple.tar.xz" nightly-x86_64-linux-deb9-integer-simple: - extends: validate-x86_64-linux-deb9-integer-simple + extends: .build-x86_64-linux-deb9 stage: full-build - artifacts: - expire_in: 2 year variables: + INTEGER_LIBRARY: integer-simple + TEST_ENV: "x86_64-linux-deb9-integer-simple" TEST_TYPE: slowtest + artifacts: + expire_in: 2 year only: variables: - $NIGHTLY validate-x86_64-linux-deb9-unreg: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build variables: CONFIGURE_ARGS: --enable-unregisterised - image: ghcci/x86_64-linux-deb9:0.2 + TEST_ENV: "x86_64-linux-deb9-unreg" + + +################################# +# x86_64-linux-deb8 +################################# + +release-x86_64-linux-deb8: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb8-linux.tar.xz" + only: + - tags cache: - key: linux-x86_64-deb9 + key: linux-x86_64-deb8 + artifacts: + when: always + expire_in: 2 week + + +################################# +# x86_64-linux-fedora27 +################################# + +validate-x86_64-linux-fedora27: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-fedora27:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-fedora27" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-fedora27-linux.tar.xz" + cache: + key: linux-x86_64-fedora27 + artifacts: + when: always + expire_in: 2 week ############################################################ # Validation via Pipelines (Windows) ############################################################ -.validate-windows: +.build-windows: + <<: *only-default before_script: - git clean -xdf - git submodule foreach git clean -xdf @@ -314,72 +450,131 @@ validate-x86_64-linux-deb9-unreg: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/win32-init.sh after_script: - rd /s /q tmp - robocopy /np /nfl /ndl /e "%APPDATA%\cabal" cabal-cache - bash -c 'make clean || true' dependencies: [] + variables: + FORCE_SYMLINKS: 1 + LANG: "en_US.UTF-8" cache: paths: - cabal-cache - ghc-8.6.2 - ghc-tarballs -validate-x86_64-windows-hadrian: - extends: .validate-windows +.build-windows-hadrian: + extends: .build-windows stage: full-build variables: GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" script: - | - set MSYSTEM=MINGW64 python boot bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - mkdir -p _build - cp -R inplace/mingw _build/mingw - - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick" - # FIXME: Bindist disabled due to #16073 - #- bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh binary-dist" + - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick --docs=no-sphinx binary-dist" + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz # FIXME: Testsuite disabled due to #16156. - #- bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows + # - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' tags: - x86_64-windows + artifacts: + when: always + paths: + - ghc.tar.xz -validate-x86_64-windows: - extends: .validate-windows +validate-x86_64-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW64 + cache: + key: x86_64-windows-hadrian + +nightly-i386-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW32 + only: + variables: + - $NIGHTLY + cache: + key: i386-windows-hadrian + +.build-windows-make: + extends: .build-windows stage: full-build # due to #16084 allow_failure: true variables: + BUILD_FLAVOUR: "quick" GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-mingw32.tar.xz" script: - | - set MSYSTEM=MINGW64 python boot - bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - - bash -c "echo include mk/flavours/quick.mk > mk/build.mk" + bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex $CONFIGURE_ARGS' + - bash -c "echo include mk/flavours/${BUILD_FLAVOUR}.mk > mk/build.mk" - bash -c "PATH=`pwd`/toolchain/bin:$PATH make -j`mk/detect-cpu-count.sh`" - - | - bash -c "make binary-dist TAR_COMP_OPTS=-1" - mv ghc-*.tar.xz ghc.tar.xz + - bash -c "PATH=`pwd`/toolchain/bin:$PATH make binary-dist TAR_COMP_OPTS=-1" - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows tags: - x86_64-windows artifacts: when: always + expire_in: 2 week reports: junit: junit.xml paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +validate-x86_64-windows: + extends: .build-windows-make + variables: + MSYSTEM: MINGW64 + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + cache: + key: x86_64-windows + +# Normal Windows validate builds are profiled; that won't do for releases. +release-x86_64-windows: + extends: validate-x86_64-windows + variables: + MSYSTEM: MINGW64 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + only: + - tags + +release-i386-windows: + extends: .build-windows-make + only: + - tags + variables: + MSYSTEM: MINGW32 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +nightly-i386-windows: + extends: .build-windows-make + only: + variables: + - $NIGHTLY + variables: + MSYSTEM: MINGW32 + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +############################################################ +# Cleanup +############################################################ + # Note [Cleaning up after shell executor] # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # @@ -393,6 +588,7 @@ validate-x86_64-windows: # See Note [Cleanup after shell executor] cleanup-windows: + <<: *only-default stage: cleanup tags: - x86_64-windows @@ -415,10 +611,12 @@ cleanup-windows: # See Note [Cleanup after shell executor] cleanup-darwin: + <<: *only-default stage: cleanup tags: - x86_64-darwin when: always + dependencies: [] before_script: - echo "Time to clean up" script: @@ -430,3 +628,106 @@ cleanup-darwin: - rm -Rf $BUILD_DIR/* - exit 0 +############################################################ +# Packaging +############################################################ + +doc-tarball: + <<: *only-default + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + dependencies: + - validate-x86_64-linux-deb9 + - validate-x86_64-windows + artifacts: + paths: + - haddock.html.tar.xz + - libraries.html.tar.xz + - users_guide.html.tar.xz + - index.html + - "*.pdf" + script: + - rm -Rf docs + - bash -ex distrib/mkDocs/mkDocs ghc-x86_64-deb9-linux.tar.xz ghc-x86_64-mingw32.tar.xz + - ls -lh + - mv docs/*.tar.xz docs/index.html . + +source-tarball: + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + only: + - tags + artifacts: + paths: + - ghc-*.tar.xz + - version + script: + - mk/get-win32-tarballs.sh download all + - ./boot + - ./configure + - make sdist + - mv sdistprep/*.xz . + - make show VALUE=version > version + + +############################################################ +# Testing via head.hackage +############################################################ + +# Triggering jobs in the ghc/head.hackage project requires that we have a job +# token for that repository. Furthermore the head.hackage CI job must have +# access to an unprivileged access token with the ability to query the ghc/ghc +# project such that it can find the job ID of the fedora27 job for the current +# pipeline. + +.hackage: + <<: *only-default + stage: hackage + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + dependencies: [] + variables: + HEAD_HACKAGE_PROJECT_ID: "78" + script: + - bash .gitlab/start-head.hackage.sh + +hackage: + extends: .hackage + when: manual + +hackage-label: + extends: .hackage + only: + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*user-facing.*/ + +nightly-hackage: + extends: .hackage + only: + variables: + - $NIGHTLY + +pages: + stage: deploy + dependencies: + - doc-tarball + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + script: + - mkdir -p public/doc + - tar -xf haddock.html.tar.xz -C public/doc + - tar -xf libraries.html.tar.xz -C public/doc + - tar -xf users_guide.html.tar.xz -C public/doc + - cp -f index.html public/doc + only: + - master + artifacts: + paths: + - public + View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1e1331a7a27515cb43e4afcfd26a3050533cca43 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1e1331a7a27515cb43e4afcfd26a3050533cca43 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 20:39:37 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 16:39:37 -0400 Subject: [Git][ghc/ghc][ghc-8.8] 3 commits: Correct duplicate 4.12.0.0 entry in base's changelog Message-ID: <5ca66b8910e18_62b33fa2ef2ce6e81599977@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.8 at Glasgow Haskell Compiler / GHC Commits: db5a43a9 by Ryan Scott at 2019-04-02T18:22:28Z Correct duplicate 4.12.0.0 entry in base's changelog See #16490. [ci skip] - - - - - c0661417 by Ben Gamari at 2019-04-02T18:22:48Z Bump transformers to 0.5.6.2 See #16199. - - - - - d90dcd4a by Ryan Scott at 2019-04-02T18:24:17Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - 3 changed files: - libraries/array - libraries/base/changelog.md - libraries/transformers Changes: ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 82a29b198a67f54de2f72d96ca077800afbfbe4f +Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 ===================================== libraries/base/changelog.md ===================================== @@ -1,6 +1,6 @@ # Changelog for [`base` package](http://hackage.haskell.org/package/base) -## 4.12.0.0 *TBA* +## 4.13.0.0 *TBA* * Bundled with GHC *TBA* * Fix `Show` instance of `Data.Fixed`: Negative numbers are now parenthesized ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit 49655191d33912815a9389b764e2d89e92140938 +Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9ba0a4bd6eacaa8aeb5513658a10a17d52c32e8f...d90dcd4ac41c2c1310e1496104c8aeaf76c51b1c -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9ba0a4bd6eacaa8aeb5513658a10a17d52c32e8f...d90dcd4ac41c2c1310e1496104c8aeaf76c51b1c You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 4 23:20:39 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Thu, 04 Apr 2019 19:20:39 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca69147c1793_62b33fa2be3bc73416212fc@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: e7dbc8a5 by Joachim Breitner at 2019-04-04T23:19:35Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,30 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + -- This code must not be called. You either need to + -- add your architecture as a distinct case or + -- use non-TABLES_NEXT_TO_CODE mode + code' <- if tables_next_to_code + then mkJumpToAddr <$> mPlatform <*> pure entry_addr of + Just code' -> pure $ Just code' + Nothing -> fail "Unknown obscure arch is not supported" + case pure Nothing + let entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode + code = if tables_next_to_code then Just code' else Nothing } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +82,34 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mPlatform :: Maybe Arch +mPlatform = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +269,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +312,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +352,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e7dbc8a59a61a2753fcc8f78b539e847924a38c1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e7dbc8a59a61a2753fcc8f78b539e847924a38c1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 00:58:00 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Thu, 04 Apr 2019 20:58:00 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca6a818a7011_62b33fa2ef4f1b281630789@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 9866be10 by Joachim Breitner at 2019-04-05T00:57:36Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,30 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + -- This code must not be called. You either need to + -- add your architecture as a distinct case or + -- use non-TABLES_NEXT_TO_CODE mode + code' <- if tables_next_to_code + then Just <$> case mkJumpToAddr <$> mPlatform <*> pure entry_addr of + Just code' -> pure code' + Nothing -> fail "Unknown obscure arch is not supported" + case pure Nothing + let entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode + code = if tables_next_to_code then Just code' else Nothing } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +82,34 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mPlatform :: Maybe Arch +mPlatform = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +269,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +312,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +352,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9866be10719870b22aa69618bddceac3bae2f66f -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9866be10719870b22aa69618bddceac3bae2f66f You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 02:20:13 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Thu, 04 Apr 2019 22:20:13 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca6bb5de7743_62b33fa2ee7bcb241645566@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: e1ad6994 by Joachim Breitner at 2019-04-05T02:19:59Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,30 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + -- This code must not be called. You either need to + -- add your architecture as a distinct case or + -- use non-TABLES_NEXT_TO_CODE mode + code' <- if tables_next_to_code + then Just <$> case mkJumpToAddr <$> mPlatform <*> pure entry_addr of + Just code' -> pure code' + Nothing -> fail "Unknown obscure arch is not supported" + else pure Nothing + let entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode + code = if tables_next_to_code then Just code' else Nothing } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +82,34 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mPlatform :: Maybe Arch +mPlatform = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +269,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +312,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +352,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e1ad69945967b4895c96b2f5a789c7263eb1ccf8 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e1ad69945967b4895c96b2f5a789c7263eb1ccf8 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 03:05:57 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 04 Apr 2019 23:05:57 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 10 commits: testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways Message-ID: <5ca6c6155598c_62b33fa2c8ae78601664623@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 450c6023 by Ben Gamari at 2019-04-05T01:57:05Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - 38d76e73 by Ömer Sinan Ağacan at 2019-04-05T02:01:04Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 2b864ade by Ben Gamari at 2019-04-05T02:02:12Z testsuite: Make closureSize less sensitive to optimisation - - - - - dbb861e7 by Ben Gamari at 2019-04-05T02:19:43Z process: Skip process005 in ghci way - - - - - 87ae3083 by Ben Gamari at 2019-04-05T02:25:23Z testsuite: Mark T13167 as broken in threaded2 As noted in #16536. - - - - - e5046f42 by Ben Gamari at 2019-04-05T02:38:06Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - c843bbf6 by Ben Gamari at 2019-04-05T02:45:43Z testsuite: Mark T14272 as broken in optasm - - - - - fee4a7c0 by Ben Gamari at 2019-04-05T02:50:00Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - 5726aa13 by Ben Gamari at 2019-04-05T02:55:00Z testsuite: Mark T16180 as broken in ghci way As noted in #16541. - - - - - b58908e0 by Ben Gamari at 2019-04-05T03:01:46Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - 10 changed files: - libraries/base/tests/all.T - libraries/ghc-heap/tests/closure_size.hs - libraries/process - testsuite/driver/testlib.py - testsuite/tests/dependent/should_compile/all.T - testsuite/tests/primops/should_run/all.T - testsuite/tests/simplCore/should_compile/all.T - testsuite/tests/th/all.T - testsuite/tests/typecheck/should_compile/all.T - testsuite/tests/typecheck/should_run/all.T Changes: ===================================== libraries/base/tests/all.T ===================================== @@ -202,7 +202,7 @@ test('T8089', compile_and_run, ['']) test('hWaitForInput-accurate-socket', reqlib('unix'), compile_and_run, ['']) test('T8684', expect_broken(8684), compile_and_run, ['']) -test('hWaitForInput-accurate-stdin', normal, compile_and_run, ['']) +test('hWaitForInput-accurate-stdin', [expect_broken_for(16535, ['threaded1', 'threaded2']), omit_ways(['ghci']), compile_and_run, ['']) test('hWaitForInput-accurate-pipe', reqlib('unix'), compile_and_run, ['']) test('T9826',normal, compile_and_run,['']) test('T9848', @@ -233,5 +233,5 @@ test('T3474', test('T14425', normal, compile_and_run, ['']) test('T10412', normal, compile_and_run, ['']) test('T13896', normal, compile_and_run, ['']) -test('T13167', normal, compile_and_run, ['']) +test('T13167', expect_broken_for(16536, ['threaded2']), compile_and_run, ['']) test('T15349', [exit_code(1), expect_broken_for(15349, 'ghci')], compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/closure_size.hs ===================================== @@ -1,6 +1,7 @@ {-# LANGUAGE BangPatterns #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE MagicHash #-} import Control.Monad import Type.Reflection @@ -17,12 +18,17 @@ assertSize !x expected = do putStrLn $ prettyCallStack callStack {-# NOINLINE assertSize #-} -pap :: Int -> Char -> Int +pap :: Int -> Maybe Char -> Int pap x _ = x {-# NOINLINE pap #-} main :: IO () main = do + -- Ensure that GHC can't turn PAP into a FUN (see #16531) + let x :: Int + x = 42 + {-# NOINLINE x #-} + assertSize 'a' 2 assertSize (Just ()) 2 assertSize (Nothing :: Maybe ()) 2 @@ -30,5 +36,5 @@ main = do assertSize ((1,2,3) :: (Int,Int,Int)) 4 assertSize (id :: Int -> Int) 1 assertSize (fst :: (Int,Int) -> Int) 1 - assertSize (pap 1) 2 + assertSize (pap x) 2 ===================================== libraries/process ===================================== @@ -1 +1 @@ -Subproject commit d860209e53c1b40b7c251fc8378886bbcb394402 +Subproject commit 1a6197ff2112ed9849589b348981754ee1d3ca23 ===================================== testsuite/driver/testlib.py ===================================== @@ -1408,7 +1408,6 @@ def simple_run(name, way, prog, extra_run_opts): return failBecause('bad stderr') if not (opts.ignore_stdout or stdout_ok(name, way)): return failBecause('bad stdout') - check_hp = '-h' in my_rts_flags and opts.check_hp check_prof = '-p' in my_rts_flags ===================================== testsuite/tests/dependent/should_compile/all.T ===================================== @@ -40,7 +40,7 @@ test('T12742', normal, compile, ['']) # (1) Use -fexternal-interpreter, or # (2) Build the program twice: once with -dynamic, and then # with -prof using -osuf to set a different object file suffix. -test('T13910', omit_ways(['profasm']), compile, ['']) +test('T13910', [expect_broken_for(16537, ['optasm']), omit_ways(['profasm'])], compile, ['']) test('T13938', [extra_files(['T13938a.hs'])], makefile_test, ['T13938']) test('T14556', normal, compile, ['']) test('T14720', normal, compile, ['']) ===================================== testsuite/tests/primops/should_run/all.T ===================================== @@ -20,7 +20,9 @@ test('CmpInt8', normal, compile_and_run, ['']) test('CmpWord8', normal, compile_and_run, ['']) test('ShowPrim', normal, compile_and_run, ['']) -test('ArithInt16', normal, compile_and_run, ['']) -test('ArithWord16', normal, compile_and_run, ['']) +# These two tests use unboxed tuples, which GHCi doesn't support +test('ArithInt16', omit_ways(['ghci']), compile_and_run, ['']) +test('ArithWord16', omit_ways(['ghci']), compile_and_run, ['']) + test('CmpInt16', normal, compile_and_run, ['']) -test('CmpWord16', normal, compile_and_run, ['']) \ No newline at end of file +test('CmpWord16', normal, compile_and_run, ['']) ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -275,7 +275,7 @@ test('T14140', normal, makefile_test, ['T14140']) -test('T14272', normal, compile, ['']) +test('T14272', expect_broken_for(16539, ['optasm']), compile, ['']) test('T14270a', normal, compile, ['']) test('T14152', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile, ['-ddump-simpl']) test('T14152a', [extra_files(['T14152.hs']), pre_cmd('cp T14152.hs T14152a.hs'), ===================================== testsuite/tests/th/all.T ===================================== @@ -467,7 +467,7 @@ test('T15437', expect_broken(15437), multimod_compile, test('T15985', normal, compile, ['']) test('T16133', normal, compile_fail, ['']) test('T15471', normal, multimod_compile, ['T15471.hs', '-v0']) -test('T16180', normal, compile_and_run, ['-package ghc']) +test('T16180', expect_broken_for(16541, ['ghci']), compile_and_run, ['-package ghc']) test('T16183', normal, compile, ['-v0 -ddump-splices -dsuppress-uniques']) test('T16195', normal, multimod_compile, ['T16195.hs', '-v0']) test('T16293b', normal, compile, ['']) ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -659,7 +659,7 @@ test('T15586', normal, compile, ['']) test('T15368', normal, compile, ['-fdefer-type-errors']) test('T15645', normal, compile, ['-Wwarn=missing-monadfail-instances']) test('T15778', normal, compile, ['']) -test('T14761c', normal, compile, ['']) +test('T14761c', expect_broken_for(16540, ['hpc', 'optasm']), compile, ['']) test('T16008', normal, compile, ['']) test('T16033', normal, compile, ['']) test('T16141', normal, compile, ['-O']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -35,7 +35,7 @@ test('tcrun018', normal, compile_and_run, ['']) test('tcrun019', normal, compile_and_run, ['']) test('tcrun020', normal, compile_and_run, ['']) test('tcrun021', normal, compile_and_run, ['-package containers']) -test('tcrun022', omit_ways(['ghci']), compile_and_run, ['-O']) +test('tcrun022', omit_ways(['hpc' 'ghci']), compile_and_run, ['-O']) test('tcrun023', [when(doing_ghci(), extra_hc_opts('-fobject-code'))], compile_and_run, ['-O']) test('tcrun024', normal, compile_and_run, ['-O']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/54687d5a058347a56ed4a0b23297080f891a24d5...b58908e0ab93b49303fb7a06db38b397f4a3d684 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/54687d5a058347a56ed4a0b23297080f891a24d5...b58908e0ab93b49303fb7a06db38b397f4a3d684 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 03:09:10 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Thu, 04 Apr 2019 23:09:10 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca6c6d6eba4b_62b33fa2ef3c167c16657e2@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: a478a206 by Joachim Breitner at 2019-04-05T03:07:40Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,31 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> case mkJumpToAddr <$> mPlatform <*> pure entry_addr of + Just code' -> pure code' + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mPlatform', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "Unknown obscure arch is not supported" + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode + code = if tables_next_to_code then Just code' else Nothing } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +83,35 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mPlatform :: Maybe Arch +mPlatform = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +271,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +314,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +354,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a478a206e0d35673c85d4e583eb8cadfd4a17350 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a478a206e0d35673c85d4e583eb8cadfd4a17350 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 08:47:19 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Fri, 05 Apr 2019 04:47:19 -0400 Subject: [Git][ghc/ghc][wip/T15753] Make `singleConstructor` cope with pattern synonyms Message-ID: <5ca71617d5829_62b33fa2ee0d5dc4173195e@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/T15753 at Glasgow Haskell Compiler / GHC Commits: e50c78f3 by Sebastian Graf at 2019-04-05T08:47:10Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. - - - - - 5 changed files: - compiler/deSugar/Check.hs - + testsuite/tests/pmcheck/should_compile/T15753a.hs - + testsuite/tests/pmcheck/should_compile/T15753b.hs - + testsuite/tests/pmcheck/should_compile/T15884.hs - testsuite/tests/pmcheck/should_compile/all.T Changes: ===================================== compiler/deSugar/Check.hs ===================================== @@ -4,9 +4,13 @@ Author: George Karachalias Pattern Matching Coverage Checking. -} -{-# LANGUAGE CPP, GADTs, DataKinds, KindSignatures #-} -{-# LANGUAGE TupleSections #-} -{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE GADTs #-} +{-# LANGUAGE DataKinds #-} +{-# LANGUAGE KindSignatures #-} +{-# LANGUAGE TupleSections #-} +{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE MultiWayIf #-} module Check ( -- Checking and printing @@ -55,7 +59,7 @@ import qualified GHC.LanguageExtensions as LangExt import Data.List (find) import Data.Maybe (catMaybes, isJust, fromMaybe) -import Control.Monad (forM, when, forM_, zipWithM) +import Control.Monad (forM, when, forM_, zipWithM, filterM) import Coercion import TcEvidence import TcSimplify (tcNormalise) @@ -289,6 +293,14 @@ data PmResult = , pmresultUncovered :: UncoveredCandidates , pmresultInaccessible :: [Located [LPat GhcTc]] } +instance Outputable PmResult where + ppr pmr = hang (text "PmResult") 2 $ vcat + [ text "pmresultProvenance" <+> ppr (pmresultProvenance pmr) + , text "pmresultRedundant" <+> ppr (pmresultRedundant pmr) + , text "pmresultUncovered" <+> ppr (pmresultUncovered pmr) + , text "pmresultInaccessible" <+> ppr (pmresultInaccessible pmr) + ] + -- | Either a list of patterns that are not covered, or their type, in case we -- have no patterns at hand. Not having patterns at hand can arise when -- handling EmptyCase expressions, in two cases: @@ -303,6 +315,10 @@ data PmResult = data UncoveredCandidates = UncoveredPatterns Uncovered | TypeOfUncovered Type +instance Outputable UncoveredCandidates where + ppr (UncoveredPatterns uc) = text "UnPat" <+> ppr uc + ppr (TypeOfUncovered ty) = text "UnTy" <+> ppr ty + -- | The empty pattern check result emptyPmResult :: PmResult emptyPmResult = PmResult FromBuiltin [] (UncoveredPatterns []) [] @@ -987,7 +1003,7 @@ translatePat fam_insts pat = case pat of | otherwise -> do ps <- translatePat fam_insts p (xp,xe) <- mkPmId2Forms ty - let g = mkGuard ps (mkHsWrap wrapper (unLoc xe)) + g <- mkGuard ps (mkHsWrap wrapper (unLoc xe)) return [xp,g] -- (n + k) ===> x (True <- x >= k) (n <- x-k) @@ -997,10 +1013,11 @@ translatePat fam_insts pat = case pat of ViewPat arg_ty lexpr lpat -> do ps <- translatePat fam_insts (unLoc lpat) -- See Note [Guards and Approximation] - case all cantFailPattern ps of + res <- allM cantFailPattern ps + case res of True -> do (xp,xe) <- mkPmId2Forms arg_ty - let g = mkGuard ps (HsApp noExt lexpr xe) + g <- mkGuard ps (HsApp noExt lexpr xe) return [xp,g] False -> mkCanFailPmPat arg_ty @@ -1234,7 +1251,7 @@ translateConPatVec fam_insts univ_tys ex_tvs c (RecCon (HsRecFields fs _)) -- Translate a single match translateMatch :: FamInstEnvs -> LMatch GhcTc (LHsExpr GhcTc) -> DsM (PatVec,[PatVec]) -translateMatch fam_insts (dL->L _ (Match { m_pats = lpats, m_grhss = grhss })) = +translateMatch fam_insts (dL->L _ (m at Match { m_pats = lpats, m_grhss = grhss })) = do pats' <- concat <$> translatePatVec fam_insts pats guards' <- mapM (translateGuards fam_insts) guards @@ -1255,41 +1272,38 @@ translateMatch _ _ = panic "translateMatch" translateGuards :: FamInstEnvs -> [GuardStmt GhcTc] -> DsM PatVec translateGuards fam_insts guards = do all_guards <- concat <$> mapM (translateGuard fam_insts) guards - return (replace_unhandled all_guards) - -- It should have been (return all_guards) but it is too expressive. + let + shouldKeep :: Pattern -> DsM Bool + shouldKeep p + | PmVar {} <- p = pure True + | PmCon {} <- p = (&&) + <$> singleMatchConstructor (pm_con_con p) (pm_con_arg_tys p) + <*> allM shouldKeep (pm_con_args p) + shouldKeep (PmGrd _pv _e) + | isNotPmExprOther e = pure True -- expensive but we want it + | otherwise = allM shouldKeep pv + shouldKeep _other_pat = pure False -- let the rest.. + + all_handled <- allM shouldKeep all_guards + -- It should have been @pure all_guards@ but it is too expressive. -- Since the term oracle does not handle all constraints we generate, -- we (hackily) replace all constraints the oracle cannot handle with a - -- single one (we need to know if there is a possibility of falure). + -- single one (we need to know if there is a possibility of failure). -- See Note [Guards and Approximation] for all guard-related approximations -- we implement. - where - replace_unhandled :: PatVec -> PatVec - replace_unhandled gv - | any_unhandled gv = fake_pat : [ p | p <- gv, shouldKeep p ] - | otherwise = gv - - any_unhandled :: PatVec -> Bool - any_unhandled gv = any (not . shouldKeep) gv - - shouldKeep :: Pattern -> Bool - shouldKeep p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all shouldKeep (pm_con_args p) - shouldKeep (PmGrd pv e) - | all shouldKeep pv = True - | isNotPmExprOther e = True -- expensive but we want it - shouldKeep _other_pat = False -- let the rest.. + if all_handled + then pure all_guards + else do + kept <- filterM shouldKeep all_guards + pure (fake_pat : kept) -- | Check whether a pattern can fail to match -cantFailPattern :: Pattern -> Bool -cantFailPattern p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all cantFailPattern (pm_con_args p) -cantFailPattern (PmGrd pv _e) - = all cantFailPattern pv -cantFailPattern _ = False +cantFailPattern :: Pattern -> DsM Bool +cantFailPattern PmVar {} = pure True +cantFailPattern PmCon { pm_con_con = c, pm_con_arg_tys = tys, pm_con_args = ps} + = (&&) <$> singleMatchConstructor c tys <*> allM cantFailPattern ps +cantFailPattern (PmGrd pv _e) = allM cantFailPattern pv +cantFailPattern _ = pure False -- | Translate a guard statement to Pattern translateGuard :: FamInstEnvs -> GuardStmt GhcTc -> DsM PatVec @@ -1312,7 +1326,8 @@ translateLet _binds = return [] translateBind :: FamInstEnvs -> LPat GhcTc -> LHsExpr GhcTc -> DsM PatVec translateBind fam_insts (dL->L _ p) e = do ps <- translatePat fam_insts p - return [mkGuard ps (unLoc e)] + g <- mkGuard ps (unLoc e) + return [g] -- | Translate a boolean guard translateBoolGuard :: LHsExpr GhcTc -> DsM PatVec @@ -1321,7 +1336,7 @@ translateBoolGuard e -- The formal thing to do would be to generate (True <- True) -- but it is trivial to solve so instead we give back an empty -- PatVec for efficiency - | otherwise = return [mkGuard [truePattern] (unLoc e)] + | otherwise = (:[]) <$> mkGuard [truePattern] (unLoc e) {- Note [Guards and Approximation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1658,13 +1673,14 @@ mkOneConFull x con = do -- * More smart constructors and fresh variable generation -- | Create a guard pattern -mkGuard :: PatVec -> HsExpr GhcTc -> Pattern -mkGuard pv e - | all cantFailPattern pv = PmGrd pv expr - | PmExprOther {} <- expr = fake_pat - | otherwise = PmGrd pv expr - where - expr = hsExprToPmExpr e +mkGuard :: PatVec -> HsExpr GhcTc -> DsM Pattern +mkGuard pv e = do + res <- allM cantFailPattern pv + let expr = hsExprToPmExpr e + tracePmD "mkGuard" (vcat [ppr pv, ppr e, ppr res, ppr expr]) + if | res -> pure (PmGrd pv expr) + | PmExprOther {} <- expr -> pure fake_pat + | otherwise -> pure (PmGrd pv expr) -- | Create a term equality of the form: `(False ~ (x ~ lit))` mkNegEq :: Id -> PmLit -> ComplexEq @@ -1738,14 +1754,37 @@ coercePmPat (PmCon { pm_con_con = con, pm_con_arg_tys = arg_tys , pm_con_args = coercePatVec args }] coercePmPat (PmGrd {}) = [] -- drop the guards --- | Check whether a data constructor is the only way to construct --- a data type. -singleConstructor :: ConLike -> Bool -singleConstructor (RealDataCon dc) = - case tyConDataCons (dataConTyCon dc) of - [_] -> True - _ -> False -singleConstructor _ = False +-- | Check whether a 'ConLike' has the /single match/ property, i.e. whether +-- it is the only possible match in the given context. See also +-- 'allCompleteMatches' and Note [Single match constructors]. +singleMatchConstructor :: ConLike -> [Type] -> DsM Bool +singleMatchConstructor cl tys = + any (isSingleton . snd) <$> allCompleteMatches cl tys + +{- +Note [Single match constructors] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When translating pattern guards for consumption by the checker, we desugar +every pattern guard that might fail ('cantFailPattern') to 'fake_pat' +(True <- _). Which patterns can't fail? Exactly those that only match on +'singleMatchConstructor's. + +Here are a few examples: + * @f a | (a, b) <- foo a = 42@: Product constructors are generally + single match. This extends to single constructors of GADTs like 'Refl'. + * If @f | Id <- id () = 42@, where @pattern Id = ()@ and 'Id' is part of a + singleton `COMPLETE` set, then 'Id' has the single match property. + +In effect, we can just enumerate 'allCompleteMatches' and check if the conlike +occurs as a singleton set. +There's the chance that 'Id' is part of multiple `COMPLETE` sets. That's +irrelevant; If the user specified a singleton set, it is single-match. + +Note that this doesn't really take into account incoming type constraints; +It might be obvious from type context that a particular GADT constructor has +the single-match property. We currently don't (can't) check this in the +translation step. See #15753 for why this yields surprising results. +-} -- | For a given conlike, finds all the sets of patterns which could -- be relevant to that conlike by consulting the result type. ===================================== testsuite/tests/pmcheck/should_compile/T15753a.hs ===================================== @@ -0,0 +1,28 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE TypeOperators #-} +{-# OPTIONS_GHC -Wincomplete-patterns #-} +module Bug where + +import Data.Type.Equality + +data G a where + GInt :: G Int + GBool :: G Bool + +ex1, ex2, ex3 + :: a :~: Int + -> G a + -> () + +ex1 Refl g + | GInt <- id g + = () + +ex2 Refl g + | GInt <- g + = () + +ex3 Refl g + = case id g of + GInt -> () + ===================================== testsuite/tests/pmcheck/should_compile/T15753b.hs ===================================== @@ -0,0 +1,10 @@ +{-# LANGUAGE PatternSynonyms #-} +module Bug where + +{-# COMPLETE Id #-} +pattern Id :: () +pattern Id = () + +bug :: () +bug | Id <- id () = () + ===================================== testsuite/tests/pmcheck/should_compile/T15884.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE ViewPatterns #-} + +module Bug where + +f :: Maybe a -> Bool +f (id->Nothing) = False +f (id->(Just _)) = True ===================================== testsuite/tests/pmcheck/should_compile/all.T ===================================== @@ -68,6 +68,12 @@ test('T15584', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T15713', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753a', expect_broken(15753), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753b', normal, compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15884', expect_broken(15884), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T16289', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e50c78f3ee38265a7d8fc66b2f1a1aacd7c87819 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e50c78f3ee38265a7d8fc66b2f1a1aacd7c87819 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 08:49:03 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Fri, 05 Apr 2019 04:49:03 -0400 Subject: [Git][ghc/ghc][wip/T15753] Make `singleConstructor` cope with pattern synonyms Message-ID: <5ca7167f628f6_62b33fa2c8ae786017326b4@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/T15753 at Glasgow Haskell Compiler / GHC Commits: 9b41f502 by Sebastian Graf at 2019-04-05T08:48:53Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. - - - - - 5 changed files: - compiler/deSugar/Check.hs - + testsuite/tests/pmcheck/should_compile/T15753a.hs - + testsuite/tests/pmcheck/should_compile/T15753b.hs - + testsuite/tests/pmcheck/should_compile/T15884.hs - testsuite/tests/pmcheck/should_compile/all.T Changes: ===================================== compiler/deSugar/Check.hs ===================================== @@ -4,9 +4,13 @@ Author: George Karachalias Pattern Matching Coverage Checking. -} -{-# LANGUAGE CPP, GADTs, DataKinds, KindSignatures #-} -{-# LANGUAGE TupleSections #-} -{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE GADTs #-} +{-# LANGUAGE DataKinds #-} +{-# LANGUAGE KindSignatures #-} +{-# LANGUAGE TupleSections #-} +{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE MultiWayIf #-} module Check ( -- Checking and printing @@ -55,7 +59,7 @@ import qualified GHC.LanguageExtensions as LangExt import Data.List (find) import Data.Maybe (catMaybes, isJust, fromMaybe) -import Control.Monad (forM, when, forM_, zipWithM) +import Control.Monad (forM, when, forM_, zipWithM, filterM) import Coercion import TcEvidence import TcSimplify (tcNormalise) @@ -289,6 +293,14 @@ data PmResult = , pmresultUncovered :: UncoveredCandidates , pmresultInaccessible :: [Located [LPat GhcTc]] } +instance Outputable PmResult where + ppr pmr = hang (text "PmResult") 2 $ vcat + [ text "pmresultProvenance" <+> ppr (pmresultProvenance pmr) + , text "pmresultRedundant" <+> ppr (pmresultRedundant pmr) + , text "pmresultUncovered" <+> ppr (pmresultUncovered pmr) + , text "pmresultInaccessible" <+> ppr (pmresultInaccessible pmr) + ] + -- | Either a list of patterns that are not covered, or their type, in case we -- have no patterns at hand. Not having patterns at hand can arise when -- handling EmptyCase expressions, in two cases: @@ -303,6 +315,10 @@ data PmResult = data UncoveredCandidates = UncoveredPatterns Uncovered | TypeOfUncovered Type +instance Outputable UncoveredCandidates where + ppr (UncoveredPatterns uc) = text "UnPat" <+> ppr uc + ppr (TypeOfUncovered ty) = text "UnTy" <+> ppr ty + -- | The empty pattern check result emptyPmResult :: PmResult emptyPmResult = PmResult FromBuiltin [] (UncoveredPatterns []) [] @@ -987,7 +1003,7 @@ translatePat fam_insts pat = case pat of | otherwise -> do ps <- translatePat fam_insts p (xp,xe) <- mkPmId2Forms ty - let g = mkGuard ps (mkHsWrap wrapper (unLoc xe)) + g <- mkGuard ps (mkHsWrap wrapper (unLoc xe)) return [xp,g] -- (n + k) ===> x (True <- x >= k) (n <- x-k) @@ -997,10 +1013,11 @@ translatePat fam_insts pat = case pat of ViewPat arg_ty lexpr lpat -> do ps <- translatePat fam_insts (unLoc lpat) -- See Note [Guards and Approximation] - case all cantFailPattern ps of + res <- allM cantFailPattern ps + case res of True -> do (xp,xe) <- mkPmId2Forms arg_ty - let g = mkGuard ps (HsApp noExt lexpr xe) + g <- mkGuard ps (HsApp noExt lexpr xe) return [xp,g] False -> mkCanFailPmPat arg_ty @@ -1255,41 +1272,38 @@ translateMatch _ _ = panic "translateMatch" translateGuards :: FamInstEnvs -> [GuardStmt GhcTc] -> DsM PatVec translateGuards fam_insts guards = do all_guards <- concat <$> mapM (translateGuard fam_insts) guards - return (replace_unhandled all_guards) - -- It should have been (return all_guards) but it is too expressive. + let + shouldKeep :: Pattern -> DsM Bool + shouldKeep p + | PmVar {} <- p = pure True + | PmCon {} <- p = (&&) + <$> singleMatchConstructor (pm_con_con p) (pm_con_arg_tys p) + <*> allM shouldKeep (pm_con_args p) + shouldKeep (PmGrd pv e) + | isNotPmExprOther e = pure True -- expensive but we want it + | otherwise = allM shouldKeep pv + shouldKeep _other_pat = pure False -- let the rest.. + + all_handled <- allM shouldKeep all_guards + -- It should have been @pure all_guards@ but it is too expressive. -- Since the term oracle does not handle all constraints we generate, -- we (hackily) replace all constraints the oracle cannot handle with a - -- single one (we need to know if there is a possibility of falure). + -- single one (we need to know if there is a possibility of failure). -- See Note [Guards and Approximation] for all guard-related approximations -- we implement. - where - replace_unhandled :: PatVec -> PatVec - replace_unhandled gv - | any_unhandled gv = fake_pat : [ p | p <- gv, shouldKeep p ] - | otherwise = gv - - any_unhandled :: PatVec -> Bool - any_unhandled gv = any (not . shouldKeep) gv - - shouldKeep :: Pattern -> Bool - shouldKeep p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all shouldKeep (pm_con_args p) - shouldKeep (PmGrd pv e) - | all shouldKeep pv = True - | isNotPmExprOther e = True -- expensive but we want it - shouldKeep _other_pat = False -- let the rest.. + if all_handled + then pure all_guards + else do + kept <- filterM shouldKeep all_guards + pure (fake_pat : kept) -- | Check whether a pattern can fail to match -cantFailPattern :: Pattern -> Bool -cantFailPattern p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all cantFailPattern (pm_con_args p) -cantFailPattern (PmGrd pv _e) - = all cantFailPattern pv -cantFailPattern _ = False +cantFailPattern :: Pattern -> DsM Bool +cantFailPattern PmVar {} = pure True +cantFailPattern PmCon { pm_con_con = c, pm_con_arg_tys = tys, pm_con_args = ps} + = (&&) <$> singleMatchConstructor c tys <*> allM cantFailPattern ps +cantFailPattern (PmGrd pv _e) = allM cantFailPattern pv +cantFailPattern _ = pure False -- | Translate a guard statement to Pattern translateGuard :: FamInstEnvs -> GuardStmt GhcTc -> DsM PatVec @@ -1312,7 +1326,8 @@ translateLet _binds = return [] translateBind :: FamInstEnvs -> LPat GhcTc -> LHsExpr GhcTc -> DsM PatVec translateBind fam_insts (dL->L _ p) e = do ps <- translatePat fam_insts p - return [mkGuard ps (unLoc e)] + g <- mkGuard ps (unLoc e) + return [g] -- | Translate a boolean guard translateBoolGuard :: LHsExpr GhcTc -> DsM PatVec @@ -1321,7 +1336,7 @@ translateBoolGuard e -- The formal thing to do would be to generate (True <- True) -- but it is trivial to solve so instead we give back an empty -- PatVec for efficiency - | otherwise = return [mkGuard [truePattern] (unLoc e)] + | otherwise = (:[]) <$> mkGuard [truePattern] (unLoc e) {- Note [Guards and Approximation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1658,13 +1673,14 @@ mkOneConFull x con = do -- * More smart constructors and fresh variable generation -- | Create a guard pattern -mkGuard :: PatVec -> HsExpr GhcTc -> Pattern -mkGuard pv e - | all cantFailPattern pv = PmGrd pv expr - | PmExprOther {} <- expr = fake_pat - | otherwise = PmGrd pv expr - where - expr = hsExprToPmExpr e +mkGuard :: PatVec -> HsExpr GhcTc -> DsM Pattern +mkGuard pv e = do + res <- allM cantFailPattern pv + let expr = hsExprToPmExpr e + tracePmD "mkGuard" (vcat [ppr pv, ppr e, ppr res, ppr expr]) + if | res -> pure (PmGrd pv expr) + | PmExprOther {} <- expr -> pure fake_pat + | otherwise -> pure (PmGrd pv expr) -- | Create a term equality of the form: `(False ~ (x ~ lit))` mkNegEq :: Id -> PmLit -> ComplexEq @@ -1738,14 +1754,37 @@ coercePmPat (PmCon { pm_con_con = con, pm_con_arg_tys = arg_tys , pm_con_args = coercePatVec args }] coercePmPat (PmGrd {}) = [] -- drop the guards --- | Check whether a data constructor is the only way to construct --- a data type. -singleConstructor :: ConLike -> Bool -singleConstructor (RealDataCon dc) = - case tyConDataCons (dataConTyCon dc) of - [_] -> True - _ -> False -singleConstructor _ = False +-- | Check whether a 'ConLike' has the /single match/ property, i.e. whether +-- it is the only possible match in the given context. See also +-- 'allCompleteMatches' and Note [Single match constructors]. +singleMatchConstructor :: ConLike -> [Type] -> DsM Bool +singleMatchConstructor cl tys = + any (isSingleton . snd) <$> allCompleteMatches cl tys + +{- +Note [Single match constructors] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When translating pattern guards for consumption by the checker, we desugar +every pattern guard that might fail ('cantFailPattern') to 'fake_pat' +(True <- _). Which patterns can't fail? Exactly those that only match on +'singleMatchConstructor's. + +Here are a few examples: + * @f a | (a, b) <- foo a = 42@: Product constructors are generally + single match. This extends to single constructors of GADTs like 'Refl'. + * If @f | Id <- id () = 42@, where @pattern Id = ()@ and 'Id' is part of a + singleton `COMPLETE` set, then 'Id' has the single match property. + +In effect, we can just enumerate 'allCompleteMatches' and check if the conlike +occurs as a singleton set. +There's the chance that 'Id' is part of multiple `COMPLETE` sets. That's +irrelevant; If the user specified a singleton set, it is single-match. + +Note that this doesn't really take into account incoming type constraints; +It might be obvious from type context that a particular GADT constructor has +the single-match property. We currently don't (can't) check this in the +translation step. See #15753 for why this yields surprising results. +-} -- | For a given conlike, finds all the sets of patterns which could -- be relevant to that conlike by consulting the result type. ===================================== testsuite/tests/pmcheck/should_compile/T15753a.hs ===================================== @@ -0,0 +1,28 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE TypeOperators #-} +{-# OPTIONS_GHC -Wincomplete-patterns #-} +module Bug where + +import Data.Type.Equality + +data G a where + GInt :: G Int + GBool :: G Bool + +ex1, ex2, ex3 + :: a :~: Int + -> G a + -> () + +ex1 Refl g + | GInt <- id g + = () + +ex2 Refl g + | GInt <- g + = () + +ex3 Refl g + = case id g of + GInt -> () + ===================================== testsuite/tests/pmcheck/should_compile/T15753b.hs ===================================== @@ -0,0 +1,10 @@ +{-# LANGUAGE PatternSynonyms #-} +module Bug where + +{-# COMPLETE Id #-} +pattern Id :: () +pattern Id = () + +bug :: () +bug | Id <- id () = () + ===================================== testsuite/tests/pmcheck/should_compile/T15884.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE ViewPatterns #-} + +module Bug where + +f :: Maybe a -> Bool +f (id->Nothing) = False +f (id->(Just _)) = True ===================================== testsuite/tests/pmcheck/should_compile/all.T ===================================== @@ -68,6 +68,12 @@ test('T15584', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T15713', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753a', expect_broken(15753), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753b', normal, compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15884', expect_broken(15884), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T16289', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9b41f5023a003aadb78ee6b9cf5413e1af3261b1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9b41f5023a003aadb78ee6b9cf5413e1af3261b1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 10:32:56 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Fri, 05 Apr 2019 06:32:56 -0400 Subject: [Git][ghc/ghc][wip/T15753] Make `singleConstructor` cope with pattern synonyms Message-ID: <5ca72ed882225_62b3d67433417404ea@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/T15753 at Glasgow Haskell Compiler / GHC Commits: 9108ead1 by Sebastian Graf at 2019-04-05T10:26:01Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 5 changed files: - compiler/deSugar/Check.hs - + testsuite/tests/pmcheck/should_compile/T15753a.hs - + testsuite/tests/pmcheck/should_compile/T15753b.hs - + testsuite/tests/pmcheck/should_compile/T15884.hs - testsuite/tests/pmcheck/should_compile/all.T Changes: ===================================== compiler/deSugar/Check.hs ===================================== @@ -4,9 +4,13 @@ Author: George Karachalias Pattern Matching Coverage Checking. -} -{-# LANGUAGE CPP, GADTs, DataKinds, KindSignatures #-} -{-# LANGUAGE TupleSections #-} -{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE GADTs #-} +{-# LANGUAGE DataKinds #-} +{-# LANGUAGE KindSignatures #-} +{-# LANGUAGE TupleSections #-} +{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE MultiWayIf #-} module Check ( -- Checking and printing @@ -55,7 +59,7 @@ import qualified GHC.LanguageExtensions as LangExt import Data.List (find) import Data.Maybe (catMaybes, isJust, fromMaybe) -import Control.Monad (forM, when, forM_, zipWithM) +import Control.Monad (forM, when, forM_, zipWithM, filterM) import Coercion import TcEvidence import TcSimplify (tcNormalise) @@ -289,6 +293,14 @@ data PmResult = , pmresultUncovered :: UncoveredCandidates , pmresultInaccessible :: [Located [LPat GhcTc]] } +instance Outputable PmResult where + ppr pmr = hang (text "PmResult") 2 $ vcat + [ text "pmresultProvenance" <+> ppr (pmresultProvenance pmr) + , text "pmresultRedundant" <+> ppr (pmresultRedundant pmr) + , text "pmresultUncovered" <+> ppr (pmresultUncovered pmr) + , text "pmresultInaccessible" <+> ppr (pmresultInaccessible pmr) + ] + -- | Either a list of patterns that are not covered, or their type, in case we -- have no patterns at hand. Not having patterns at hand can arise when -- handling EmptyCase expressions, in two cases: @@ -303,6 +315,10 @@ data PmResult = data UncoveredCandidates = UncoveredPatterns Uncovered | TypeOfUncovered Type +instance Outputable UncoveredCandidates where + ppr (UncoveredPatterns uc) = text "UnPat" <+> ppr uc + ppr (TypeOfUncovered ty) = text "UnTy" <+> ppr ty + -- | The empty pattern check result emptyPmResult :: PmResult emptyPmResult = PmResult FromBuiltin [] (UncoveredPatterns []) [] @@ -987,7 +1003,7 @@ translatePat fam_insts pat = case pat of | otherwise -> do ps <- translatePat fam_insts p (xp,xe) <- mkPmId2Forms ty - let g = mkGuard ps (mkHsWrap wrapper (unLoc xe)) + g <- mkGuard ps (mkHsWrap wrapper (unLoc xe)) return [xp,g] -- (n + k) ===> x (True <- x >= k) (n <- x-k) @@ -997,10 +1013,11 @@ translatePat fam_insts pat = case pat of ViewPat arg_ty lexpr lpat -> do ps <- translatePat fam_insts (unLoc lpat) -- See Note [Guards and Approximation] - case all cantFailPattern ps of + res <- allM cantFailPattern ps + case res of True -> do (xp,xe) <- mkPmId2Forms arg_ty - let g = mkGuard ps (HsApp noExt lexpr xe) + g <- mkGuard ps (HsApp noExt lexpr xe) return [xp,g] False -> mkCanFailPmPat arg_ty @@ -1255,41 +1272,38 @@ translateMatch _ _ = panic "translateMatch" translateGuards :: FamInstEnvs -> [GuardStmt GhcTc] -> DsM PatVec translateGuards fam_insts guards = do all_guards <- concat <$> mapM (translateGuard fam_insts) guards - return (replace_unhandled all_guards) - -- It should have been (return all_guards) but it is too expressive. + let + shouldKeep :: Pattern -> DsM Bool + shouldKeep p + | PmVar {} <- p = pure True + | PmCon {} <- p = (&&) + <$> singleMatchConstructor (pm_con_con p) (pm_con_arg_tys p) + <*> allM shouldKeep (pm_con_args p) + shouldKeep (PmGrd pv e) + | isNotPmExprOther e = pure True -- expensive but we want it + | otherwise = allM shouldKeep pv + shouldKeep _other_pat = pure False -- let the rest.. + + all_handled <- allM shouldKeep all_guards + -- It should have been @pure all_guards@ but it is too expressive. -- Since the term oracle does not handle all constraints we generate, -- we (hackily) replace all constraints the oracle cannot handle with a - -- single one (we need to know if there is a possibility of falure). + -- single one (we need to know if there is a possibility of failure). -- See Note [Guards and Approximation] for all guard-related approximations -- we implement. - where - replace_unhandled :: PatVec -> PatVec - replace_unhandled gv - | any_unhandled gv = fake_pat : [ p | p <- gv, shouldKeep p ] - | otherwise = gv - - any_unhandled :: PatVec -> Bool - any_unhandled gv = any (not . shouldKeep) gv - - shouldKeep :: Pattern -> Bool - shouldKeep p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all shouldKeep (pm_con_args p) - shouldKeep (PmGrd pv e) - | all shouldKeep pv = True - | isNotPmExprOther e = True -- expensive but we want it - shouldKeep _other_pat = False -- let the rest.. + if all_handled + then pure all_guards + else do + kept <- filterM shouldKeep all_guards + pure (fake_pat : kept) -- | Check whether a pattern can fail to match -cantFailPattern :: Pattern -> Bool -cantFailPattern p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all cantFailPattern (pm_con_args p) -cantFailPattern (PmGrd pv _e) - = all cantFailPattern pv -cantFailPattern _ = False +cantFailPattern :: Pattern -> DsM Bool +cantFailPattern PmVar {} = pure True +cantFailPattern PmCon { pm_con_con = c, pm_con_arg_tys = tys, pm_con_args = ps} + = (&&) <$> singleMatchConstructor c tys <*> allM cantFailPattern ps +cantFailPattern (PmGrd pv _e) = allM cantFailPattern pv +cantFailPattern _ = pure False -- | Translate a guard statement to Pattern translateGuard :: FamInstEnvs -> GuardStmt GhcTc -> DsM PatVec @@ -1312,7 +1326,8 @@ translateLet _binds = return [] translateBind :: FamInstEnvs -> LPat GhcTc -> LHsExpr GhcTc -> DsM PatVec translateBind fam_insts (dL->L _ p) e = do ps <- translatePat fam_insts p - return [mkGuard ps (unLoc e)] + g <- mkGuard ps (unLoc e) + return [g] -- | Translate a boolean guard translateBoolGuard :: LHsExpr GhcTc -> DsM PatVec @@ -1321,7 +1336,7 @@ translateBoolGuard e -- The formal thing to do would be to generate (True <- True) -- but it is trivial to solve so instead we give back an empty -- PatVec for efficiency - | otherwise = return [mkGuard [truePattern] (unLoc e)] + | otherwise = (:[]) <$> mkGuard [truePattern] (unLoc e) {- Note [Guards and Approximation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1658,13 +1673,14 @@ mkOneConFull x con = do -- * More smart constructors and fresh variable generation -- | Create a guard pattern -mkGuard :: PatVec -> HsExpr GhcTc -> Pattern -mkGuard pv e - | all cantFailPattern pv = PmGrd pv expr - | PmExprOther {} <- expr = fake_pat - | otherwise = PmGrd pv expr - where - expr = hsExprToPmExpr e +mkGuard :: PatVec -> HsExpr GhcTc -> DsM Pattern +mkGuard pv e = do + res <- allM cantFailPattern pv + let expr = hsExprToPmExpr e + tracePmD "mkGuard" (vcat [ppr pv, ppr e, ppr res, ppr expr]) + if | res -> pure (PmGrd pv expr) + | PmExprOther {} <- expr -> pure fake_pat + | otherwise -> pure (PmGrd pv expr) -- | Create a term equality of the form: `(False ~ (x ~ lit))` mkNegEq :: Id -> PmLit -> ComplexEq @@ -1738,14 +1754,37 @@ coercePmPat (PmCon { pm_con_con = con, pm_con_arg_tys = arg_tys , pm_con_args = coercePatVec args }] coercePmPat (PmGrd {}) = [] -- drop the guards --- | Check whether a data constructor is the only way to construct --- a data type. -singleConstructor :: ConLike -> Bool -singleConstructor (RealDataCon dc) = - case tyConDataCons (dataConTyCon dc) of - [_] -> True - _ -> False -singleConstructor _ = False +-- | Check whether a 'ConLike' has the /single match/ property, i.e. whether +-- it is the only possible match in the given context. See also +-- 'allCompleteMatches' and Note [Single match constructors]. +singleMatchConstructor :: ConLike -> [Type] -> DsM Bool +singleMatchConstructor cl tys = + any (isSingleton . snd) <$> allCompleteMatches cl tys + +{- +Note [Single match constructors] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When translating pattern guards for consumption by the checker, we desugar +every pattern guard that might fail ('cantFailPattern') to 'fake_pat' +(True <- _). Which patterns can't fail? Exactly those that only match on +'singleMatchConstructor's. + +Here are a few examples: + * @f a | (a, b) <- foo a = 42@: Product constructors are generally + single match. This extends to single constructors of GADTs like 'Refl'. + * If @f | Id <- id () = 42@, where @pattern Id = ()@ and 'Id' is part of a + singleton `COMPLETE` set, then 'Id' has the single match property. + +In effect, we can just enumerate 'allCompleteMatches' and check if the conlike +occurs as a singleton set. +There's the chance that 'Id' is part of multiple `COMPLETE` sets. That's +irrelevant; If the user specified a singleton set, it is single-match. + +Note that this doesn't really take into account incoming type constraints; +It might be obvious from type context that a particular GADT constructor has +the single-match property. We currently don't (can't) check this in the +translation step. See #15753 for why this yields surprising results. +-} -- | For a given conlike, finds all the sets of patterns which could -- be relevant to that conlike by consulting the result type. ===================================== testsuite/tests/pmcheck/should_compile/T15753a.hs ===================================== @@ -0,0 +1,28 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE TypeOperators #-} +{-# OPTIONS_GHC -Wincomplete-patterns #-} +module Bug where + +import Data.Type.Equality + +data G a where + GInt :: G Int + GBool :: G Bool + +ex1, ex2, ex3 + :: a :~: Int + -> G a + -> () + +ex1 Refl g + | GInt <- id g + = () + +ex2 Refl g + | GInt <- g + = () + +ex3 Refl g + = case id g of + GInt -> () + ===================================== testsuite/tests/pmcheck/should_compile/T15753b.hs ===================================== @@ -0,0 +1,10 @@ +{-# LANGUAGE PatternSynonyms #-} +module Bug where + +{-# COMPLETE Id #-} +pattern Id :: () +pattern Id = () + +bug :: () +bug | Id <- id () = () + ===================================== testsuite/tests/pmcheck/should_compile/T15884.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE ViewPatterns #-} + +module Bug where + +f :: Maybe a -> Bool +f (id->Nothing) = False +f (id->(Just _)) = True ===================================== testsuite/tests/pmcheck/should_compile/all.T ===================================== @@ -68,6 +68,12 @@ test('T15584', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T15713', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753a', expect_broken(15753), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753b', normal, compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15884', expect_broken(15884), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T16289', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9108ead1d23cecd6c404eabc3d61a47079a099cb -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9108ead1d23cecd6c404eabc3d61a47079a099cb You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 11:52:33 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 05 Apr 2019 07:52:33 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 12 commits: testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways Message-ID: <5ca74181acac7_62b33fa2cc3d37c817624ca@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 3125c86e by Ben Gamari at 2019-04-05T11:48:32Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - e4a9eed6 by Ömer Sinan Ağacan at 2019-04-05T11:48:32Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - d7d845ab by Ben Gamari at 2019-04-05T11:48:32Z testsuite: Make closureSize less sensitive to optimisation - - - - - 02f0b702 by Ben Gamari at 2019-04-05T11:48:32Z process: Skip process005 in ghci way - - - - - 02bcf96b by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Mark T13167 as broken in threaded2 As noted in #16536. - - - - - 09636efc by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - e15d3b82 by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Mark T14272 as broken in optasm - - - - - b27767a3 by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - f1b8f2f0 by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Mark T16180 as broken in ghci way As noted in #16541. - - - - - ab8fb4e5 by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - 14fb0b09 by Ben Gamari at 2019-04-05T11:48:33Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - 043c00bc by Ben Gamari at 2019-04-05T11:51:44Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - 12 changed files: - libraries/base/tests/all.T - libraries/ghc-heap/tests/all.T - libraries/ghc-heap/tests/closure_size.hs - libraries/process - testsuite/driver/testlib.py - testsuite/tests/dependent/should_compile/all.T - testsuite/tests/primops/should_run/all.T - testsuite/tests/simplCore/should_compile/all.T - testsuite/tests/th/all.T - testsuite/tests/typecheck/should_compile/all.T - testsuite/tests/typecheck/should_run/all.T - testsuite/tests/warnings/should_compile/all.T Changes: ===================================== libraries/base/tests/all.T ===================================== @@ -202,7 +202,7 @@ test('T8089', compile_and_run, ['']) test('hWaitForInput-accurate-socket', reqlib('unix'), compile_and_run, ['']) test('T8684', expect_broken(8684), compile_and_run, ['']) -test('hWaitForInput-accurate-stdin', normal, compile_and_run, ['']) +test('hWaitForInput-accurate-stdin', [expect_broken_for(16535, ['threaded1', 'threaded2']), omit_ways(['ghci'])], compile_and_run, ['']) test('hWaitForInput-accurate-pipe', reqlib('unix'), compile_and_run, ['']) test('T9826',normal, compile_and_run,['']) test('T9848', @@ -233,5 +233,5 @@ test('T3474', test('T14425', normal, compile_and_run, ['']) test('T10412', normal, compile_and_run, ['']) test('T13896', normal, compile_and_run, ['']) -test('T13167', normal, compile_and_run, ['']) +test('T13167', expect_broken_for(16536, ['threaded2']), compile_and_run, ['']) test('T15349', [exit_code(1), expect_broken_for(15349, 'ghci')], compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/all.T ===================================== @@ -7,5 +7,5 @@ test('heap_all', ], compile_and_run, ['']) test('closure_size', - omit_ways(['ghci', 'hpc', 'prof']), + [omit_ways(['ghci', 'hpc', 'prof']), expect_broken_for(16544, ['dyn', 'optasm', 'threaded2'])], compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/closure_size.hs ===================================== @@ -1,6 +1,7 @@ {-# LANGUAGE BangPatterns #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE MagicHash #-} import Control.Monad import Type.Reflection @@ -17,12 +18,17 @@ assertSize !x expected = do putStrLn $ prettyCallStack callStack {-# NOINLINE assertSize #-} -pap :: Int -> Char -> Int +pap :: Int -> Maybe Char -> Int pap x _ = x {-# NOINLINE pap #-} main :: IO () main = do + -- Ensure that GHC can't turn PAP into a FUN (see #16531) + let x :: Int + x = 42 + {-# NOINLINE x #-} + assertSize 'a' 2 assertSize (Just ()) 2 assertSize (Nothing :: Maybe ()) 2 @@ -30,5 +36,5 @@ main = do assertSize ((1,2,3) :: (Int,Int,Int)) 4 assertSize (id :: Int -> Int) 1 assertSize (fst :: (Int,Int) -> Int) 1 - assertSize (pap 1) 2 + assertSize (pap x) 2 ===================================== libraries/process ===================================== @@ -1 +1 @@ -Subproject commit d860209e53c1b40b7c251fc8378886bbcb394402 +Subproject commit 1a6197ff2112ed9849589b348981754ee1d3ca23 ===================================== testsuite/driver/testlib.py ===================================== @@ -1408,7 +1408,6 @@ def simple_run(name, way, prog, extra_run_opts): return failBecause('bad stderr') if not (opts.ignore_stdout or stdout_ok(name, way)): return failBecause('bad stdout') - check_hp = '-h' in my_rts_flags and opts.check_hp check_prof = '-p' in my_rts_flags ===================================== testsuite/tests/dependent/should_compile/all.T ===================================== @@ -40,7 +40,7 @@ test('T12742', normal, compile, ['']) # (1) Use -fexternal-interpreter, or # (2) Build the program twice: once with -dynamic, and then # with -prof using -osuf to set a different object file suffix. -test('T13910', omit_ways(['profasm']), compile, ['']) +test('T13910', [expect_broken_for(16537, ['optasm']), omit_ways(['profasm'])], compile, ['']) test('T13938', [extra_files(['T13938a.hs'])], makefile_test, ['T13938']) test('T14556', normal, compile, ['']) test('T14720', normal, compile, ['']) ===================================== testsuite/tests/primops/should_run/all.T ===================================== @@ -20,7 +20,9 @@ test('CmpInt8', normal, compile_and_run, ['']) test('CmpWord8', normal, compile_and_run, ['']) test('ShowPrim', normal, compile_and_run, ['']) -test('ArithInt16', normal, compile_and_run, ['']) -test('ArithWord16', normal, compile_and_run, ['']) +# These two tests use unboxed tuples, which GHCi doesn't support +test('ArithInt16', omit_ways(['ghci']), compile_and_run, ['']) +test('ArithWord16', omit_ways(['ghci']), compile_and_run, ['']) + test('CmpInt16', normal, compile_and_run, ['']) -test('CmpWord16', normal, compile_and_run, ['']) \ No newline at end of file +test('CmpWord16', normal, compile_and_run, ['']) ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -275,7 +275,7 @@ test('T14140', normal, makefile_test, ['T14140']) -test('T14272', normal, compile, ['']) +test('T14272', expect_broken_for(16539, ['optasm']), compile, ['']) test('T14270a', normal, compile, ['']) test('T14152', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile, ['-ddump-simpl']) test('T14152a', [extra_files(['T14152.hs']), pre_cmd('cp T14152.hs T14152a.hs'), ===================================== testsuite/tests/th/all.T ===================================== @@ -467,7 +467,7 @@ test('T15437', expect_broken(15437), multimod_compile, test('T15985', normal, compile, ['']) test('T16133', normal, compile_fail, ['']) test('T15471', normal, multimod_compile, ['T15471.hs', '-v0']) -test('T16180', normal, compile_and_run, ['-package ghc']) +test('T16180', expect_broken_for(16541, ['ghci']), compile_and_run, ['-package ghc']) test('T16183', normal, compile, ['-v0 -ddump-splices -dsuppress-uniques']) test('T16195', normal, multimod_compile, ['T16195.hs', '-v0']) test('T16293b', normal, compile, ['']) ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -659,7 +659,7 @@ test('T15586', normal, compile, ['']) test('T15368', normal, compile, ['-fdefer-type-errors']) test('T15645', normal, compile, ['-Wwarn=missing-monadfail-instances']) test('T15778', normal, compile, ['']) -test('T14761c', normal, compile, ['']) +test('T14761c', expect_broken_for(16540, ['hpc', 'optasm']), compile, ['']) test('T16008', normal, compile, ['']) test('T16033', normal, compile, ['']) test('T16141', normal, compile, ['-O']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -35,7 +35,7 @@ test('tcrun018', normal, compile_and_run, ['']) test('tcrun019', normal, compile_and_run, ['']) test('tcrun020', normal, compile_and_run, ['']) test('tcrun021', normal, compile_and_run, ['-package containers']) -test('tcrun022', omit_ways(['ghci']), compile_and_run, ['-O']) +test('tcrun022', omit_ways(['hpc', 'ghci']), compile_and_run, ['-O']) test('tcrun023', [when(doing_ghci(), extra_hc_opts('-fobject-code'))], compile_and_run, ['-O']) test('tcrun024', normal, compile_and_run, ['-O']) ===================================== testsuite/tests/warnings/should_compile/all.T ===================================== @@ -25,4 +25,4 @@ test('MissingMod', normal, multimod_compile, ['MissingMod', '-Wmissing-home-modu test('StarBinder', normal, compile, ['']) -test('Overflow', normal, compile, ['']) +test('Overflow', expect_broken_for(16543, ['hpc']), compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/b58908e0ab93b49303fb7a06db38b397f4a3d684...043c00bc5418d51c94b7a9e346bc9c11e0415825 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/b58908e0ab93b49303fb7a06db38b397f4a3d684...043c00bc5418d51c94b7a9e346bc9c11e0415825 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 12:23:11 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Fri, 05 Apr 2019 08:23:11 -0400 Subject: [Git][ghc/ghc][wip/dmd-arity] 34 commits: Add support for bitreverse primop Message-ID: <5ca748af54f2_62b33fa2cc3d37c81777612@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/dmd-arity at Glasgow Haskell Compiler / GHC Commits: 33173a51 by Alexandre at 2019-04-01T07:32:28Z Add support for bitreverse primop This commit includes the necessary changes in code and documentation to support a primop that reverses a word's bits. It also includes a test. - - - - - a3971b4e by Alexandre at 2019-04-01T07:32:28Z Bump ghc-prim's version where needed - - - - - 061276ea by Michael Sloan at 2019-04-01T07:32:30Z Remove unnecessary uses of UnboxedTuples pragma (see #13101 / #15454) Also removes a couple unnecessary MagicHash pragmas - - - - - e468c613 by David Eichmann at 2019-04-01T07:32:34Z Support Shake's --lint-fsatrace feature. Using this feature requires fsatrace (e.g. https://github.com/jacereda/fsatrace). Simply use the `--lint-fsatrace` option when running hadrian. Shake version >= 0.17.7 is required to support linting out of tree build dirs. - - - - - 1e9e4197 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add merge request template for backports for 8.8 - - - - - 55650d14 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add some simply issue templates - - - - - 27b99ed8 by Takenobu Tani at 2019-04-01T07:32:36Z Clean up URLs to point to GitLab This moves URL references to old Trac to their corresponding GitLab counterparts. This patch does not update the submodule library, such as libraries/Cabal. See also !539, !606, !618 [ci skip] - - - - - 18d1555d by Adam Sandberg Eriksson at 2019-04-01T07:32:38Z configure: document the use of the LD variable - - - - - 10352efa by Ben Gamari at 2019-04-01T22:22:34Z gitlab: Add feature request MR template - - - - - 1e52054b by Ben Gamari at 2019-04-01T23:16:21Z gitlab: Move feature request template to issue_templates Whoops. - - - - - e5c21ca9 by Ben Gamari at 2019-04-01T23:16:25Z gitlab: Mention ~"user facing" label - - - - - 39282422 by Ryan Scott at 2019-04-02T00:01:38Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - 7cf5ba3d by Michal Terepeta at 2019-04-02T00:07:49Z Improve performance of newSmallArray# This: - Hoists part of the condition outside of the initialization loop in `stg_newSmallArrayzh`. - Annotates one of the unlikely branches as unlikely, also in `stg_newSmallArrayzh`. - Adds a couple of annotations to `allocateMightFail` indicating which branches are likely to be taken. Together this gives about 5% improvement. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - dd9c82ef by David Eichmann at 2019-04-02T00:13:55Z Hadrian: correct deps for ghc builder. Previously, when needing ghc as a builder, the ghcDeps (Files the GHC binary depends on) for the current stage were needed. This is incorrect as the previous stage's ghc is used for building. This commit fixes the issue, needing the previous stage's ghcDeps. - - - - - 345306d3 by Alexandre Baldé at 2019-04-02T16:34:30Z Fix formatting issue in ghc-prim's changelog [skip ci] - - - - - f54b5124 by David Eichmann at 2019-04-02T16:40:39Z Hadrian: traceAllow deep dependencies when compilling haskell object files. - - - - - d132b30a by David Eichmann at 2019-04-02T16:40:39Z Hadrian: lint ignore autom4te and ghc-pkg cache files. - - - - - bf734195 by Simon Marlow at 2019-04-02T16:46:46Z Add myself to libraries/ghci - - - - - 5a75ccd0 by klebinger.andreas at gmx.at at 2019-04-03T04:34:57Z Fix faulty substitutions in StgCse (#11532). `substBndr` should rename bindings which shadow existing ids. However while it was renaming the bindings it was not adding proper substitutions for renamed bindings. Instead of adding a substitution of the form `old -> new` for renamed bindings it mistakenly added `old -> old` if no replacement had taken place while adding none if `old` had been renamed. As a byproduct this should improve performance, as we no longer add useless substitutions for unshadowed bindings. - - - - - 2ec749b5 by Nathan Collins at 2019-04-03T04:41:05Z users-guide: Fix typo - - - - - ea192a09 by Andrew Martin at 2019-04-03T04:41:05Z base: Add documentation that liftA2 used to not be a typeclass method - - - - - 733f1b52 by Frank Steffahn at 2019-04-03T04:41:05Z users-guide: Typo in Users Guide, Glasgow Exts - - - - - 3364def0 by Ben Gamari at 2019-04-03T04:41:05Z integer-gmp: Write friendlier documentation for Integer - - - - - dd3a3d08 by Ben Gamari at 2019-04-03T04:41:05Z integer-simple: Add documentation for Integer type - - - - - 722fdddf by Chris Martin at 2019-04-03T04:41:05Z Correct two misspellings of "separately" - - - - - bf6dbe3d by Chris Martin at 2019-04-03T04:41:05Z Inline the definition of 'ap' in the Monad laws The law as it is currently written is meaningless, because nowhere have we defined the implementation of 'ap'. The reader of the Control.Monad documentation is provided with only a type signature, > ap :: Monad m => m (a -> b) -> m a -> m b an informal description, > In many situations, the liftM operations can be replaced by uses of > ap, which promotes function application. and a relationship between 'ap' and the 'liftM' functions > return f `ap` x1 `ap` ... `ap` xn > is equivalent to > liftMn f x1 x2 ... xn Without knowing how 'ap' is defined, a law involving 'ap' cannot provide any guidance for how to write a lawful Monad instance, nor can we conclude anything from the law. I suspect that a reader equipped with the understanding that 'ap' was defined prior to the invention of the Applicative class could deduce that 'ap' must be defined in terms of (>>=), but nowhere as far as I can tell have we written this down explicitly for readers without the benefit of historical context. If the law is meant to express a relationship among (<*>), (>>=), and 'return', it seems that it is better off making this statement directly, sidestepping 'ap' altogether. - - - - - 7b090b53 by Ben Gamari at 2019-04-03T07:57:40Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. - - - - - 4626cf21 by Sebastian Graf at 2019-04-03T08:03:47Z Fix Uncovered set of literal patterns Issues #16289 and #15713 are proof that the pattern match checker did an unsound job of estimating the value set abstraction corresponding to the uncovered set. The reason is that the fix from #11303 introducing `NLit` was incomplete: The `LitCon` case desugared to `Var` rather than `LitVar`, which would have done the necessary case splitting analogous to the `ConVar` case. This patch rectifies that by introducing the fresh unification variable in `LitCon` in value abstraction position rather than pattern postition, recording a constraint equating it to the constructor expression rather than the literal. Fixes #16289 and #15713. - - - - - 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - e1cc1254 by Sebastian Graf at 2019-04-05T12:23:02Z Compute demand signatures assuming idArity This does four things: 1. Look at `idArity` instead of manifest lambdas to decide whether to use LetUp 2. Compute the strictness signature in LetDown assuming at least `idArity` incoming arguments 3. Remove the special case for trivial RHSs, which is subsumed by 2 4. Don't perform the W/W split when doing so would eta expand a binding. Otherwise we would eta expand PAPs, causing unnecessary churn in the Simplifier. NoFib Results -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- fannkuch-redux +0.3% 0.0% gg -0.0% -0.1% maillist +0.2% +0.2% minimax 0.0% +0.8% pretty 0.0% -0.1% scc 0.0% -0.1% -------------------------------------------------------------------------------- Min -0.0% -0.1% Max +0.3% +0.8% Geometric Mean +0.0% +0.0% - - - - - 30 changed files: - .gitlab-ci.yml - + .gitlab/issue_templates/bug.md - + .gitlab/issue_templates/feature_request.md - + .gitlab/merge_request_templates/backport-for-8.8.md - .gitlab/merge_request_templates/merge-request.md - ANNOUNCE - CODEOWNERS - HACKING.md - README.md - aclocal.m4 - boot - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/IdInfo.hs - compiler/basicTypes/Var.hs - compiler/cmm/CmmMachOp.hs - compiler/cmm/PprC.hs - compiler/codeGen/StgCmmMonad.hs - compiler/codeGen/StgCmmPrim.hs - compiler/coreSyn/CoreArity.hs - compiler/coreSyn/CoreLint.hs - compiler/coreSyn/CoreUnfold.hs - compiler/deSugar/Check.hs - compiler/deSugar/TmOracle.hs - compiler/ghci/ByteCodeLink.hs - compiler/ghci/RtClosureInspect.hs - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/llvmGen/LlvmCodeGen/CodeGen.hs - compiler/main/InteractiveEval.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f07b61d047967129a3ae0c56f8894d41c5a9b036...e1cc1254b81a7adadd8db77c7be625497264ab2b -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f07b61d047967129a3ae0c56f8894d41c5a9b036...e1cc1254b81a7adadd8db77c7be625497264ab2b You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 13:50:31 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Fri, 05 Apr 2019 09:50:31 -0400 Subject: [Git][ghc/ghc][wip/dmd-arity] Compute demand signatures assuming idArity Message-ID: <5ca75d279e137_62b33fa2e6e662d41795158@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/dmd-arity at Glasgow Haskell Compiler / GHC Commits: 1a670dc0 by Sebastian Graf at 2019-04-05T13:50:20Z Compute demand signatures assuming idArity This does four things: 1. Look at `idArity` instead of manifest lambdas to decide whether to use LetUp 2. Compute the strictness signature in LetDown assuming at least `idArity` incoming arguments 3. Remove the special case for trivial RHSs, which is subsumed by 2 4. Don't perform the W/W split when doing so would eta expand a binding. Otherwise we would eta expand PAPs, causing unnecessary churn in the Simplifier. NoFib Results -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- fannkuch-redux +0.3% 0.0% gg -0.0% -0.1% maillist +0.2% +0.2% minimax 0.0% +0.8% pretty 0.0% -0.1% scc 0.0% -0.1% -------------------------------------------------------------------------------- Min -0.0% -0.1% Max +0.3% +0.8% Geometric Mean +0.0% +0.0% - - - - - 17 changed files: - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/IdInfo.hs - compiler/basicTypes/Var.hs - compiler/coreSyn/CoreArity.hs - compiler/coreSyn/CoreLint.hs - compiler/coreSyn/CoreUnfold.hs - compiler/simplCore/SimplMonad.hs - compiler/simplCore/SimplUtils.hs - compiler/stranal/DmdAnal.hs - compiler/stranal/WorkWrap.hs - compiler/stranal/WwLib.hs - + testsuite/tests/perf/compiler/WWRec.hs - testsuite/tests/perf/compiler/all.T - + testsuite/tests/stranal/sigs/NewtypeArity.hs - + testsuite/tests/stranal/sigs/NewtypeArity.stderr - testsuite/tests/stranal/sigs/all.T Changes: ===================================== compiler/basicTypes/Demand.hs ===================================== @@ -22,7 +22,7 @@ module Demand ( DmdType(..), dmdTypeDepth, lubDmdType, bothDmdType, nopDmdType, botDmdType, mkDmdType, - addDemand, removeDmdTyArgs, + addDemand, ensureArgs, BothDmdArg, mkBothDmdArg, toBothDmdArg, DmdEnv, emptyDmdEnv, @@ -34,7 +34,7 @@ module Demand ( vanillaCprProdRes, cprSumRes, appIsBottom, isBottomingSig, pprIfaceStrictSig, trimCPRInfo, returnsCPR_maybe, - StrictSig(..), mkStrictSig, mkClosedStrictSig, + StrictSig(..), mkStrictSigForArity, mkClosedStrictSig, nopSig, botSig, cprProdSig, isTopSig, hasDemandEnvSig, splitStrictSig, strictSigDmdEnv, @@ -47,10 +47,10 @@ module Demand ( deferAfterIO, postProcessUnsat, postProcessDmdType, - splitProdDmd_maybe, peelCallDmd, peelManyCalls, mkCallDmd, + splitProdDmd_maybe, peelCallDmd, peelManyCalls, mkCallDmd, mkCallDmds, mkWorkerDemand, dmdTransformSig, dmdTransformDataConSig, dmdTransformDictSelSig, argOneShots, argsOneShots, saturatedByOneShots, - trimToType, TypeShape(..), + TypeShape(..), peelTsFuns, trimToType, useCount, isUsedOnce, reuseEnv, killUsageDemand, killUsageSig, zapUsageDemand, zapUsageEnvSig, @@ -675,10 +675,15 @@ mkProdDmd dx = JD { sd = mkSProd $ map getStrDmd dx , ud = mkUProd $ map getUseDmd dx } +-- | Wraps the 'CleanDemand' with a one-shot call demand: @d@ -> @C1(d)@. mkCallDmd :: CleanDemand -> CleanDemand mkCallDmd (JD {sd = d, ud = u}) = JD { sd = mkSCall d, ud = mkUCall One u } +-- | @mkCallDmds n d@ returns @C1(C1...(C1 d))@ where there are @n@ @C1@'s. +mkCallDmds :: Arity -> CleanDemand -> CleanDemand +mkCallDmds arity cd = iterate mkCallDmd cd !! arity + -- See Note [Demand on the worker] in WorkWrap mkWorkerDemand :: Int -> Demand mkWorkerDemand n = JD { sd = Lazy, ud = Use One (go n) } @@ -804,6 +809,13 @@ instance Outputable TypeShape where ppr (TsFun ts) = text "TsFun" <> parens (ppr ts) ppr (TsProd tss) = parens (hsep $ punctuate comma $ map ppr tss) +-- | @peelTsFuns n ts@ tries to peel off @n@ 'TsFun' constructors from @ts@ and +-- returns 'Just' the wrapped 'TypeShape' on success, and 'Nothing' otherwise. +peelTsFuns :: Arity -> TypeShape -> Maybe TypeShape +peelTsFuns 0 ts = Just ts +peelTsFuns n (TsFun ts) = peelTsFuns (n-1) ts +peelTsFuns _ _ = Nothing + trimToType :: Demand -> TypeShape -> Demand -- See Note [Trimming a demand to a type] trimToType (JD { sd = ms, ud = mu }) ts @@ -1207,12 +1219,8 @@ mkDmdType fv ds res = DmdType fv ds res dmdTypeDepth :: DmdType -> Arity dmdTypeDepth (DmdType _ ds _) = length ds --- Remove any demand on arguments. This is used in dmdAnalRhs on the body -removeDmdTyArgs :: DmdType -> DmdType -removeDmdTyArgs = ensureArgs 0 - --- This makes sure we can use the demand type with n arguments, --- It extends the argument list with the correct resTypeArgDmd +-- | This makes sure we can use the demand type with n arguments. +-- It extends the argument list with the correct resTypeArgDmd. -- It also adjusts the DmdResult: Divergence survives additional arguments, -- CPR information does not (and definite converge also would not). ensureArgs :: Arity -> DmdType -> DmdType @@ -1567,8 +1575,56 @@ and on the second, then returning a constructor. If this same function is applied to one arg, all we can say is that it uses x with , and its arg with demand . + +Note [Understanding DmdType and StrictSig] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand types are sound approximations of an expression's semantics relative to +the incoming demand we put the expression under. Consider the following +expression: + + \x y -> x `seq` (y, 2*x) + +Here is a table with demand types resulting from different incoming demands we +put that expression under. Note the monotonicity; a stronger incoming demand +yields a more precise demand type: + + incoming demand | demand type + ---------------------------------------------------- + | {} + | {} + | {} + +Note that in the first example, the depth of the demand type was *higher* than +the arity of the incoming call demand due to the anonymous lambda. +The converse is also possible and happens when we unleash demand signatures. +In @f x y@, the incoming call demand on f has arity 2. But if all we have is a +demand signature with depth 1 for @f@ (which we can safely unleash, see below), +the demand type of @f@ under a call demand of arity 2 has a *lower* depth of 1. + +So: Demand types are elicited by putting an expression under an incoming (call) +demand, the arity of which can be lower or higher than the depth of the +resulting demand type. +In contrast, a demand signature summarises a function's semantics *without* +immediately specifying the incoming demand it was produced under. Despite StrSig +being a newtype wrapper around DmdType, it actually encodes two things: + + * The threshold (i.e., minimum arity) to unleash the signature + * A demand type that is sound to unleash when the minimum arity requirement is + met. + +Here comes the subtle part: The threshold is encoded in the wrapped demand +type's depth! So in mkStrictSigForArity we make sure to trim the list of +argument demands to the given threshold arity. Call sites will make sure that +this corresponds to the arity of the call demand that elicited the wrapped +demand type. See also Note [What are demand signatures?] in DmdAnal. + +Besides trimming argument demands, mkStrictSigForArity will also trim CPR +information if necessary. -} +-- | The depth of the wrapped 'DmdType' encodes the arity at which it is safe +-- to unleash. Better construct this through 'mkStrictSigForArity'. +-- See Note [Understanding DmdType and StrictSig] newtype StrictSig = StrictSig DmdType deriving( Eq ) @@ -1580,34 +1636,43 @@ pprIfaceStrictSig :: StrictSig -> SDoc pprIfaceStrictSig (StrictSig (DmdType _ dmds res)) = hcat (map ppr dmds) <> ppr res -mkStrictSig :: DmdType -> StrictSig -mkStrictSig dmd_ty = StrictSig dmd_ty +-- | Turns a 'DmdType' computed for the particular 'Arity' into a 'StrictSig' +-- unleashable at that arity. See Note [Understanding DmdType and StrictSig] +mkStrictSigForArity :: Arity -> DmdType -> StrictSig +mkStrictSigForArity arity dmd_ty = StrictSig (ensureArgs arity dmd_ty) mkClosedStrictSig :: [Demand] -> DmdResult -> StrictSig -mkClosedStrictSig ds res = mkStrictSig (DmdType emptyDmdEnv ds res) +mkClosedStrictSig ds res = mkStrictSigForArity (length ds) (DmdType emptyDmdEnv ds res) splitStrictSig :: StrictSig -> ([Demand], DmdResult) splitStrictSig (StrictSig (DmdType _ dmds res)) = (dmds, res) increaseStrictSigArity :: Int -> StrictSig -> StrictSig --- Add extra arguments to a strictness signature +-- ^ Add extra arguments to a strictness signature. +-- In contrast to 'etaExpandStrictSig', this /prepends/ additional argument +-- demands and leaves CPR info intact. increaseStrictSigArity arity_increase sig@(StrictSig dmd_ty@(DmdType env dmds res)) | isTopDmdType dmd_ty = sig - | arity_increase <= 0 = sig + | arity_increase == 0 = sig + | arity_increase < 0 = WARN( True, text "increaseStrictSigArity:" + <+> text "negative arity increase" + <+> ppr arity_increase ) + nopSig | otherwise = StrictSig (DmdType env dmds' res) where dmds' = replicate arity_increase topDmd ++ dmds etaExpandStrictSig :: Arity -> StrictSig -> StrictSig --- We are expanding (\x y. e) to (\x y z. e z) --- Add exta demands to the /end/ of the arg demands if necessary -etaExpandStrictSig arity sig@(StrictSig dmd_ty@(DmdType env dmds res)) - | isTopDmdType dmd_ty = sig - | arity_increase <= 0 = sig - | otherwise = StrictSig (DmdType env dmds' res) - where - arity_increase = arity - length dmds - dmds' = dmds ++ replicate arity_increase topDmd +-- ^ We are expanding (\x y. e) to (\x y z. e z). +-- In contrast to 'increaseStrictSigArity', this /appends/ extra arg demands if +-- necessary, potentially destroying the signature's CPR property. +etaExpandStrictSig arity (StrictSig dmd_ty) + | arity < dmdTypeDepth dmd_ty + -- an arity decrease must zap the whole signature, because it was possibly + -- computed for a higher incoming call demand. + = nopSig + | otherwise + = StrictSig $ ensureArgs arity dmd_ty isTopSig :: StrictSig -> Bool isTopSig (StrictSig ty) = isTopDmdType ty ===================================== compiler/basicTypes/Id.hs ===================================== @@ -668,6 +668,7 @@ isBottomingId v | isId v = isBottomingSig (idStrictness v) | otherwise = False +-- | Accesses the 'Id''s 'strictnessInfo'. idStrictness :: Id -> StrictSig idStrictness id = strictnessInfo (idInfo id) ===================================== compiler/basicTypes/IdInfo.hs ===================================== @@ -237,22 +237,34 @@ pprIdDetails other = brackets (pp other) -- too big. data IdInfo = IdInfo { - arityInfo :: !ArityInfo, -- ^ 'Id' arity - ruleInfo :: RuleInfo, -- ^ Specialisations of the 'Id's function which exist - -- See Note [Specialisations and RULES in IdInfo] - unfoldingInfo :: Unfolding, -- ^ The 'Id's unfolding - cafInfo :: CafInfo, -- ^ 'Id' CAF info - oneShotInfo :: OneShotInfo, -- ^ Info about a lambda-bound variable, if the 'Id' is one - inlinePragInfo :: InlinePragma, -- ^ Any inline pragma atached to the 'Id' - occInfo :: OccInfo, -- ^ How the 'Id' occurs in the program - - strictnessInfo :: StrictSig, -- ^ A strictness signature - - demandInfo :: Demand, -- ^ ID demand information - callArityInfo :: !ArityInfo, -- ^ How this is called. - -- n <=> all calls have at least n arguments - - levityInfo :: LevityInfo -- ^ when applied, will this Id ever have a levity-polymorphic type? + arityInfo :: !ArityInfo, + -- ^ 'Id' arity, as computed by 'CoreArity'. Specifies how many + -- arguments this 'Id' has to be applied to before it doesn any + -- meaningful work. + ruleInfo :: RuleInfo, + -- ^ Specialisations of the 'Id's function which exist. + -- See Note [Specialisations and RULES in IdInfo] + unfoldingInfo :: Unfolding, + -- ^ The 'Id's unfolding + cafInfo :: CafInfo, + -- ^ 'Id' CAF info + oneShotInfo :: OneShotInfo, + -- ^ Info about a lambda-bound variable, if the 'Id' is one + inlinePragInfo :: InlinePragma, + -- ^ Any inline pragma atached to the 'Id' + occInfo :: OccInfo, + -- ^ How the 'Id' occurs in the program + strictnessInfo :: StrictSig, + -- ^ A strictness signature. Digests how a function uses its arguments + -- if applied to at least 'arityInfo' arguments. + demandInfo :: Demand, + -- ^ ID demand information + callArityInfo :: !ArityInfo, + -- ^ How this is called. This is the number of arguments to which a + -- binding can be eta-expanded without losing any sharing. + -- n <=> all calls have at least n arguments + levityInfo :: LevityInfo + -- ^ when applied, will this Id ever have a levity-polymorphic type? } -- Setters ===================================== compiler/basicTypes/Var.hs ===================================== @@ -700,6 +700,8 @@ setIdNotExported id = ASSERT( isLocalId id ) ************************************************************************ -} +-- | Is this a type-level (i.e., computationally irrelevant, thus erasable) +-- variable? Satisfies @isTyVar = not . isId at . isTyVar :: Var -> Bool -- True of both TyVar and TcTyVar isTyVar (TyVar {}) = True isTyVar (TcTyVar {}) = True @@ -712,17 +714,21 @@ isTcTyVar _ = False isTyCoVar :: Var -> Bool isTyCoVar v = isTyVar v || isCoVar v +-- | Is this a value-level (i.e., computationally relevant) 'Id'entifier? +-- Satisfies @isId = not . isTyVar at . isId :: Var -> Bool isId (Id {}) = True isId _ = False +-- | Is this a coercion variable? +-- Satisfies @'isId' v ==> 'isCoVar' v == not ('isNonCoVarId' v)@. isCoVar :: Var -> Bool --- A coercion variable isCoVar (Id { id_details = details }) = isCoVarDetails details isCoVar _ = False +-- | Is this a term variable ('Id') that is /not/ a coercion variable? +-- Satisfies @'isId' v ==> 'isCoVar' v == not ('isNonCoVarId' v)@. isNonCoVarId :: Var -> Bool --- A term variable (Id) that is /not/ a coercion variable isNonCoVarId (Id { id_details = details }) = not (isCoVarDetails details) isNonCoVarId _ = False ===================================== compiler/coreSyn/CoreArity.hs ===================================== @@ -158,7 +158,7 @@ exprBotStrictness_maybe e {- Note [exprArity invariant] ~~~~~~~~~~~~~~~~~~~~~~~~~~ -exprArity has the following invariant: +exprArity has the following invariants: (1) If typeArity (exprType e) = n, then manifestArity (etaExpand e n) = n ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -570,15 +570,9 @@ lintSingleBinding top_lvl_flag rec_flag (binder,rhs) (addWarnL (text "INLINE binder is (non-rule) loop breaker:" <+> ppr binder)) -- Only non-rule loop breakers inhibit inlining - -- Check whether arity and demand type are consistent (only if demand analysis - -- already happened) - -- - -- Note (Apr 2014): this is actually ok. See Note [Demand analysis for trivial right-hand sides] - -- in DmdAnal. After eta-expansion in CorePrep the rhs is no longer trivial. - -- ; let dmdTy = idStrictness binder - -- ; checkL (case dmdTy of - -- StrictSig dmd_ty -> idArity binder >= dmdTypeDepth dmd_ty || exprIsTrivial rhs) - -- (mkArityMsg binder) + -- We used to check that the dmdTypeDepth of a demand signature never + -- exceeds idArity, but that is an unnecessary complication, see + -- Note [idArity varies independently of dmdTypeDepth] in DmdAnal -- Check that the binder's arity is within the bounds imposed by -- the type and the strictness signature. See Note [exprArity invariant] @@ -2562,20 +2556,6 @@ mkKindErrMsg tyvar arg_ty hang (text "Arg type:") 4 (ppr arg_ty <+> dcolon <+> ppr (typeKind arg_ty))] -{- Not needed now -mkArityMsg :: Id -> MsgDoc -mkArityMsg binder - = vcat [hsep [text "Demand type has", - ppr (dmdTypeDepth dmd_ty), - text "arguments, rhs has", - ppr (idArity binder), - text "arguments,", - ppr binder], - hsep [text "Binder's strictness signature:", ppr dmd_ty] - - ] - where (StrictSig dmd_ty) = idStrictness binder --} mkCastErr :: CoreExpr -> Coercion -> Type -> Type -> MsgDoc mkCastErr expr = mk_cast_err "expression" "type" (ppr expr) ===================================== compiler/coreSyn/CoreUnfold.hs ===================================== @@ -1149,15 +1149,15 @@ certainlyWillInline dflags fn_info -- INLINABLE functions come via this path -- See Note [certainlyWillInline: INLINABLE] do_cunf expr (UnfIfGoodArgs { ug_size = size, ug_args = args }) - | not (null args) -- See Note [certainlyWillInline: be careful of thunks] + | arityInfo fn_info > 0 -- See Note [certainlyWillInline: be careful of thunks] , not (isBottomingSig (strictnessInfo fn_info)) -- Do not unconditionally inline a bottoming functions even if -- it seems smallish. We've carefully lifted it out to top level, -- so we don't want to re-inline it. - , let arity = length args - , size - (10 * (arity + 1)) <= ufUseThreshold dflags + , let unf_arity = length args + , size - (10 * (unf_arity + 1)) <= ufUseThreshold dflags = Just (fn_unf { uf_src = InlineStable - , uf_guidance = UnfWhen { ug_arity = arity + , uf_guidance = UnfWhen { ug_arity = unf_arity , ug_unsat_ok = unSaturatedOk , ug_boring_ok = inlineBoringOk expr } }) -- Note the "unsaturatedOk". A function like f = \ab. a @@ -1175,6 +1175,17 @@ found that the WorkWrap phase thought that y = case x of F# v -> F# (v +# v) was certainlyWillInline, so the addition got duplicated. +Note that we check arityInfo instead of the arity of the unfolding to detect +this case. This is so that we don't accidentally fail to inline small partial +applications, like `f = g 42` (where `g` recurses into `f`) where g has arity 2 +(say). Here there is no risk of work duplication, and the RHS is tiny, so +certainlyWillInline should return True. But `unf_arity` is zero! However f's +arity, gotten from `arityInfo fn_info`, is 1. + +Failing to say that `f` will inline forces W/W to generate a potentially huge +worker for f that will immediately cancel with `g`'s wrapper anyway, causing +unnecessary churn in the Simplifier while arriving at the same result. + Note [certainlyWillInline: INLINABLE] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ certainlyWillInline /must/ return Nothing for a large INLINABLE thing, ===================================== compiler/simplCore/SimplMonad.hs ===================================== @@ -21,7 +21,7 @@ module SimplMonad ( import GhcPrelude -import Var ( Var, isTyVar, mkLocalVar ) +import Var ( Var, isId, mkLocalVar ) import Name ( mkSystemVarName ) import Id ( Id, mkSysLocalOrCoVar ) import IdInfo ( IdDetails(..), vanillaIdInfo, setArityInfo ) @@ -187,7 +187,8 @@ newJoinId bndrs body_ty = do { uniq <- getUniqueM ; let name = mkSystemVarName uniq (fsLit "$j") join_id_ty = mkLamTypes bndrs body_ty -- Note [Funky mkLamTypes] - arity = length (filter (not . isTyVar) bndrs) + -- Note [idArity for join points] in SimplUtils + arity = length (filter isId bndrs) join_arity = length bndrs details = JoinId join_arity id_info = vanillaIdInfo `setArityInfo` arity ===================================== compiler/simplCore/SimplUtils.hs ===================================== @@ -1508,7 +1508,7 @@ tryEtaExpandRhs :: SimplMode -> OutId -> OutExpr -> SimplM (Arity, Bool, OutExpr) -- See Note [Eta-expanding at let bindings] -- If tryEtaExpandRhs rhs = (n, is_bot, rhs') then --- (a) rhs' has manifest arity +-- (a) rhs' has manifest arity n -- (b) if is_bot is True then rhs' applied to n args is guaranteed bottom tryEtaExpandRhs mode bndr rhs | Just join_arity <- isJoinId_maybe bndr @@ -1517,6 +1517,7 @@ tryEtaExpandRhs mode bndr rhs -- Note [Do not eta-expand join points] -- But do return the correct arity and bottom-ness, because -- these are used to set the bndr's IdInfo (#15517) + -- Note [idArity for join points] | otherwise = do { (new_arity, is_bot, new_rhs) <- try_expand @@ -1610,6 +1611,13 @@ CorePrep comes around, the code is very likely to look more like this: $j2 = if n > 0 then $j1 else (...) eta +Note [idArity for join points] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Because of Note [Do not eta-expand join points] we have it that the idArity +of a join point is always (less than or) equal to the join arity. +Essentially, for join points we set `idArity $j = count isId join_lam_bndrs`. +It really can be less if there are type-level binders in join_lam_bndrs. + Note [Do not eta-expand PAPs] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We used to have old_arity = manifestArity rhs, which meant that we ===================================== compiler/stranal/DmdAnal.hs ===================================== @@ -32,7 +32,7 @@ import Type import Coercion ( Coercion, coVarsOfCo ) import FamInstEnv import Util -import Maybes ( isJust ) +import Maybes ( fromMaybe, isJust ) import TysWiredIn import TysPrim ( realWorldStatePrimTy ) import ErrUtils ( dumpIfSet_dyn ) @@ -206,7 +206,6 @@ dmdAnal' env dmd (App fun arg) -- , text "overall res dmd_ty =" <+> ppr (res_ty `bothDmdType` arg_ty) ]) (res_ty `bothDmdType` arg_ty, App fun' arg') --- this is an anonymous lambda, since @dmdAnalRhsLetDown@ uses @collectBinders@ dmdAnal' env dmd (Lam var body) | isTyVar var = let @@ -286,10 +285,7 @@ dmdAnal' env dmd (Case scrut case_bndr ty alts) -- This is used for a non-recursive local let without manifest lambdas. -- This is the LetUp rule in the paper “Higher-Order Cardinality Analysis”. dmdAnal' env dmd (Let (NonRec id rhs) body) - | useLetUp id rhs - , Nothing <- unpackTrivial rhs - -- dmdAnalRhsLetDown treats trivial right hand sides specially - -- so if we have a trival right hand side, fall through to that. + | useLetUp id = (final_ty, Let (NonRec id' rhs') body') where (body_ty, body') = dmdAnal env dmd body @@ -582,25 +578,6 @@ environment, which effectively assigns them 'nopSig' (see "getStrictness") -} --- Trivial RHS --- See Note [Demand analysis for trivial right-hand sides] -dmdAnalTrivialRhs :: - AnalEnv -> Id -> CoreExpr -> Var -> - (DmdEnv, Id, CoreExpr) -dmdAnalTrivialRhs env id rhs fn - = (fn_fv, set_idStrictness env id fn_str, rhs) - where - fn_str = getStrictness env fn - fn_fv | isLocalId fn = unitVarEnv fn topDmd - | otherwise = emptyDmdEnv - -- Note [Remember to demand the function itself] - -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -- fn_fv: don't forget to produce a demand for fn itself - -- Lacking this caused #9128 - -- The demand is very conservative (topDmd), but that doesn't - -- matter; trivial bindings are usually inlined, so it only - -- kicks in for top-level bindings and NOINLINE bindings - -- Let bindings can be processed in two ways: -- Down (RHS before body) or Up (body before RHS). -- dmdAnalRhsLetDown implements the Down variant: @@ -621,28 +598,23 @@ dmdAnalRhsLetDown :: TopLevelFlag -- Process the RHS of the binding, add the strictness signature -- to the Id, and augment the environment with the signature as well. dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs - | Just fn <- unpackTrivial rhs -- See Note [Demand analysis for trivial right-hand sides] - = dmdAnalTrivialRhs env id rhs fn - - | otherwise - = (lazy_fv, id', mkLams bndrs' body') + = (lazy_fv, id', rhs') where - (bndrs, body, body_dmd) - = case isJoinId_maybe id of - Just join_arity -- See Note [Demand analysis for join points] - | (bndrs, body) <- collectNBinders join_arity rhs - -> (bndrs, body, let_dmd) - - Nothing | (bndrs, body) <- collectBinders rhs - -> (bndrs, body, mkBodyDmd env body) - - env_body = foldl' extendSigsWithLam env bndrs - (body_ty, body') = dmdAnal env_body body_dmd body - body_ty' = removeDmdTyArgs body_ty -- zap possible deep CPR info - (DmdType rhs_fv rhs_dmds rhs_res, bndrs') - = annotateLamBndrs env (isDFunId id) body_ty' bndrs - sig_ty = mkStrictSig (mkDmdType sig_fv rhs_dmds rhs_res') - id' = set_idStrictness env id sig_ty + rhs_arity = idArity id + rhs_dmd + -- See Note [Demand analysis for join points] + -- See Note [idArity for join points] in SimplUtils + -- rhs_arity matches the join arity of the join point + | isJoinId id + = mkCallDmds rhs_arity let_dmd + | otherwise + -- NB: rhs_arity + -- See Note [Demand signatures are computed for a threshold demand based on idArity] + = mkRhsDmd env rhs_arity rhs + (DmdType rhs_fv rhs_dmds rhs_res, rhs') + = dmdAnal env rhs_dmd rhs + sig = mkStrictSigForArity rhs_arity (mkDmdType sig_fv rhs_dmds rhs_res') + id' = set_idStrictness env id sig -- See Note [NOINLINE and strictness] @@ -666,36 +638,63 @@ dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs || not (isStrictDmd (idDemandInfo id) || ae_virgin env) -- See Note [Optimistic CPR in the "virgin" case] -mkBodyDmd :: AnalEnv -> CoreExpr -> CleanDemand --- See Note [Product demands for function body] -mkBodyDmd env body - = case deepSplitProductType_maybe (ae_fam_envs env) (exprType body) of - Nothing -> cleanEvalDmd - Just (dc, _, _, _) -> cleanEvalProdDmd (dataConRepArity dc) - -unpackTrivial :: CoreExpr -> Maybe Id --- Returns (Just v) if the arg is really equal to v, modulo --- casts, type applications etc --- See Note [Demand analysis for trivial right-hand sides] -unpackTrivial (Var v) = Just v -unpackTrivial (Cast e _) = unpackTrivial e -unpackTrivial (Lam v e) | isTyVar v = unpackTrivial e -unpackTrivial (App e a) | isTypeArg a = unpackTrivial e -unpackTrivial _ = Nothing - --- | If given the RHS of a let-binding, this 'useLetUp' determines --- whether we should process the binding up (body before rhs) or --- down (rhs before body). +-- | @mkRhsDmd env rhs_arity rhs@ creates a 'CleanDemand' for +-- unleashing on the given function's @rhs@, by creating a call demand of +-- @rhs_arity@ with a body demand appropriate for possible product types. +-- See Note [Product demands for function body]. +-- For example, a call of the form @mkRhsDmd _ 2 (\x y -> (x, y))@ returns a +-- clean usage demand of @C1(C1(U(U,U)))@. +mkRhsDmd :: AnalEnv -> Arity -> CoreExpr -> CleanDemand +mkRhsDmd env rhs_arity rhs = fromMaybe cleanEvalDmd $ do + let ts = findTypeShape (ae_fam_envs env) (exprType rhs) + TsProd tss <- peelTsFuns rhs_arity ts + pure (mkCallDmds rhs_arity (cleanEvalProdDmd (length tss))) + +-- | If given the let-bound 'Id', 'useLetUp' determines whether we should +-- process the binding up (body before rhs) or down (rhs before body). -- --- We use LetDown if there is a chance to get a useful strictness signature. --- This is the case when there are manifest value lambdas or the binding is a --- join point (hence always acts like a function, not a value). -useLetUp :: Var -> CoreExpr -> Bool -useLetUp f _ | isJoinId f = False -useLetUp f (Lam v e) | isTyVar v = useLetUp f e -useLetUp _ (Lam _ _) = False -useLetUp _ _ = True - +-- We use LetDown if there is a chance to get a useful strictness signature to +-- unleash at call sites. LetDown is generally more precise than LetUp if we can +-- correctly guess how it will be used in the body, that is, for which incoming +-- demand the strictness signature should be computed, which allows us to +-- unleash higher-order demands on arguments at call sites. This is mostly the +-- case when +-- +-- * The binding takes any arguments before performing meaningful work (cf. +-- 'idArity'), in which case we are interested to see how it uses them. +-- * The binding is a join point, hence acting like a function, not a value. +-- As a big plus, we know *precisely* how it will be used in the body; since +-- it's always tail-called, we can directly unleash the incoming demand of +-- the let binding on its RHS when computing a strictness signature. See +-- [Demand analysis for join points]. +-- +-- Thus, if the binding is not a join point and its arity is 0, we have a thunk +-- and use LetUp, implying that we have no usable demand signature available +-- when we analyse the let body. +-- +-- Since thunk evaluation is memoised, we want to unleash its 'DmdEnv' of free +-- vars at most once, regardless of how many times it was forced in the body. +-- This makes a real difference wrt. usage demands. The other reason is being +-- able to unleash a more precise product demand on its RHS once we know how the +-- thunk was used in the let body. +-- +-- Characteristic examples, always assuming a single evaluation: +-- +-- * @let x = 2*y in x + x@ => LetUp. Compared to LetDown, we find out that +-- the expression uses @y@ at most once. +-- * @let x = (a,b) in fst x@ => LetUp. Compared to LetDown, we find out that +-- @b@ is absent. +-- * @let f x = x*2 in f y@ => LetDown. Compared to LetUp, we find out that +-- the expression uses @y@ strictly, because we have @f@'s demand signature +-- available at the call site. +-- * @join exit = 2*y in if a then exit else if b then exit else 3*y@ => +-- LetDown. Compared to LetUp, we find out that the expression uses @y@ +-- strictly, because we can unleash @exit@'s signature at each call site. +-- * For a more convincing example with join points, see Note [Demand analysis +-- for join points]. +-- +useLetUp :: Var -> Bool +useLetUp f = idArity f == 0 && not (isJoinId f) {- Note [Demand analysis for join points] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -728,22 +727,141 @@ let_dmd here). Another win for join points! #13543. +Note [Demand signatures are computed for a threshold demand based on idArity] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We compute demand signatures assuming idArity incoming arguments to approximate +behavior for when we have a call site with at least that many arguments. idArity +is /at least/ the number of manifest lambdas, but might be higher for PAPs and +trivial RHS (see Note [Demand analysis for trivial right-hand sides]). + +Because idArity of a function varies independently of its cardinality properties +(cf. Note [idArity varies independently of dmdTypeDepth]), we implicitly encode +the arity for when a demand signature is sound to unleash in its 'dmdTypeDepth' +(cf. Note [Understanding DmdType and StrictSig] in Demand). It is unsound to +unleash a demand signature when the incoming number of arguments is less than +that. See Note [What are demand signatures?] for more details on soundness. + +Why idArity arguments? Because that's a conservative estimate of how many +arguments we must feed a function before it does anything interesting with them. +Also it elegantly subsumes the trivial RHS and PAP case. + +There might be functions for which we might want to analyse for more incoming +arguments than idArity. Example: + + f x = + if expensive + then \y -> ... y ... + else \y -> ... y ... + +We'd analyse `f` under a unary call demand C(S), corresponding to idArity +being 1. That's enough to look under the manifest lambda and find out how a +unary call would use `x`, but not enough to look into the lambdas in the if +branches. + +On the other hand, if we analysed for call demand C(C(S)), we'd get useful +strictness info for `y` (and more precise info on `x`) and possibly CPR +information, but + + * We would no longer be able to unleash the signature at unary call sites + * Performing the worker/wrapper split based on this information would be + implicitly eta-expanding `f`, playing fast and loose with divergence and + even being unsound in the presence of newtypes, so we refrain from doing so. + Also see Note [Don't eta expand in w/w] in WorkWrap. + +Since we only compute one signature, we do so for arity 1. Computing multiple +signatures for different arities (i.e., polyvariance) would be entirely +possible, if it weren't for the additional runtime and implementation +complexity. + +Note [idArity varies independently of dmdTypeDepth] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We used to check in CoreLint that dmdTypeDepth <= idArity for a let-bound +identifier. But that means we would have to zap demand signatures every time we +reset or decrease arity. That's an unnecessary dependency, because + + * The demand signature captures a semantic property that is independent of + what the binding's current arity is + * idArity is analysis information itself, thus volatile + * We already *have* dmdTypeDepth, wo why not just use it to encode the + threshold for when to unleash the signature + (cf. Note [Understanding DmdType and StrictSig] in Demand) + +Consider the following expression, for example: + + (let go x y = `x` seq ... in go) |> co + +`go` might have a strictness signature of ``. The simplifier will identify +`go` as a nullary join point through `joinPointBinding_maybe` and float the +coercion into the binding, leading to an arity decrease: + + join go = (\x y -> `x` seq ...) |> co in go + +With the CoreLint check, we would have to zap `go`'s perfectly viable strictness +signature. + +Note [What are demand signatures?] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand analysis interprets expressions in the abstract domain of demand +transformers. Given an incoming demand we put an expression under, its abstract +transformer gives us back a demand type denoting how other things (like +arguments and free vars) were used when the expression was evaluated. +Here's an example: + + f x y = + if x + expensive + then \z -> z + y * ... + else \z -> z * ... + +The abstract transformer (let's call it F_e) of the if expression (let's call it +e) would transform an incoming head demand into a demand type like +{x->,y->}. In pictures: + + Demand ---F_e---> DmdType + {x->,y->} + +Let's assume that the demand transformers we compute for an expression are +correct wrt. to some concrete semantics for Core. How do demand signatures fit +in? They are strange beasts, given that they come with strict rules when to +it's sound to unleash them. + +Fortunately, we can formalise the rules with Galois connections. Consider +f's strictness signature, {}. It's a single-point approximation of +the actual abstract transformer of f's RHS for arity 2. So, what happens is that +we abstract *once more* from the abstract domain we already are in, replacing +the incoming Demand by a simple lattice with two elements denoting incoming +arity: A_2 = {<2, >=2} (where '<2' is the top element and >=2 the bottom +element). Here's the diagram: + + A_2 -----f_f----> DmdType + ^ | + | α γ | + | v + Demand ---F_f---> DmdType + +With + α(C1(C1(_))) = >=2 -- example for usage demands, but similar for strictness + α(_) = <2 + γ(ty) = ty +and F_f being the abstract transformer of f's RHS and f_f being the abstracted +abstract transformer computable from our demand signature simply by + + f_f(>=2) = {} + f_f(<2) = postProcessUnsat {} + +where postProcessUnsat makes a proper top element out of the given demand type. + Note [Demand analysis for trivial right-hand sides] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider - foo = plusInt |> co + foo = plusInt |> co where plusInt is an arity-2 function with known strictness. Clearly we want plusInt's strictness to propagate to foo! But because it has no manifest lambdas, it won't do so automatically, and indeed 'co' might -have type (Int->Int->Int) ~ T, so we *can't* eta-expand. So we have a -special case for right-hand sides that are "trivial", namely variables, -casts, type applications, and the like. +have type (Int->Int->Int) ~ T. -Note that this can mean that 'foo' has an arity that is smaller than that -indicated by its demand info. e.g. if co :: (Int->Int->Int) ~ T, then -foo's arity will be zero (see Note [exprArity invariant] in CoreArity), -but its demand signature will be that of plusInt. A small example is the -test case of #8963. +Fortunately, CoreArity gives 'foo' arity 2, which is enough for LetDown to +forward plusInt's demand signature, and all is well (see Note [Newtype arity] in +CoreArity)! A small example is the test case NewtypeArity. Note [Product demands for function body] @@ -841,13 +959,6 @@ annotateBndr env dmd_ty var where (dmd_ty', dmd) = findBndrDmd env False dmd_ty var -annotateLamBndrs :: AnalEnv -> DFunFlag -> DmdType -> [Var] -> (DmdType, [Var]) -annotateLamBndrs env args_of_dfun ty bndrs = mapAccumR annotate ty bndrs - where - annotate dmd_ty bndr - | isId bndr = annotateLamIdBndr env args_of_dfun dmd_ty bndr - | otherwise = (dmd_ty, bndr) - annotateLamIdBndr :: AnalEnv -> DFunFlag -- is this lambda at the top of the RHS of a dfun? -> DmdType -- Demand type of body @@ -1160,12 +1271,6 @@ extendSigEnv top_lvl sigs var sig = extendVarEnv sigs var (sig, top_lvl) lookupSigEnv :: AnalEnv -> Id -> Maybe (StrictSig, TopLevelFlag) lookupSigEnv env id = lookupVarEnv (ae_sigs env) id -getStrictness :: AnalEnv -> Id -> StrictSig -getStrictness env fn - | isGlobalId fn = idStrictness fn - | Just (sig, _) <- lookupSigEnv env fn = sig - | otherwise = nopSig - nonVirgin :: AnalEnv -> AnalEnv nonVirgin env = env { ae_virgin = False } ===================================== compiler/stranal/WorkWrap.hs ===================================== @@ -9,6 +9,7 @@ module WorkWrap ( wwTopBinds ) where import GhcPrelude +import CoreArity ( manifestArity ) import CoreSyn import CoreUnfold ( certainlyWillInline, mkWwInlineRule, mkWorkerUnfolding ) import CoreUtils ( exprType, exprIsHNF ) @@ -457,7 +458,7 @@ tryWW dflags fam_envs is_rec fn_id rhs -- See Note [Don't w/w INLINE things] -- See Note [Don't w/w inline small non-loop-breaker things] - | is_fun + | is_fun && is_eta_exp = splitFun dflags fam_envs new_fn_id fn_info wrap_dmds res_info rhs | is_thunk -- See Note [Thunk splitting] @@ -474,9 +475,11 @@ tryWW dflags fam_envs is_rec fn_id rhs -- See Note [Zapping DmdEnv after Demand Analyzer] and -- See Note [Zapping Used Once info in WorkWrap] - is_fun = notNull wrap_dmds || isJoinId fn_id - is_thunk = not is_fun && not (exprIsHNF rhs) && not (isJoinId fn_id) - && not (isUnliftedType (idType fn_id)) + is_fun = notNull wrap_dmds || isJoinId fn_id + -- See Note [Don't eta expand in w/w] + is_eta_exp = length wrap_dmds == manifestArity rhs + is_thunk = not is_fun && not (exprIsHNF rhs) && not (isJoinId fn_id) + && not (isUnliftedType (idType fn_id)) {- Note [Zapping DmdEnv after Demand Analyzer] @@ -516,6 +519,30 @@ want to _keep_ the info for the code generator). We do not do it in the demand analyser for the same reasons outlined in Note [Zapping DmdEnv after Demand Analyzer] above. + +Note [Don't eta expand in w/w] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A binding where the manifestArity of the RHS is less than idArity of the binder +means CoreArity didn't eta expand that binding. When this happens, it does so +for a reason (see Note [exprArity invariant] in CoreArity) and we probably have +a PAP, cast or trivial expression as RHS. + +Performing the worker/wrapper split will implicitly eta-expand the binding to +idArity, overriding CoreArity's decision. Other than playing fast and loose with +divergence, it's also broken for newtypes: + + f = (\xy.blah) |> co + where + co :: (Int -> Int -> Char) ~ T + +Then idArity is 2 (despite the type T), and it can have a StrictSig based on a +threshold of 2. But we can't w/w it without a type error. + +The situation is less grave for PAPs, but the implicit eta expansion caused a +compiler allocation regression in T15164, where huge recursive instance method +groups, mostly consisting of PAPs, got w/w'd. This caused great churn in the +simplifier, when simply waiting for the PAPs to inline arrived at the same +output program. -} ===================================== compiler/stranal/WwLib.hs ===================================== @@ -134,7 +134,7 @@ mkWwBodies :: DynFlags -- wrap_fn_str E = case x of { (a,b) -> -- case a of { (a1,a2) -> -- E a1 a2 b y }} --- work_fn_str E = \a2 a2 b y -> +-- work_fn_str E = \a1 a2 b y -> -- let a = (a1,a2) in -- let x = (a,b) in -- E ===================================== testsuite/tests/perf/compiler/WWRec.hs ===================================== @@ -0,0 +1,73 @@ +{-# LANGUAGE MultiParamTypeClasses #-} +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE FlexibleInstances #-} +{-# LANGUAGE UndecidableInstances #-} + +module WWRec where + +class Rule f a where + get :: Decorator f => f a +class Monad f => Decorator f where + foo :: Rule f a => f a + +data A1 = MkA1 A2 +data A2 = MkA2 A3 +data A3 = MkA3 A4 +data A4 = MkA4 A5 +data A5 = MkA5 A6 +data A6 = MkA6 A7 +data A7 = MkA7 A8 +data A8 = MkA8 A9 +data A9 = MkA9 A10 +data A10 = MkA10 A11 +data A11 = MkA11 A12 +data A12 = MkA12 A13 +data A13 = MkA13 A14 +data A14 = MkA14 A15 +data A15 = MkA15 A16 +data A16 = MkA16 A17 +data A17 = MkA17 A18 +data A18 = MkA18 A19 +data A19 = MkA19 A20 +data A20 = MkA20 A21 +data A21 = MkA21 A22 +data A22 = MkA22 A23 +data A23 = MkA23 A24 +data A24 = MkA24 A25 +data A25 = MkA25 A26 +data A26 = MkA26 A27 +data A27 = MkA27 A28 +data A28 = MkA28 A29 +data A29 = MkA29 A30 +data A30 = MkA30 A1 + +instance Rule f A2 => Rule f A1 where get = MkA1 <$> foo +instance Rule f A3 => Rule f A2 where get = MkA2 <$> foo +instance Rule f A4 => Rule f A3 where get = MkA3 <$> foo +instance Rule f A5 => Rule f A4 where get = MkA4 <$> foo +instance Rule f A6 => Rule f A5 where get = MkA5 <$> foo +instance Rule f A7 => Rule f A6 where get = MkA6 <$> foo +instance Rule f A8 => Rule f A7 where get = MkA7 <$> foo +instance Rule f A9 => Rule f A8 where get = MkA8 <$> foo +instance Rule f A10 => Rule f A9 where get = MkA9 <$> foo +instance Rule f A11 => Rule f A10 where get = MkA10 <$> foo +instance Rule f A12 => Rule f A11 where get = MkA11 <$> foo +instance Rule f A13 => Rule f A12 where get = MkA12 <$> foo +instance Rule f A14 => Rule f A13 where get = MkA13 <$> foo +instance Rule f A15 => Rule f A14 where get = MkA14 <$> foo +instance Rule f A16 => Rule f A15 where get = MkA15 <$> foo +instance Rule f A17 => Rule f A16 where get = MkA16 <$> foo +instance Rule f A18 => Rule f A17 where get = MkA17 <$> foo +instance Rule f A19 => Rule f A18 where get = MkA18 <$> foo +instance Rule f A20 => Rule f A19 where get = MkA19 <$> foo +instance Rule f A21 => Rule f A20 where get = MkA20 <$> foo +instance Rule f A22 => Rule f A21 where get = MkA21 <$> foo +instance Rule f A23 => Rule f A22 where get = MkA22 <$> foo +instance Rule f A24 => Rule f A23 where get = MkA23 <$> foo +instance Rule f A25 => Rule f A24 where get = MkA24 <$> foo +instance Rule f A26 => Rule f A25 where get = MkA25 <$> foo +instance Rule f A27 => Rule f A26 where get = MkA26 <$> foo +instance Rule f A28 => Rule f A27 where get = MkA27 <$> foo +instance Rule f A29 => Rule f A28 where get = MkA28 <$> foo +instance Rule f A30 => Rule f A29 where get = MkA29 <$> foo +instance Rule f A1 => Rule f A30 where get = MkA30 <$> foo ===================================== testsuite/tests/perf/compiler/all.T ===================================== @@ -393,6 +393,13 @@ test ('T15164', compile, ['-v0 -O']) +# See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_186960 +test ('WWRec', + [ collect_compiler_stats('bytes allocated',10) + ], + compile, + ['-v0 -O']) + test('T16190', [ collect_stats(), when(opsys('mingw32'), expect_broken(16389)) ===================================== testsuite/tests/stranal/sigs/NewtypeArity.hs ===================================== @@ -0,0 +1,10 @@ +-- | 't' and 't2' should have a strictness signature for arity 2 here. +module Test where + +newtype T = MkT (Int -> Int -> Int) + +t :: T +t = MkT (\a b -> a + b) + +t2 :: T +t2 = MkT (+) ===================================== testsuite/tests/stranal/sigs/NewtypeArity.stderr ===================================== @@ -0,0 +1,18 @@ + +==================== Strictness signatures ==================== +Test.$tc'MkT: m +Test.$tcT: m +Test.$trModule: m +Test.t: m +Test.t2: m + + + +==================== Strictness signatures ==================== +Test.$tc'MkT: m +Test.$tcT: m +Test.$trModule: m +Test.t: m +Test.t2: m + + ===================================== testsuite/tests/stranal/sigs/all.T ===================================== @@ -17,3 +17,4 @@ test('BottomFromInnerLambda', normal, compile, ['']) test('DmdAnalGADTs', normal, compile, ['']) test('T12370', normal, compile, ['']) test('CaseBinderCPR', normal, compile, ['']) +test('NewtypeArity', normal, compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1a670dc0196aeeb37e7b452e6f1b0e28380f032f -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1a670dc0196aeeb37e7b452e6f1b0e28380f032f You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 14:27:52 2019 From: gitlab at gitlab.haskell.org (Krzysztof Gogolewski) Date: Fri, 05 Apr 2019 10:27:52 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/T16533 Message-ID: <5ca765e8776c_62b33fa2e6e4a61018026d6@gitlab.haskell.org.mail> Krzysztof Gogolewski pushed new branch wip/T16533 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/T16533 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 14:31:53 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Fri, 05 Apr 2019 10:31:53 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca766d9f0cba_62b33fa2e4facc6c18057de@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 8cfe66b7 by Joachim Breitner at 2019-04-05T14:31:17Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,31 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> case mkJumpToAddr <$> mPlatform <*> pure entry_addr of + Just code' -> pure code' + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mPlatform', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "Unknown obscure arch is not supported" + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode + code = if tables_next_to_code then Just code' else Nothing } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +83,35 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mPlatform :: Maybe Arch +mPlatform = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr :: Platform -> EntryFunPtr -> ItblCodes +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +271,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +314,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +354,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/8cfe66b7afcea709f37f9d6e414c12c7a3ee0543 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/8cfe66b7afcea709f37f9d6e414c12c7a3ee0543 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 15:09:47 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Fri, 05 Apr 2019 11:09:47 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca76fbbd1b65_62b33fa2e4facc6c1811971@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 37e5db1b by Joachim Breitner at 2019-04-05T15:09:32Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,31 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> case mkJumpToAddr <$> mArch <*> pure entry_addr of + Just code' -> pure code' + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mArch', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "Unknown obscure arch is not supported" + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode + code = if tables_next_to_code then Just code' else Nothing } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +83,35 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mArch :: Maybe Arch +mArch = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr :: Arch -> EntryFunPtr -> ItblCodes +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +271,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +314,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +354,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/37e5db1b1272a52bd80920f71d85c4df3f14cd73 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/37e5db1b1272a52bd80920f71d85c4df3f14cd73 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 15:53:23 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Fri, 05 Apr 2019 11:53:23 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca779f363ea5_62b38e29ad41819552@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: ccc8cfc9 by Joachim Breitner at 2019-04-05T15:53:15Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,29 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> case mkJumpToAddr <$> mArch <*> pure entry_addr of + Just code' -> pure code' + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mArch', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "Unknown obscure arch is not supported" + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode - then Just code' - else Nothing + code = code' } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +81,35 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mArch :: Maybe Arch +mArch = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr :: Arch -> EntryFunPtr -> ItblCodes +mkJumpToAddr platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +269,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +312,38 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 +sizeOfEntryCode :: Bool -> Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = 0 | otherwise = case mkJumpToAddr undefined of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode tables_next_to_code) -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +352,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/ccc8cfc902a352870b036744bde5a4a5fc37cc0e -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/ccc8cfc902a352870b036744bde5a4a5fc37cc0e You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 16:14:24 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Fri, 05 Apr 2019 12:14:24 -0400 Subject: [Git][ghc/ghc][wip/dmd-arity] Compute demand signatures assuming idArity Message-ID: <5ca77ee09e942_62b33fa2cc3d37c818222e6@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/dmd-arity at Glasgow Haskell Compiler / GHC Commits: f2105165 by Sebastian Graf at 2019-04-05T16:13:06Z Compute demand signatures assuming idArity This does four things: 1. Look at `idArity` instead of manifest lambdas to decide whether to use LetUp 2. Compute the strictness signature in LetDown assuming at least `idArity` incoming arguments 3. Remove the special case for trivial RHSs, which is subsumed by 2 4. Don't perform the W/W split when doing so would eta expand a binding. Otherwise we would eta expand PAPs, causing unnecessary churn in the Simplifier. NoFib Results -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- fannkuch-redux +0.3% 0.0% gg -0.0% -0.1% maillist +0.2% +0.2% minimax 0.0% +0.8% pretty 0.0% -0.1% scc 0.0% -0.1% -------------------------------------------------------------------------------- Min -0.0% -0.1% Max +0.3% +0.8% Geometric Mean +0.0% +0.0% - - - - - 17 changed files: - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/IdInfo.hs - compiler/basicTypes/Var.hs - compiler/coreSyn/CoreArity.hs - compiler/coreSyn/CoreLint.hs - compiler/coreSyn/CoreUnfold.hs - compiler/simplCore/SimplMonad.hs - compiler/simplCore/SimplUtils.hs - compiler/stranal/DmdAnal.hs - compiler/stranal/WorkWrap.hs - compiler/stranal/WwLib.hs - + testsuite/tests/perf/compiler/WWRec.hs - testsuite/tests/perf/compiler/all.T - + testsuite/tests/stranal/sigs/NewtypeArity.hs - + testsuite/tests/stranal/sigs/NewtypeArity.stderr - testsuite/tests/stranal/sigs/all.T Changes: ===================================== compiler/basicTypes/Demand.hs ===================================== @@ -22,7 +22,7 @@ module Demand ( DmdType(..), dmdTypeDepth, lubDmdType, bothDmdType, nopDmdType, botDmdType, mkDmdType, - addDemand, removeDmdTyArgs, + addDemand, ensureArgs, BothDmdArg, mkBothDmdArg, toBothDmdArg, DmdEnv, emptyDmdEnv, @@ -34,7 +34,7 @@ module Demand ( vanillaCprProdRes, cprSumRes, appIsBottom, isBottomingSig, pprIfaceStrictSig, trimCPRInfo, returnsCPR_maybe, - StrictSig(..), mkStrictSig, mkClosedStrictSig, + StrictSig(..), mkStrictSigForArity, mkClosedStrictSig, nopSig, botSig, cprProdSig, isTopSig, hasDemandEnvSig, splitStrictSig, strictSigDmdEnv, @@ -47,10 +47,10 @@ module Demand ( deferAfterIO, postProcessUnsat, postProcessDmdType, - splitProdDmd_maybe, peelCallDmd, peelManyCalls, mkCallDmd, + splitProdDmd_maybe, peelCallDmd, peelManyCalls, mkCallDmd, mkCallDmds, mkWorkerDemand, dmdTransformSig, dmdTransformDataConSig, dmdTransformDictSelSig, argOneShots, argsOneShots, saturatedByOneShots, - trimToType, TypeShape(..), + TypeShape(..), peelTsFuns, trimToType, useCount, isUsedOnce, reuseEnv, killUsageDemand, killUsageSig, zapUsageDemand, zapUsageEnvSig, @@ -675,10 +675,15 @@ mkProdDmd dx = JD { sd = mkSProd $ map getStrDmd dx , ud = mkUProd $ map getUseDmd dx } +-- | Wraps the 'CleanDemand' with a one-shot call demand: @d@ -> @C1(d)@. mkCallDmd :: CleanDemand -> CleanDemand mkCallDmd (JD {sd = d, ud = u}) = JD { sd = mkSCall d, ud = mkUCall One u } +-- | @mkCallDmds n d@ returns @C1(C1...(C1 d))@ where there are @n@ @C1@'s. +mkCallDmds :: Arity -> CleanDemand -> CleanDemand +mkCallDmds arity cd = iterate mkCallDmd cd !! arity + -- See Note [Demand on the worker] in WorkWrap mkWorkerDemand :: Int -> Demand mkWorkerDemand n = JD { sd = Lazy, ud = Use One (go n) } @@ -804,6 +809,13 @@ instance Outputable TypeShape where ppr (TsFun ts) = text "TsFun" <> parens (ppr ts) ppr (TsProd tss) = parens (hsep $ punctuate comma $ map ppr tss) +-- | @peelTsFuns n ts@ tries to peel off @n@ 'TsFun' constructors from @ts@ and +-- returns 'Just' the wrapped 'TypeShape' on success, and 'Nothing' otherwise. +peelTsFuns :: Arity -> TypeShape -> Maybe TypeShape +peelTsFuns 0 ts = Just ts +peelTsFuns n (TsFun ts) = peelTsFuns (n-1) ts +peelTsFuns _ _ = Nothing + trimToType :: Demand -> TypeShape -> Demand -- See Note [Trimming a demand to a type] trimToType (JD { sd = ms, ud = mu }) ts @@ -1207,12 +1219,8 @@ mkDmdType fv ds res = DmdType fv ds res dmdTypeDepth :: DmdType -> Arity dmdTypeDepth (DmdType _ ds _) = length ds --- Remove any demand on arguments. This is used in dmdAnalRhs on the body -removeDmdTyArgs :: DmdType -> DmdType -removeDmdTyArgs = ensureArgs 0 - --- This makes sure we can use the demand type with n arguments, --- It extends the argument list with the correct resTypeArgDmd +-- | This makes sure we can use the demand type with n arguments. +-- It extends the argument list with the correct resTypeArgDmd. -- It also adjusts the DmdResult: Divergence survives additional arguments, -- CPR information does not (and definite converge also would not). ensureArgs :: Arity -> DmdType -> DmdType @@ -1567,8 +1575,56 @@ and on the second, then returning a constructor. If this same function is applied to one arg, all we can say is that it uses x with , and its arg with demand . + +Note [Understanding DmdType and StrictSig] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand types are sound approximations of an expression's semantics relative to +the incoming demand we put the expression under. Consider the following +expression: + + \x y -> x `seq` (y, 2*x) + +Here is a table with demand types resulting from different incoming demands we +put that expression under. Note the monotonicity; a stronger incoming demand +yields a more precise demand type: + + incoming demand | demand type + ---------------------------------------------------- + | {} + | {} + | {} + +Note that in the first example, the depth of the demand type was *higher* than +the arity of the incoming call demand due to the anonymous lambda. +The converse is also possible and happens when we unleash demand signatures. +In @f x y@, the incoming call demand on f has arity 2. But if all we have is a +demand signature with depth 1 for @f@ (which we can safely unleash, see below), +the demand type of @f@ under a call demand of arity 2 has a *lower* depth of 1. + +So: Demand types are elicited by putting an expression under an incoming (call) +demand, the arity of which can be lower or higher than the depth of the +resulting demand type. +In contrast, a demand signature summarises a function's semantics *without* +immediately specifying the incoming demand it was produced under. Despite StrSig +being a newtype wrapper around DmdType, it actually encodes two things: + + * The threshold (i.e., minimum arity) to unleash the signature + * A demand type that is sound to unleash when the minimum arity requirement is + met. + +Here comes the subtle part: The threshold is encoded in the wrapped demand +type's depth! So in mkStrictSigForArity we make sure to trim the list of +argument demands to the given threshold arity. Call sites will make sure that +this corresponds to the arity of the call demand that elicited the wrapped +demand type. See also Note [What are demand signatures?] in DmdAnal. + +Besides trimming argument demands, mkStrictSigForArity will also trim CPR +information if necessary. -} +-- | The depth of the wrapped 'DmdType' encodes the arity at which it is safe +-- to unleash. Better construct this through 'mkStrictSigForArity'. +-- See Note [Understanding DmdType and StrictSig] newtype StrictSig = StrictSig DmdType deriving( Eq ) @@ -1580,34 +1636,43 @@ pprIfaceStrictSig :: StrictSig -> SDoc pprIfaceStrictSig (StrictSig (DmdType _ dmds res)) = hcat (map ppr dmds) <> ppr res -mkStrictSig :: DmdType -> StrictSig -mkStrictSig dmd_ty = StrictSig dmd_ty +-- | Turns a 'DmdType' computed for the particular 'Arity' into a 'StrictSig' +-- unleashable at that arity. See Note [Understanding DmdType and StrictSig] +mkStrictSigForArity :: Arity -> DmdType -> StrictSig +mkStrictSigForArity arity dmd_ty = StrictSig (ensureArgs arity dmd_ty) mkClosedStrictSig :: [Demand] -> DmdResult -> StrictSig -mkClosedStrictSig ds res = mkStrictSig (DmdType emptyDmdEnv ds res) +mkClosedStrictSig ds res = mkStrictSigForArity (length ds) (DmdType emptyDmdEnv ds res) splitStrictSig :: StrictSig -> ([Demand], DmdResult) splitStrictSig (StrictSig (DmdType _ dmds res)) = (dmds, res) increaseStrictSigArity :: Int -> StrictSig -> StrictSig --- Add extra arguments to a strictness signature +-- ^ Add extra arguments to a strictness signature. +-- In contrast to 'etaExpandStrictSig', this /prepends/ additional argument +-- demands and leaves CPR info intact. increaseStrictSigArity arity_increase sig@(StrictSig dmd_ty@(DmdType env dmds res)) | isTopDmdType dmd_ty = sig - | arity_increase <= 0 = sig + | arity_increase == 0 = sig + | arity_increase < 0 = WARN( True, text "increaseStrictSigArity:" + <+> text "negative arity increase" + <+> ppr arity_increase ) + nopSig | otherwise = StrictSig (DmdType env dmds' res) where dmds' = replicate arity_increase topDmd ++ dmds etaExpandStrictSig :: Arity -> StrictSig -> StrictSig --- We are expanding (\x y. e) to (\x y z. e z) --- Add exta demands to the /end/ of the arg demands if necessary -etaExpandStrictSig arity sig@(StrictSig dmd_ty@(DmdType env dmds res)) - | isTopDmdType dmd_ty = sig - | arity_increase <= 0 = sig - | otherwise = StrictSig (DmdType env dmds' res) - where - arity_increase = arity - length dmds - dmds' = dmds ++ replicate arity_increase topDmd +-- ^ We are expanding (\x y. e) to (\x y z. e z). +-- In contrast to 'increaseStrictSigArity', this /appends/ extra arg demands if +-- necessary, potentially destroying the signature's CPR property. +etaExpandStrictSig arity (StrictSig dmd_ty) + | arity < dmdTypeDepth dmd_ty + -- an arity decrease must zap the whole signature, because it was possibly + -- computed for a higher incoming call demand. + = nopSig + | otherwise + = StrictSig $ ensureArgs arity dmd_ty isTopSig :: StrictSig -> Bool isTopSig (StrictSig ty) = isTopDmdType ty ===================================== compiler/basicTypes/Id.hs ===================================== @@ -668,6 +668,7 @@ isBottomingId v | isId v = isBottomingSig (idStrictness v) | otherwise = False +-- | Accesses the 'Id''s 'strictnessInfo'. idStrictness :: Id -> StrictSig idStrictness id = strictnessInfo (idInfo id) ===================================== compiler/basicTypes/IdInfo.hs ===================================== @@ -237,22 +237,34 @@ pprIdDetails other = brackets (pp other) -- too big. data IdInfo = IdInfo { - arityInfo :: !ArityInfo, -- ^ 'Id' arity - ruleInfo :: RuleInfo, -- ^ Specialisations of the 'Id's function which exist - -- See Note [Specialisations and RULES in IdInfo] - unfoldingInfo :: Unfolding, -- ^ The 'Id's unfolding - cafInfo :: CafInfo, -- ^ 'Id' CAF info - oneShotInfo :: OneShotInfo, -- ^ Info about a lambda-bound variable, if the 'Id' is one - inlinePragInfo :: InlinePragma, -- ^ Any inline pragma atached to the 'Id' - occInfo :: OccInfo, -- ^ How the 'Id' occurs in the program - - strictnessInfo :: StrictSig, -- ^ A strictness signature - - demandInfo :: Demand, -- ^ ID demand information - callArityInfo :: !ArityInfo, -- ^ How this is called. - -- n <=> all calls have at least n arguments - - levityInfo :: LevityInfo -- ^ when applied, will this Id ever have a levity-polymorphic type? + arityInfo :: !ArityInfo, + -- ^ 'Id' arity, as computed by 'CoreArity'. Specifies how many + -- arguments this 'Id' has to be applied to before it doesn any + -- meaningful work. + ruleInfo :: RuleInfo, + -- ^ Specialisations of the 'Id's function which exist. + -- See Note [Specialisations and RULES in IdInfo] + unfoldingInfo :: Unfolding, + -- ^ The 'Id's unfolding + cafInfo :: CafInfo, + -- ^ 'Id' CAF info + oneShotInfo :: OneShotInfo, + -- ^ Info about a lambda-bound variable, if the 'Id' is one + inlinePragInfo :: InlinePragma, + -- ^ Any inline pragma atached to the 'Id' + occInfo :: OccInfo, + -- ^ How the 'Id' occurs in the program + strictnessInfo :: StrictSig, + -- ^ A strictness signature. Digests how a function uses its arguments + -- if applied to at least 'arityInfo' arguments. + demandInfo :: Demand, + -- ^ ID demand information + callArityInfo :: !ArityInfo, + -- ^ How this is called. This is the number of arguments to which a + -- binding can be eta-expanded without losing any sharing. + -- n <=> all calls have at least n arguments + levityInfo :: LevityInfo + -- ^ when applied, will this Id ever have a levity-polymorphic type? } -- Setters ===================================== compiler/basicTypes/Var.hs ===================================== @@ -700,6 +700,8 @@ setIdNotExported id = ASSERT( isLocalId id ) ************************************************************************ -} +-- | Is this a type-level (i.e., computationally irrelevant, thus erasable) +-- variable? Satisfies @isTyVar = not . isId at . isTyVar :: Var -> Bool -- True of both TyVar and TcTyVar isTyVar (TyVar {}) = True isTyVar (TcTyVar {}) = True @@ -712,17 +714,21 @@ isTcTyVar _ = False isTyCoVar :: Var -> Bool isTyCoVar v = isTyVar v || isCoVar v +-- | Is this a value-level (i.e., computationally relevant) 'Id'entifier? +-- Satisfies @isId = not . isTyVar at . isId :: Var -> Bool isId (Id {}) = True isId _ = False +-- | Is this a coercion variable? +-- Satisfies @'isId' v ==> 'isCoVar' v == not ('isNonCoVarId' v)@. isCoVar :: Var -> Bool --- A coercion variable isCoVar (Id { id_details = details }) = isCoVarDetails details isCoVar _ = False +-- | Is this a term variable ('Id') that is /not/ a coercion variable? +-- Satisfies @'isId' v ==> 'isCoVar' v == not ('isNonCoVarId' v)@. isNonCoVarId :: Var -> Bool --- A term variable (Id) that is /not/ a coercion variable isNonCoVarId (Id { id_details = details }) = not (isCoVarDetails details) isNonCoVarId _ = False ===================================== compiler/coreSyn/CoreArity.hs ===================================== @@ -158,7 +158,7 @@ exprBotStrictness_maybe e {- Note [exprArity invariant] ~~~~~~~~~~~~~~~~~~~~~~~~~~ -exprArity has the following invariant: +exprArity has the following invariants: (1) If typeArity (exprType e) = n, then manifestArity (etaExpand e n) = n ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -570,15 +570,9 @@ lintSingleBinding top_lvl_flag rec_flag (binder,rhs) (addWarnL (text "INLINE binder is (non-rule) loop breaker:" <+> ppr binder)) -- Only non-rule loop breakers inhibit inlining - -- Check whether arity and demand type are consistent (only if demand analysis - -- already happened) - -- - -- Note (Apr 2014): this is actually ok. See Note [Demand analysis for trivial right-hand sides] - -- in DmdAnal. After eta-expansion in CorePrep the rhs is no longer trivial. - -- ; let dmdTy = idStrictness binder - -- ; checkL (case dmdTy of - -- StrictSig dmd_ty -> idArity binder >= dmdTypeDepth dmd_ty || exprIsTrivial rhs) - -- (mkArityMsg binder) + -- We used to check that the dmdTypeDepth of a demand signature never + -- exceeds idArity, but that is an unnecessary complication, see + -- Note [idArity varies independently of dmdTypeDepth] in DmdAnal -- Check that the binder's arity is within the bounds imposed by -- the type and the strictness signature. See Note [exprArity invariant] @@ -2562,20 +2556,6 @@ mkKindErrMsg tyvar arg_ty hang (text "Arg type:") 4 (ppr arg_ty <+> dcolon <+> ppr (typeKind arg_ty))] -{- Not needed now -mkArityMsg :: Id -> MsgDoc -mkArityMsg binder - = vcat [hsep [text "Demand type has", - ppr (dmdTypeDepth dmd_ty), - text "arguments, rhs has", - ppr (idArity binder), - text "arguments,", - ppr binder], - hsep [text "Binder's strictness signature:", ppr dmd_ty] - - ] - where (StrictSig dmd_ty) = idStrictness binder --} mkCastErr :: CoreExpr -> Coercion -> Type -> Type -> MsgDoc mkCastErr expr = mk_cast_err "expression" "type" (ppr expr) ===================================== compiler/coreSyn/CoreUnfold.hs ===================================== @@ -1149,15 +1149,15 @@ certainlyWillInline dflags fn_info -- INLINABLE functions come via this path -- See Note [certainlyWillInline: INLINABLE] do_cunf expr (UnfIfGoodArgs { ug_size = size, ug_args = args }) - | not (null args) -- See Note [certainlyWillInline: be careful of thunks] + | arityInfo fn_info > 0 -- See Note [certainlyWillInline: be careful of thunks] , not (isBottomingSig (strictnessInfo fn_info)) -- Do not unconditionally inline a bottoming functions even if -- it seems smallish. We've carefully lifted it out to top level, -- so we don't want to re-inline it. - , let arity = length args - , size - (10 * (arity + 1)) <= ufUseThreshold dflags + , let unf_arity = length args + , size - (10 * (unf_arity + 1)) <= ufUseThreshold dflags = Just (fn_unf { uf_src = InlineStable - , uf_guidance = UnfWhen { ug_arity = arity + , uf_guidance = UnfWhen { ug_arity = unf_arity , ug_unsat_ok = unSaturatedOk , ug_boring_ok = inlineBoringOk expr } }) -- Note the "unsaturatedOk". A function like f = \ab. a @@ -1175,6 +1175,17 @@ found that the WorkWrap phase thought that y = case x of F# v -> F# (v +# v) was certainlyWillInline, so the addition got duplicated. +Note that we check arityInfo instead of the arity of the unfolding to detect +this case. This is so that we don't accidentally fail to inline small partial +applications, like `f = g 42` (where `g` recurses into `f`) where g has arity 2 +(say). Here there is no risk of work duplication, and the RHS is tiny, so +certainlyWillInline should return True. But `unf_arity` is zero! However f's +arity, gotten from `arityInfo fn_info`, is 1. + +Failing to say that `f` will inline forces W/W to generate a potentially huge +worker for f that will immediately cancel with `g`'s wrapper anyway, causing +unnecessary churn in the Simplifier while arriving at the same result. + Note [certainlyWillInline: INLINABLE] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ certainlyWillInline /must/ return Nothing for a large INLINABLE thing, ===================================== compiler/simplCore/SimplMonad.hs ===================================== @@ -21,7 +21,7 @@ module SimplMonad ( import GhcPrelude -import Var ( Var, isTyVar, mkLocalVar ) +import Var ( Var, isId, mkLocalVar ) import Name ( mkSystemVarName ) import Id ( Id, mkSysLocalOrCoVar ) import IdInfo ( IdDetails(..), vanillaIdInfo, setArityInfo ) @@ -187,7 +187,8 @@ newJoinId bndrs body_ty = do { uniq <- getUniqueM ; let name = mkSystemVarName uniq (fsLit "$j") join_id_ty = mkLamTypes bndrs body_ty -- Note [Funky mkLamTypes] - arity = length (filter (not . isTyVar) bndrs) + -- Note [idArity for join points] in SimplUtils + arity = length (filter isId bndrs) join_arity = length bndrs details = JoinId join_arity id_info = vanillaIdInfo `setArityInfo` arity ===================================== compiler/simplCore/SimplUtils.hs ===================================== @@ -1508,7 +1508,7 @@ tryEtaExpandRhs :: SimplMode -> OutId -> OutExpr -> SimplM (Arity, Bool, OutExpr) -- See Note [Eta-expanding at let bindings] -- If tryEtaExpandRhs rhs = (n, is_bot, rhs') then --- (a) rhs' has manifest arity +-- (a) rhs' has manifest arity n -- (b) if is_bot is True then rhs' applied to n args is guaranteed bottom tryEtaExpandRhs mode bndr rhs | Just join_arity <- isJoinId_maybe bndr @@ -1517,6 +1517,7 @@ tryEtaExpandRhs mode bndr rhs -- Note [Do not eta-expand join points] -- But do return the correct arity and bottom-ness, because -- these are used to set the bndr's IdInfo (#15517) + -- Note [idArity for join points] | otherwise = do { (new_arity, is_bot, new_rhs) <- try_expand @@ -1610,6 +1611,13 @@ CorePrep comes around, the code is very likely to look more like this: $j2 = if n > 0 then $j1 else (...) eta +Note [idArity for join points] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Because of Note [Do not eta-expand join points] we have it that the idArity +of a join point is always (less than or) equal to the join arity. +Essentially, for join points we set `idArity $j = count isId join_lam_bndrs`. +It really can be less if there are type-level binders in join_lam_bndrs. + Note [Do not eta-expand PAPs] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We used to have old_arity = manifestArity rhs, which meant that we ===================================== compiler/stranal/DmdAnal.hs ===================================== @@ -206,7 +206,6 @@ dmdAnal' env dmd (App fun arg) -- , text "overall res dmd_ty =" <+> ppr (res_ty `bothDmdType` arg_ty) ]) (res_ty `bothDmdType` arg_ty, App fun' arg') --- this is an anonymous lambda, since @dmdAnalRhsLetDown@ uses @collectBinders@ dmdAnal' env dmd (Lam var body) | isTyVar var = let @@ -286,10 +285,7 @@ dmdAnal' env dmd (Case scrut case_bndr ty alts) -- This is used for a non-recursive local let without manifest lambdas. -- This is the LetUp rule in the paper “Higher-Order Cardinality Analysis”. dmdAnal' env dmd (Let (NonRec id rhs) body) - | useLetUp id rhs - , Nothing <- unpackTrivial rhs - -- dmdAnalRhsLetDown treats trivial right hand sides specially - -- so if we have a trival right hand side, fall through to that. + | useLetUp id = (final_ty, Let (NonRec id' rhs') body') where (body_ty, body') = dmdAnal env dmd body @@ -582,25 +578,6 @@ environment, which effectively assigns them 'nopSig' (see "getStrictness") -} --- Trivial RHS --- See Note [Demand analysis for trivial right-hand sides] -dmdAnalTrivialRhs :: - AnalEnv -> Id -> CoreExpr -> Var -> - (DmdEnv, Id, CoreExpr) -dmdAnalTrivialRhs env id rhs fn - = (fn_fv, set_idStrictness env id fn_str, rhs) - where - fn_str = getStrictness env fn - fn_fv | isLocalId fn = unitVarEnv fn topDmd - | otherwise = emptyDmdEnv - -- Note [Remember to demand the function itself] - -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -- fn_fv: don't forget to produce a demand for fn itself - -- Lacking this caused #9128 - -- The demand is very conservative (topDmd), but that doesn't - -- matter; trivial bindings are usually inlined, so it only - -- kicks in for top-level bindings and NOINLINE bindings - -- Let bindings can be processed in two ways: -- Down (RHS before body) or Up (body before RHS). -- dmdAnalRhsLetDown implements the Down variant: @@ -621,28 +598,23 @@ dmdAnalRhsLetDown :: TopLevelFlag -- Process the RHS of the binding, add the strictness signature -- to the Id, and augment the environment with the signature as well. dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs - | Just fn <- unpackTrivial rhs -- See Note [Demand analysis for trivial right-hand sides] - = dmdAnalTrivialRhs env id rhs fn - - | otherwise - = (lazy_fv, id', mkLams bndrs' body') + = (lazy_fv, id', rhs') where - (bndrs, body, body_dmd) - = case isJoinId_maybe id of - Just join_arity -- See Note [Demand analysis for join points] - | (bndrs, body) <- collectNBinders join_arity rhs - -> (bndrs, body, let_dmd) - - Nothing | (bndrs, body) <- collectBinders rhs - -> (bndrs, body, mkBodyDmd env body) - - env_body = foldl' extendSigsWithLam env bndrs - (body_ty, body') = dmdAnal env_body body_dmd body - body_ty' = removeDmdTyArgs body_ty -- zap possible deep CPR info - (DmdType rhs_fv rhs_dmds rhs_res, bndrs') - = annotateLamBndrs env (isDFunId id) body_ty' bndrs - sig_ty = mkStrictSig (mkDmdType sig_fv rhs_dmds rhs_res') - id' = set_idStrictness env id sig_ty + rhs_arity = idArity id + rhs_dmd + -- See Note [Demand analysis for join points] + -- See Note [idArity for join points] in SimplUtils + -- rhs_arity matches the join arity of the join point + | isJoinId id + = mkCallDmds rhs_arity let_dmd + | otherwise + -- NB: rhs_arity + -- See Note [Demand signatures are computed for a threshold demand based on idArity] + = mkRhsDmd env rhs_arity rhs + (DmdType rhs_fv rhs_dmds rhs_res, rhs') + = dmdAnal env rhs_dmd rhs + sig = mkStrictSigForArity rhs_arity (mkDmdType sig_fv rhs_dmds rhs_res') + id' = set_idStrictness env id sig -- See Note [NOINLINE and strictness] @@ -666,36 +638,63 @@ dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs || not (isStrictDmd (idDemandInfo id) || ae_virgin env) -- See Note [Optimistic CPR in the "virgin" case] -mkBodyDmd :: AnalEnv -> CoreExpr -> CleanDemand --- See Note [Product demands for function body] -mkBodyDmd env body - = case deepSplitProductType_maybe (ae_fam_envs env) (exprType body) of - Nothing -> cleanEvalDmd - Just (dc, _, _, _) -> cleanEvalProdDmd (dataConRepArity dc) - -unpackTrivial :: CoreExpr -> Maybe Id --- Returns (Just v) if the arg is really equal to v, modulo --- casts, type applications etc --- See Note [Demand analysis for trivial right-hand sides] -unpackTrivial (Var v) = Just v -unpackTrivial (Cast e _) = unpackTrivial e -unpackTrivial (Lam v e) | isTyVar v = unpackTrivial e -unpackTrivial (App e a) | isTypeArg a = unpackTrivial e -unpackTrivial _ = Nothing - --- | If given the RHS of a let-binding, this 'useLetUp' determines --- whether we should process the binding up (body before rhs) or --- down (rhs before body). +-- | @mkRhsDmd env rhs_arity rhs@ creates a 'CleanDemand' for +-- unleashing on the given function's @rhs@, by creating a call demand of +-- @rhs_arity@ with a body demand appropriate for possible product types. +-- See Note [Product demands for function body]. +-- For example, a call of the form @mkRhsDmd _ 2 (\x y -> (x, y))@ returns a +-- clean usage demand of @C1(C1(U(U,U)))@. +mkRhsDmd :: AnalEnv -> Arity -> CoreExpr -> CleanDemand +mkRhsDmd env rhs_arity rhs = + case peelTsFuns rhs_arity (findTypeShape (ae_fam_envs env) (exprType rhs)) of + Just (TsProd tss) -> mkCallDmds rhs_arity (cleanEvalProdDmd (length tss)) + _ -> mkCallDmds rhs_arity cleanEvalDmd + +-- | If given the let-bound 'Id', 'useLetUp' determines whether we should +-- process the binding up (body before rhs) or down (rhs before body). -- --- We use LetDown if there is a chance to get a useful strictness signature. --- This is the case when there are manifest value lambdas or the binding is a --- join point (hence always acts like a function, not a value). -useLetUp :: Var -> CoreExpr -> Bool -useLetUp f _ | isJoinId f = False -useLetUp f (Lam v e) | isTyVar v = useLetUp f e -useLetUp _ (Lam _ _) = False -useLetUp _ _ = True - +-- We use LetDown if there is a chance to get a useful strictness signature to +-- unleash at call sites. LetDown is generally more precise than LetUp if we can +-- correctly guess how it will be used in the body, that is, for which incoming +-- demand the strictness signature should be computed, which allows us to +-- unleash higher-order demands on arguments at call sites. This is mostly the +-- case when +-- +-- * The binding takes any arguments before performing meaningful work (cf. +-- 'idArity'), in which case we are interested to see how it uses them. +-- * The binding is a join point, hence acting like a function, not a value. +-- As a big plus, we know *precisely* how it will be used in the body; since +-- it's always tail-called, we can directly unleash the incoming demand of +-- the let binding on its RHS when computing a strictness signature. See +-- [Demand analysis for join points]. +-- +-- Thus, if the binding is not a join point and its arity is 0, we have a thunk +-- and use LetUp, implying that we have no usable demand signature available +-- when we analyse the let body. +-- +-- Since thunk evaluation is memoised, we want to unleash its 'DmdEnv' of free +-- vars at most once, regardless of how many times it was forced in the body. +-- This makes a real difference wrt. usage demands. The other reason is being +-- able to unleash a more precise product demand on its RHS once we know how the +-- thunk was used in the let body. +-- +-- Characteristic examples, always assuming a single evaluation: +-- +-- * @let x = 2*y in x + x@ => LetUp. Compared to LetDown, we find out that +-- the expression uses @y@ at most once. +-- * @let x = (a,b) in fst x@ => LetUp. Compared to LetDown, we find out that +-- @b@ is absent. +-- * @let f x = x*2 in f y@ => LetDown. Compared to LetUp, we find out that +-- the expression uses @y@ strictly, because we have @f@'s demand signature +-- available at the call site. +-- * @join exit = 2*y in if a then exit else if b then exit else 3*y@ => +-- LetDown. Compared to LetUp, we find out that the expression uses @y@ +-- strictly, because we can unleash @exit@'s signature at each call site. +-- * For a more convincing example with join points, see Note [Demand analysis +-- for join points]. +-- +useLetUp :: Var -> Bool +useLetUp f = idArity f == 0 && not (isJoinId f) {- Note [Demand analysis for join points] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -728,22 +727,141 @@ let_dmd here). Another win for join points! #13543. +Note [Demand signatures are computed for a threshold demand based on idArity] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We compute demand signatures assuming idArity incoming arguments to approximate +behavior for when we have a call site with at least that many arguments. idArity +is /at least/ the number of manifest lambdas, but might be higher for PAPs and +trivial RHS (see Note [Demand analysis for trivial right-hand sides]). + +Because idArity of a function varies independently of its cardinality properties +(cf. Note [idArity varies independently of dmdTypeDepth]), we implicitly encode +the arity for when a demand signature is sound to unleash in its 'dmdTypeDepth' +(cf. Note [Understanding DmdType and StrictSig] in Demand). It is unsound to +unleash a demand signature when the incoming number of arguments is less than +that. See Note [What are demand signatures?] for more details on soundness. + +Why idArity arguments? Because that's a conservative estimate of how many +arguments we must feed a function before it does anything interesting with them. +Also it elegantly subsumes the trivial RHS and PAP case. + +There might be functions for which we might want to analyse for more incoming +arguments than idArity. Example: + + f x = + if expensive + then \y -> ... y ... + else \y -> ... y ... + +We'd analyse `f` under a unary call demand C(S), corresponding to idArity +being 1. That's enough to look under the manifest lambda and find out how a +unary call would use `x`, but not enough to look into the lambdas in the if +branches. + +On the other hand, if we analysed for call demand C(C(S)), we'd get useful +strictness info for `y` (and more precise info on `x`) and possibly CPR +information, but + + * We would no longer be able to unleash the signature at unary call sites + * Performing the worker/wrapper split based on this information would be + implicitly eta-expanding `f`, playing fast and loose with divergence and + even being unsound in the presence of newtypes, so we refrain from doing so. + Also see Note [Don't eta expand in w/w] in WorkWrap. + +Since we only compute one signature, we do so for arity 1. Computing multiple +signatures for different arities (i.e., polyvariance) would be entirely +possible, if it weren't for the additional runtime and implementation +complexity. + +Note [idArity varies independently of dmdTypeDepth] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We used to check in CoreLint that dmdTypeDepth <= idArity for a let-bound +identifier. But that means we would have to zap demand signatures every time we +reset or decrease arity. That's an unnecessary dependency, because + + * The demand signature captures a semantic property that is independent of + what the binding's current arity is + * idArity is analysis information itself, thus volatile + * We already *have* dmdTypeDepth, wo why not just use it to encode the + threshold for when to unleash the signature + (cf. Note [Understanding DmdType and StrictSig] in Demand) + +Consider the following expression, for example: + + (let go x y = `x` seq ... in go) |> co + +`go` might have a strictness signature of ``. The simplifier will identify +`go` as a nullary join point through `joinPointBinding_maybe` and float the +coercion into the binding, leading to an arity decrease: + + join go = (\x y -> `x` seq ...) |> co in go + +With the CoreLint check, we would have to zap `go`'s perfectly viable strictness +signature. + +Note [What are demand signatures?] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand analysis interprets expressions in the abstract domain of demand +transformers. Given an incoming demand we put an expression under, its abstract +transformer gives us back a demand type denoting how other things (like +arguments and free vars) were used when the expression was evaluated. +Here's an example: + + f x y = + if x + expensive + then \z -> z + y * ... + else \z -> z * ... + +The abstract transformer (let's call it F_e) of the if expression (let's call it +e) would transform an incoming head demand into a demand type like +{x->,y->}. In pictures: + + Demand ---F_e---> DmdType + {x->,y->} + +Let's assume that the demand transformers we compute for an expression are +correct wrt. to some concrete semantics for Core. How do demand signatures fit +in? They are strange beasts, given that they come with strict rules when to +it's sound to unleash them. + +Fortunately, we can formalise the rules with Galois connections. Consider +f's strictness signature, {}. It's a single-point approximation of +the actual abstract transformer of f's RHS for arity 2. So, what happens is that +we abstract *once more* from the abstract domain we already are in, replacing +the incoming Demand by a simple lattice with two elements denoting incoming +arity: A_2 = {<2, >=2} (where '<2' is the top element and >=2 the bottom +element). Here's the diagram: + + A_2 -----f_f----> DmdType + ^ | + | α γ | + | v + Demand ---F_f---> DmdType + +With + α(C1(C1(_))) = >=2 -- example for usage demands, but similar for strictness + α(_) = <2 + γ(ty) = ty +and F_f being the abstract transformer of f's RHS and f_f being the abstracted +abstract transformer computable from our demand signature simply by + + f_f(>=2) = {} + f_f(<2) = postProcessUnsat {} + +where postProcessUnsat makes a proper top element out of the given demand type. + Note [Demand analysis for trivial right-hand sides] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider - foo = plusInt |> co + foo = plusInt |> co where plusInt is an arity-2 function with known strictness. Clearly we want plusInt's strictness to propagate to foo! But because it has no manifest lambdas, it won't do so automatically, and indeed 'co' might -have type (Int->Int->Int) ~ T, so we *can't* eta-expand. So we have a -special case for right-hand sides that are "trivial", namely variables, -casts, type applications, and the like. +have type (Int->Int->Int) ~ T. -Note that this can mean that 'foo' has an arity that is smaller than that -indicated by its demand info. e.g. if co :: (Int->Int->Int) ~ T, then -foo's arity will be zero (see Note [exprArity invariant] in CoreArity), -but its demand signature will be that of plusInt. A small example is the -test case of #8963. +Fortunately, CoreArity gives 'foo' arity 2, which is enough for LetDown to +forward plusInt's demand signature, and all is well (see Note [Newtype arity] in +CoreArity)! A small example is the test case NewtypeArity. Note [Product demands for function body] @@ -841,13 +959,6 @@ annotateBndr env dmd_ty var where (dmd_ty', dmd) = findBndrDmd env False dmd_ty var -annotateLamBndrs :: AnalEnv -> DFunFlag -> DmdType -> [Var] -> (DmdType, [Var]) -annotateLamBndrs env args_of_dfun ty bndrs = mapAccumR annotate ty bndrs - where - annotate dmd_ty bndr - | isId bndr = annotateLamIdBndr env args_of_dfun dmd_ty bndr - | otherwise = (dmd_ty, bndr) - annotateLamIdBndr :: AnalEnv -> DFunFlag -- is this lambda at the top of the RHS of a dfun? -> DmdType -- Demand type of body @@ -1160,12 +1271,6 @@ extendSigEnv top_lvl sigs var sig = extendVarEnv sigs var (sig, top_lvl) lookupSigEnv :: AnalEnv -> Id -> Maybe (StrictSig, TopLevelFlag) lookupSigEnv env id = lookupVarEnv (ae_sigs env) id -getStrictness :: AnalEnv -> Id -> StrictSig -getStrictness env fn - | isGlobalId fn = idStrictness fn - | Just (sig, _) <- lookupSigEnv env fn = sig - | otherwise = nopSig - nonVirgin :: AnalEnv -> AnalEnv nonVirgin env = env { ae_virgin = False } ===================================== compiler/stranal/WorkWrap.hs ===================================== @@ -9,6 +9,7 @@ module WorkWrap ( wwTopBinds ) where import GhcPrelude +import CoreArity ( manifestArity ) import CoreSyn import CoreUnfold ( certainlyWillInline, mkWwInlineRule, mkWorkerUnfolding ) import CoreUtils ( exprType, exprIsHNF ) @@ -457,7 +458,7 @@ tryWW dflags fam_envs is_rec fn_id rhs -- See Note [Don't w/w INLINE things] -- See Note [Don't w/w inline small non-loop-breaker things] - | is_fun + | is_fun && is_eta_exp = splitFun dflags fam_envs new_fn_id fn_info wrap_dmds res_info rhs | is_thunk -- See Note [Thunk splitting] @@ -474,9 +475,11 @@ tryWW dflags fam_envs is_rec fn_id rhs -- See Note [Zapping DmdEnv after Demand Analyzer] and -- See Note [Zapping Used Once info in WorkWrap] - is_fun = notNull wrap_dmds || isJoinId fn_id - is_thunk = not is_fun && not (exprIsHNF rhs) && not (isJoinId fn_id) - && not (isUnliftedType (idType fn_id)) + is_fun = notNull wrap_dmds || isJoinId fn_id + -- See Note [Don't eta expand in w/w] + is_eta_exp = length wrap_dmds == manifestArity rhs + is_thunk = not is_fun && not (exprIsHNF rhs) && not (isJoinId fn_id) + && not (isUnliftedType (idType fn_id)) {- Note [Zapping DmdEnv after Demand Analyzer] @@ -516,6 +519,36 @@ want to _keep_ the info for the code generator). We do not do it in the demand analyser for the same reasons outlined in Note [Zapping DmdEnv after Demand Analyzer] above. + +Note [Don't eta expand in w/w] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A binding where the manifestArity of the RHS is less than idArity of the binder +means CoreArity didn't eta expand that binding. When this happens, it does so +for a reason (see Note [exprArity invariant] in CoreArity) and we probably have +a PAP, cast or trivial expression as RHS. + +Performing the worker/wrapper split will implicitly eta-expand the binding to +idArity, overriding CoreArity's decision. Other than playing fast and loose with +divergence, it's also broken for newtypes: + + f = (\xy.blah) |> co + where + co :: (Int -> Int -> Char) ~ T + +Then idArity is 2 (despite the type T), and it can have a StrictSig based on a +threshold of 2. But we can't w/w it without a type error. + +The situation is less grave for PAPs, but the implicit eta expansion caused a +compiler allocation regression in T15164, where huge recursive instance method +groups, mostly consisting of PAPs, got w/w'd. This caused great churn in the +simplifier, when simply waiting for the PAPs to inline arrived at the same +output program. + +Note there is the worry here that such PAPs and trivial RHSs might not *always* +be inlined. That would lead to reboxing, because the analysis tacitly assumes +that we W/W'd for idArity and will propagate analysis information under that +assumption. So far, this doesn't seem to matter in practice. +See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_192064. -} ===================================== compiler/stranal/WwLib.hs ===================================== @@ -134,7 +134,7 @@ mkWwBodies :: DynFlags -- wrap_fn_str E = case x of { (a,b) -> -- case a of { (a1,a2) -> -- E a1 a2 b y }} --- work_fn_str E = \a2 a2 b y -> +-- work_fn_str E = \a1 a2 b y -> -- let a = (a1,a2) in -- let x = (a,b) in -- E ===================================== testsuite/tests/perf/compiler/WWRec.hs ===================================== @@ -0,0 +1,73 @@ +{-# LANGUAGE MultiParamTypeClasses #-} +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE FlexibleInstances #-} +{-# LANGUAGE UndecidableInstances #-} + +module WWRec where + +class Rule f a where + get :: Decorator f => f a +class Monad f => Decorator f where + foo :: Rule f a => f a + +data A1 = MkA1 A2 +data A2 = MkA2 A3 +data A3 = MkA3 A4 +data A4 = MkA4 A5 +data A5 = MkA5 A6 +data A6 = MkA6 A7 +data A7 = MkA7 A8 +data A8 = MkA8 A9 +data A9 = MkA9 A10 +data A10 = MkA10 A11 +data A11 = MkA11 A12 +data A12 = MkA12 A13 +data A13 = MkA13 A14 +data A14 = MkA14 A15 +data A15 = MkA15 A16 +data A16 = MkA16 A17 +data A17 = MkA17 A18 +data A18 = MkA18 A19 +data A19 = MkA19 A20 +data A20 = MkA20 A21 +data A21 = MkA21 A22 +data A22 = MkA22 A23 +data A23 = MkA23 A24 +data A24 = MkA24 A25 +data A25 = MkA25 A26 +data A26 = MkA26 A27 +data A27 = MkA27 A28 +data A28 = MkA28 A29 +data A29 = MkA29 A30 +data A30 = MkA30 A1 + +instance Rule f A2 => Rule f A1 where get = MkA1 <$> foo +instance Rule f A3 => Rule f A2 where get = MkA2 <$> foo +instance Rule f A4 => Rule f A3 where get = MkA3 <$> foo +instance Rule f A5 => Rule f A4 where get = MkA4 <$> foo +instance Rule f A6 => Rule f A5 where get = MkA5 <$> foo +instance Rule f A7 => Rule f A6 where get = MkA6 <$> foo +instance Rule f A8 => Rule f A7 where get = MkA7 <$> foo +instance Rule f A9 => Rule f A8 where get = MkA8 <$> foo +instance Rule f A10 => Rule f A9 where get = MkA9 <$> foo +instance Rule f A11 => Rule f A10 where get = MkA10 <$> foo +instance Rule f A12 => Rule f A11 where get = MkA11 <$> foo +instance Rule f A13 => Rule f A12 where get = MkA12 <$> foo +instance Rule f A14 => Rule f A13 where get = MkA13 <$> foo +instance Rule f A15 => Rule f A14 where get = MkA14 <$> foo +instance Rule f A16 => Rule f A15 where get = MkA15 <$> foo +instance Rule f A17 => Rule f A16 where get = MkA16 <$> foo +instance Rule f A18 => Rule f A17 where get = MkA17 <$> foo +instance Rule f A19 => Rule f A18 where get = MkA18 <$> foo +instance Rule f A20 => Rule f A19 where get = MkA19 <$> foo +instance Rule f A21 => Rule f A20 where get = MkA20 <$> foo +instance Rule f A22 => Rule f A21 where get = MkA21 <$> foo +instance Rule f A23 => Rule f A22 where get = MkA22 <$> foo +instance Rule f A24 => Rule f A23 where get = MkA23 <$> foo +instance Rule f A25 => Rule f A24 where get = MkA24 <$> foo +instance Rule f A26 => Rule f A25 where get = MkA25 <$> foo +instance Rule f A27 => Rule f A26 where get = MkA26 <$> foo +instance Rule f A28 => Rule f A27 where get = MkA27 <$> foo +instance Rule f A29 => Rule f A28 where get = MkA28 <$> foo +instance Rule f A30 => Rule f A29 where get = MkA29 <$> foo +instance Rule f A1 => Rule f A30 where get = MkA30 <$> foo ===================================== testsuite/tests/perf/compiler/all.T ===================================== @@ -393,6 +393,13 @@ test ('T15164', compile, ['-v0 -O']) +# See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_186960 +test ('WWRec', + [ collect_compiler_stats('bytes allocated',10) + ], + compile, + ['-v0 -O']) + test('T16190', [ collect_stats(), when(opsys('mingw32'), expect_broken(16389)) ===================================== testsuite/tests/stranal/sigs/NewtypeArity.hs ===================================== @@ -0,0 +1,10 @@ +-- | 't' and 't2' should have a strictness signature for arity 2 here. +module Test where + +newtype T = MkT (Int -> Int -> Int) + +t :: T +t = MkT (\a b -> a + b) + +t2 :: T +t2 = MkT (+) ===================================== testsuite/tests/stranal/sigs/NewtypeArity.stderr ===================================== @@ -0,0 +1,18 @@ + +==================== Strictness signatures ==================== +Test.$tc'MkT: m +Test.$tcT: m +Test.$trModule: m +Test.t: m +Test.t2: m + + + +==================== Strictness signatures ==================== +Test.$tc'MkT: m +Test.$tcT: m +Test.$trModule: m +Test.t: m +Test.t2: m + + ===================================== testsuite/tests/stranal/sigs/all.T ===================================== @@ -17,3 +17,4 @@ test('BottomFromInnerLambda', normal, compile, ['']) test('DmdAnalGADTs', normal, compile, ['']) test('T12370', normal, compile, ['']) test('CaseBinderCPR', normal, compile, ['']) +test('NewtypeArity', normal, compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/f2105165903426ff5f86045f4a9533ac023d9939 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/f2105165903426ff5f86045f4a9533ac023d9939 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 20:28:12 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Fri, 05 Apr 2019 16:28:12 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca7ba5c6b489_62b33fa2cc3d37c81836772@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: e7369f6b by Joachim Breitner at 2019-04-05T20:28:04Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,23 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> mkJumpToAddr entry_addr + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode - then Just code' - else Nothing + code = code' } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +75,46 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mkJumpToAddr :: Monad m => EntryFunPtr-> m ItblCodes +mkJumpToAddr ptr = do + arch <- case mArch of + Just a -> pure a + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mArch', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "mkJumpToAddr: Unknown obscure arch is not supported with TABLES_NEXT_TO_CODE" + pure $ mkJumpToAddr' arch ptr + +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mArch :: Maybe Arch +mArch = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr' :: Arch -> EntryFunPtr -> ItblCodes +mkJumpToAddr' platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +274,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +317,40 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 - | otherwise = - case mkJumpToAddr undefined of +sizeOfEntryCode :: Monad m => Bool -> m Int +sizeOfEntryCode tables_next_to_code = do + | not tables_next_to_code = pure 0 + | otherwise = do + code' <- mkJumpToAddr undefined + pure $ case code' of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do + sz0 <- sizeOfEntryCode tables_next_to_code let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral $ conInfoTableSizeB + sz0 -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +359,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e7369f6bf0351ab9fe0752afe27e71f6e691d694 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e7369f6bf0351ab9fe0752afe27e71f6e691d694 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 5 22:15:37 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Fri, 05 Apr 2019 18:15:37 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca7d38979386_62b33fa2c58046b41837815@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 43f6cd44 by Joachim Breitner at 2019-04-05T22:11:03Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,23 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> mkJumpToAddr entry_addr + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode - then Just code' - else Nothing + code = code' } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +75,46 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mkJumpToAddr :: Monad m => EntryFunPtr-> m ItblCodes +mkJumpToAddr ptr = do + arch <- case mArch of + Just a -> pure a + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mArch', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "mkJumpToAddr: Unknown obscure arch is not supported with TABLES_NEXT_TO_CODE" + pure $ mkJumpToAddr' arch ptr + +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mArch :: Maybe Arch +mArch = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr' :: Arch -> EntryFunPtr -> ItblCodes +mkJumpToAddr' platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +274,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +317,40 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 - | otherwise = - case mkJumpToAddr undefined of +sizeOfEntryCode :: Monad m => Bool -> m Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = pure 0 + | otherwise = do + code' <- mkJumpToAddr undefined + pure $ case code' of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do + sz0 <- sizeOfEntryCode tables_next_to_code let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral $ conInfoTableSizeB + sz0 -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +359,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/43f6cd445d1c4f211cd059dff07c0f3f1df6b15d -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/43f6cd445d1c4f211cd059dff07c0f3f1df6b15d You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 01:01:51 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Fri, 05 Apr 2019 21:01:51 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca7fa7fb70b8_62b33fa2c58046b4185204a@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: db63d200 by Joachim Breitner at 2019-04-06T01:01:31Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,6 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +875,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,18 +1617,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - --- Determines whether we will be compiling --- info tables that reside just before the entry code, or with an --- indirection to the entry code. See TABLES_NEXT_TO_CODE in --- includes/rts/storage/InfoTables.h. -mkTablesNextToCode :: Bool -> Bool -mkTablesNextToCode unregisterised - = not unregisterised && cGhcEnableTablesNextToCode == "YES" - data DynLibLoader = Deployable | SystemDependent @@ -1874,6 +1865,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,23 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> mkJumpToAddr entry_addr + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode - then Just code' - else Nothing + code = code' } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +75,46 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mkJumpToAddr :: MonadFail m => EntryFunPtr-> m ItblCodes +mkJumpToAddr ptr = do + arch <- case mArch of + Just a -> pure a + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mArch', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "mkJumpToAddr: Unknown obscure arch is not supported with TABLES_NEXT_TO_CODE" + pure $ mkJumpToAddr' arch ptr + +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mArch :: Maybe Arch +mArch = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr' :: Arch -> EntryFunPtr -> ItblCodes +mkJumpToAddr' platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +274,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +317,40 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 - | otherwise = - case mkJumpToAddr undefined of +sizeOfEntryCode :: MonadFail m => Bool -> m Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = pure 0 + | otherwise = do + code' <- mkJumpToAddr undefined + pure $ case code' of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do + sz0 <- sizeOfEntryCode tables_next_to_code let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral $ conInfoTableSizeB + sz0 -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +359,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/db63d200e85f73887ada6223e629fca88392754b -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/db63d200e85f73887ada6223e629fca88392754b You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 04:38:45 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 00:38:45 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 17 commits: Replace git.haskell.org with gitlab.haskell.org (#16196) Message-ID: <5ca82d5597d49_62b33fa2c8df70001858063@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 92b2f16b by Ben Gamari at 2019-04-06T02:26:30Z gitlab-ci: Test using slowtest in deb9-debug job - - - - - 6beb2fd1 by Ben Gamari at 2019-04-06T02:26:30Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - 8ea9b0b6 by Ömer Sinan Ağacan at 2019-04-06T02:26:30Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - bf8699fc by Ben Gamari at 2019-04-06T02:26:30Z testsuite: Make closureSize less sensitive to optimisation - - - - - 0b65c722 by Ben Gamari at 2019-04-06T02:26:30Z process: Skip process005 in ghci way - - - - - a70c9a98 by Ben Gamari at 2019-04-06T02:26:30Z testsuite: Mark T13167 as fragile in threaded2 As noted in #16536. - - - - - 938ed3f1 by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - 4aff3254 by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark T14272 as broken in optasm - - - - - b77cc30c by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - 58cbe0a3 by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark T16180 as broken in ghci way As noted in #16541. - - - - - dba27fea by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - 1c283c1b by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - 68cc8db4 by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - e6e5279a by Ben Gamari at 2019-04-06T02:26:38Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - 30 changed files: - .gitlab-ci.yml - HACKING.md - README.md - boot - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/types/Type.hs - hadrian/appveyor.yml - hadrian/doc/windows.md - libraries/base/base.cabal - libraries/base/tests/all.T - libraries/ghc-boot-th/ghc-boot-th.cabal.in - libraries/ghc-boot/ghc-boot.cabal.in - libraries/ghc-compact/ghc-compact.cabal - libraries/ghc-heap/ghc-heap.cabal.in - libraries/ghc-heap/tests/all.T - libraries/ghc-heap/tests/closure_size.hs - libraries/ghc-prim/ghc-prim.cabal - libraries/ghci/ghci.cabal.in - libraries/integer-simple/integer-simple.cabal - libraries/process - libraries/template-haskell/template-haskell.cabal.in - testsuite/driver/testlib.py - testsuite/tests/dependent/should_compile/all.T - + testsuite/tests/deriving/should_compile/T16518.hs - testsuite/tests/deriving/should_compile/all.T - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/043c00bc5418d51c94b7a9e346bc9c11e0415825...e6e5279a9a9f5cfa0a562b17432bd80b5675774d -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/043c00bc5418d51c94b7a9e346bc9c11e0415825...e6e5279a9a9f5cfa0a562b17432bd80b5675774d You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 13:39:37 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 09:39:37 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 11 commits: process: Bump submodule Message-ID: <5ca8ac19cb480_62b33fa2c61dc2541892095@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: ad8e7efb by Ben Gamari at 2019-04-06T13:39:09Z process: Bump submodule * Skip process005 in ghci way * Mark process002 as fragile in threaded2 - - - - - 56502e5e by Ben Gamari at 2019-04-06T13:39:27Z testsuite: Mark T13167 as fragile in threaded2 As noted in #16536. - - - - - 5d4f1f10 by Ben Gamari at 2019-04-06T13:39:27Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - ad9ca3fe by Ben Gamari at 2019-04-06T13:39:27Z testsuite: Mark T14272 as broken in optasm - - - - - a6bc671a by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - c5e4b6b2 by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Mark T16180 as broken in ghci way As noted in #16541. - - - - - 218dfef2 by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - 0bc3ea26 by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - 8fb72f3c by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - a2214c77 by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - 4ee82fe1 by Ben Gamari at 2019-04-06T13:39:28Z testsuite: Fix fragile_for test modifier - - - - - 11 changed files: - libraries/base/tests/all.T - libraries/ghc-heap/tests/all.T - libraries/process - testsuite/driver/testlib.py - testsuite/tests/dependent/should_compile/all.T - testsuite/tests/rts/all.T - testsuite/tests/simplCore/should_compile/all.T - testsuite/tests/th/all.T - testsuite/tests/typecheck/should_compile/all.T - testsuite/tests/typecheck/should_run/all.T - testsuite/tests/warnings/should_compile/all.T Changes: ===================================== libraries/base/tests/all.T ===================================== @@ -233,5 +233,5 @@ test('T3474', test('T14425', normal, compile_and_run, ['']) test('T10412', normal, compile_and_run, ['']) test('T13896', normal, compile_and_run, ['']) -test('T13167', normal, compile_and_run, ['']) +test('T13167', fragile_for(16536, ['threaded2']), compile_and_run, ['']) test('T15349', [exit_code(1), expect_broken_for(15349, 'ghci')], compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/all.T ===================================== @@ -7,5 +7,5 @@ test('heap_all', ], compile_and_run, ['']) test('closure_size', - omit_ways(['ghci', 'hpc', 'prof']), + [omit_ways(['ghci', 'hpc', 'prof']), expect_broken_for(16544, ['dyn', 'optasm', 'threaded2'])], compile_and_run, ['']) ===================================== libraries/process ===================================== @@ -1 +1 @@ -Subproject commit d860209e53c1b40b7c251fc8378886bbcb394402 +Subproject commit 5c81524694ceaf8523a1846718a7a7c3f402124f ===================================== testsuite/driver/testlib.py ===================================== @@ -258,7 +258,7 @@ def fragile( bug ): return helper -def fragile_for( name, opts, bug, ways ): +def fragile_for( bug, ways ): """ Indicates that the test should be skipped due to fragility in the given test ways as documented in the given ticket. @@ -1408,7 +1408,6 @@ def simple_run(name, way, prog, extra_run_opts): return failBecause('bad stderr') if not (opts.ignore_stdout or stdout_ok(name, way)): return failBecause('bad stdout') - check_hp = '-h' in my_rts_flags and opts.check_hp check_prof = '-p' in my_rts_flags ===================================== testsuite/tests/dependent/should_compile/all.T ===================================== @@ -40,7 +40,7 @@ test('T12742', normal, compile, ['']) # (1) Use -fexternal-interpreter, or # (2) Build the program twice: once with -dynamic, and then # with -prof using -osuf to set a different object file suffix. -test('T13910', omit_ways(['profasm']), compile, ['']) +test('T13910', [expect_broken_for(16537, ['optasm']), omit_ways(['profasm'])], compile, ['']) test('T13938', [extra_files(['T13938a.hs'])], makefile_test, ['T13938']) test('T14556', normal, compile, ['']) test('T14720', normal, compile, ['']) ===================================== testsuite/tests/rts/all.T ===================================== @@ -73,7 +73,7 @@ test('T2047', [ignore_stdout, extra_run_opts('+RTS -c -RTS')], # Blackhole-detection test. # Skip GHCi due to #2786 test('T2783', [ omit_ways(['ghci']), exit_code(1) - , expect_broken_for(2783, ['threaded1']) + , fragile_for(2783, ['threaded1']) ], compile_and_run, ['']) # Test the work-stealing deque implementation. We run this test in ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -275,7 +275,7 @@ test('T14140', normal, makefile_test, ['T14140']) -test('T14272', normal, compile, ['']) +test('T14272', expect_broken_for(16539, ['optasm']), compile, ['']) test('T14270a', normal, compile, ['']) test('T14152', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile, ['-ddump-simpl']) test('T14152a', [extra_files(['T14152.hs']), pre_cmd('cp T14152.hs T14152a.hs'), ===================================== testsuite/tests/th/all.T ===================================== @@ -467,7 +467,7 @@ test('T15437', expect_broken(15437), multimod_compile, test('T15985', normal, compile, ['']) test('T16133', normal, compile_fail, ['']) test('T15471', normal, multimod_compile, ['T15471.hs', '-v0']) -test('T16180', normal, compile_and_run, ['-package ghc']) +test('T16180', expect_broken_for(16541, ['ghci']), compile_and_run, ['-package ghc']) test('T16183', normal, compile, ['-v0 -ddump-splices -dsuppress-uniques']) test('T16195', normal, multimod_compile, ['T16195.hs', '-v0']) test('T16293b', normal, compile, ['']) ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -659,7 +659,7 @@ test('T15586', normal, compile, ['']) test('T15368', normal, compile, ['-fdefer-type-errors']) test('T15645', normal, compile, ['-Wwarn=missing-monadfail-instances']) test('T15778', normal, compile, ['']) -test('T14761c', normal, compile, ['']) +test('T14761c', expect_broken_for(16540, ['hpc', 'optasm']), compile, ['']) test('T16008', normal, compile, ['']) test('T16033', normal, compile, ['']) test('T16141', normal, compile, ['-O']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -35,7 +35,7 @@ test('tcrun018', normal, compile_and_run, ['']) test('tcrun019', normal, compile_and_run, ['']) test('tcrun020', normal, compile_and_run, ['']) test('tcrun021', normal, compile_and_run, ['-package containers']) -test('tcrun022', omit_ways(['ghci']), compile_and_run, ['-O']) +test('tcrun022', omit_ways(['hpc', 'ghci']), compile_and_run, ['-O']) test('tcrun023', [when(doing_ghci(), extra_hc_opts('-fobject-code'))], compile_and_run, ['-O']) test('tcrun024', normal, compile_and_run, ['-O']) ===================================== testsuite/tests/warnings/should_compile/all.T ===================================== @@ -25,4 +25,4 @@ test('MissingMod', normal, multimod_compile, ['MissingMod', '-Wmissing-home-modu test('StarBinder', normal, compile, ['']) -test('Overflow', normal, compile, ['']) +test('Overflow', expect_broken_for(16543, ['hpc']), compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/e6e5279a9a9f5cfa0a562b17432bd80b5675774d...4ee82fe1c4fbdef48b7abc5c830223d9e7d38e52 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/e6e5279a9a9f5cfa0a562b17432bd80b5675774d...4ee82fe1c4fbdef48b7abc5c830223d9e7d38e52 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 13:49:06 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 09:49:06 -0400 Subject: [Git][ghc/ghc] Deleted branch wip/ghc-8.6-merges Message-ID: <5ca8ae5264c2e_62b33fa2f24d845418958ce@gitlab.haskell.org.mail> Ben Gamari deleted branch wip/ghc-8.6-merges at Glasgow Haskell Compiler / GHC -- You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 13:49:07 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 09:49:07 -0400 Subject: [Git][ghc/ghc][ghc-8.6] 3 commits: gitlab-ci: Build hyperlinked sources for releases Message-ID: <5ca8ae5311d0_62b33fa2eb4293341896026@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.6 at Glasgow Haskell Compiler / GHC Commits: e04e3d81 by Ben Gamari at 2019-04-04T16:35:47Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. (cherry picked from commit a32ac2f4d963b657c0a53359b492c593e82304b1) - - - - - 9cf1f91b by klebinger.andreas at gmx.at at 2019-04-04T16:35:47Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. (cherry picked from commit 9b131500371a07626e33edc56700c12322364560) - - - - - d2a284ab by Ben Gamari at 2019-04-04T16:35:47Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. (cherry picked from commit 7b090b53fea065d2cfd967ea919426af9ba8d737) - - - - - 3 changed files: - .gitlab-ci.yml - aclocal.m4 - rts/StgCRun.c Changes: ===================================== .gitlab-ci.yml ===================================== @@ -152,6 +152,12 @@ validate-x86_64-darwin: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" + # Build hyperlinked sources for documentation when building releases + - | + if [[ -n "$CI_COMMIT_TAG" ]]; then + echo "EXTRA_HADDOCK_OPTS += --hyperlinked-source --quickjump" >> mk/build.mk + fi - bash .circleci/prepare-system.sh # workaround for docker permissions ===================================== aclocal.m4 ===================================== @@ -288,11 +288,31 @@ AC_DEFUN([FPTOOLS_SET_HASKELL_PLATFORM_VARS], esac } + dnl Note [autoconf assembler checks and -flto] + dnl ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + dnl + dnl Autoconf's AC_COMPILE_IFELSE macro is fragile in the case of checks + dnl which require that the assembler is run. Specifically, GCC does not run + dnl the assembler if invoked with `-c -flto`; it merely dumps its internal + dnl AST to the object file, to be compiled and assembled during the final + dnl link. + dnl + dnl This can cause configure checks like that for the + dnl .subsections_via_symbols directive to pass unexpected (see #16440), + dnl leading the build system to incorrectly conclude that the directive is + dnl supported. + dnl + dnl For this reason, it is important that configure checks that rely on the + dnl assembler failing use AC_LINK_IFELSE rather than AC_COMPILE_IFELSE, + dnl ensuring that the assembler sees the check. + dnl + dnl ** check for Apple-style dead-stripping support dnl (.subsections-via-symbols assembler directive) AC_MSG_CHECKING(for .subsections_via_symbols) - AC_COMPILE_IFELSE( + dnl See Note [autoconf assembler checks and -flto] + AC_LINK_IFELSE( [AC_LANG_PROGRAM([], [__asm__ (".subsections_via_symbols");])], [AC_MSG_RESULT(yes) HaskellHaveSubsectionsViaSymbols=True @@ -305,8 +325,9 @@ AC_DEFUN([FPTOOLS_SET_HASKELL_PLATFORM_VARS], dnl ** check for .ident assembler directive AC_MSG_CHECKING(whether your assembler supports .ident directive) - AC_COMPILE_IFELSE( - [AC_LANG_SOURCE([__asm__ (".ident \"GHC x.y.z\"");])], + dnl See Note [autoconf assembler checks and -flto] + AC_LINK_IFELSE( + [AC_LANG_PROGRAM([__asm__ (".ident \"GHC x.y.z\"");], [])], [AC_MSG_RESULT(yes) HaskellHaveIdentDirective=True], [AC_MSG_RESULT(no) @@ -330,8 +351,15 @@ AC_DEFUN([FPTOOLS_SET_HASKELL_PLATFORM_VARS], ;; esac AC_MSG_CHECKING(for GNU non-executable stack support) - AC_COMPILE_IFELSE( - [AC_LANG_PROGRAM([__asm__ (".section .note.GNU-stack,\"\",$progbits");], [0])], + dnl See Note [autoconf assembler checks and -flto] + AC_LINK_IFELSE( + dnl the `main` function is placed after the .note.GNU-stack directive + dnl so we need to ensure that the active segment is correctly set, + dnl otherwise `main` will be placed in the wrong segment. + [AC_LANG_PROGRAM([ + __asm__ (".section .note.GNU-stack,\"\",$progbits"); + __asm__ (".section .text"); + ], [0])], [AC_MSG_RESULT(yes) HaskellHaveGnuNonexecStack=True], [AC_MSG_RESULT(no) ===================================== rts/StgCRun.c ===================================== @@ -489,15 +489,15 @@ StgRunIsImplementedInAssembler(void) "movq 48(%%rsp),%%rdi\n\t" "movq 56(%%rsp),%%rsi\n\t" "movq 64(%%rsp),%%xmm6\n\t" - "movq 72(%%rax),%%xmm7\n\t" - "movq 80(%%rax),%%xmm8\n\t" - "movq 88(%%rax),%%xmm9\n\t" - "movq 96(%%rax),%%xmm10\n\t" - "movq 104(%%rax),%%xmm11\n\t" - "movq 112(%%rax),%%xmm12\n\t" - "movq 120(%%rax),%%xmm13\n\t" - "movq 128(%%rax),%%xmm14\n\t" - "movq 136(%%rax),%%xmm15\n\t" + "movq 72(%%rsp),%%xmm7\n\t" + "movq 80(%%rsp),%%xmm8\n\t" + "movq 88(%%rsp),%%xmm9\n\t" + "movq 96(%%rsp),%%xmm10\n\t" + "movq 104(%%rsp),%%xmm11\n\t" + "movq 112(%%rsp),%%xmm12\n\t" + "movq 120(%%rsp),%%xmm13\n\t" + "movq 128(%%rsp),%%xmm14\n\t" + "movq 136(%%rsp),%%xmm15\n\t" #endif "addq %1, %%rsp\n\t" "retq" View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/97c1ef86decbf633c58804125ef46f69e84119d9...d2a284ab461681919cadaed394adebe42c4cc7bb -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/97c1ef86decbf633c58804125ef46f69e84119d9...d2a284ab461681919cadaed394adebe42c4cc7bb You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 13:50:51 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 09:50:51 -0400 Subject: [Git][ghc/ghc][ghc-8.8] 14 commits: Update binary submodule to latest master branch tip Message-ID: <5ca8aebbbe68_62b33fa2f1f4fcbc1901714@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.8 at Glasgow Haskell Compiler / GHC Commits: ac12033a by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update binary submodule to latest master branch tip - - - - - 6ac90706 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update text submodule - - - - - 29e38980 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update unix submodule - - - - - 48989b49 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update deepseq submodule - - - - - 9d228b75 by Herbert Valerio Riedel at 2019-04-05T17:10:45Z Update haskeline submodule - - - - - ab97500f by Herbert Valerio Riedel at 2019-04-05T17:11:06Z Update parsec submodule - - - - - 950d45b9 by Herbert Valerio Riedel at 2019-04-05T17:11:35Z Update process submodule - - - - - 47b1a718 by Herbert Valerio Riedel at 2019-04-05T17:12:01Z Update stm submodule - - - - - 8cb38504 by Herbert Valerio Riedel at 2019-04-05T17:12:28Z Update terminfo submodule - - - - - 94a576f3 by Herbert Valerio Riedel at 2019-04-05T17:13:52Z Update hpc submodule - - - - - dd26d493 by Herbert Valerio Riedel at 2019-04-05T17:14:19Z Update filepath submodule - - - - - 460eec60 by Herbert Valerio Riedel at 2019-04-05T17:14:40Z Update directory submodule - - - - - 3ab1b786 by Herbert Valerio Riedel at 2019-04-05T17:16:27Z Update parallel submodule - - - - - 86ce5718 by Herbert Valerio Riedel at 2019-04-05T17:20:02Z base: Remove `Monad(fail)` method and reexport `MonadFail(fail)` instead As per https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail Coauthored-by: Ben Gamari <ben at well-typed.com> - - - - - 30 changed files: - .gitmodules - compiler/ghc.cabal.in - compiler/main/DynFlags.hs - compiler/prelude/PrelNames.hs - compiler/rename/RnExpr.hs - compiler/rename/RnSource.hs - compiler/simplCore/SimplCore.hs - compiler/typecheck/TcMatches.hs - ghc/ghc-bin.cabal.in - libraries/base/Control/Monad.hs - libraries/base/Control/Monad/ST/Lazy/Imp.hs - libraries/base/GHC/Base.hs - libraries/base/GHC/Conc/Sync.hs - libraries/base/GHC/TopHandler.hs - libraries/base/Prelude.hs - libraries/base/System/IO.hs - libraries/base/Text/ParserCombinators/ReadP.hs - libraries/base/Text/ParserCombinators/ReadPrec.hs - libraries/base/base.cabal - libraries/binary - libraries/deepseq - libraries/directory - libraries/filepath - libraries/haskeline - libraries/hpc - libraries/parallel - libraries/parsec - libraries/process - libraries/stm - libraries/terminfo The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d90dcd4ac41c2c1310e1496104c8aeaf76c51b1c...86ce5718b18a31d85c8321a6c37b0b150e8f3c29 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d90dcd4ac41c2c1310e1496104c8aeaf76c51b1c...86ce5718b18a31d85c8321a6c37b0b150e8f3c29 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 14:10:20 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 10:10:20 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] 25 commits: Bump Haddock submodule Message-ID: <5ca8b34c55e0c_62b33fa2c61dc254190871a@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: 882ed7d4 by Alec Theriault at 2019-03-29T19:47:06Z Bump Haddock submodule There shouldn't be any more features added to the `ghc-8.8` branch of Haddock after this bump. - - - - - 463a4e4d by Matthew Pickering at 2019-04-01T06:43:57Z Don't overwrite the set log_action when using --interactive -ddump-json didn't work with --interactive as --interactive overwrote the log_action in terms of defaultLogAction. Reviewers: bgamari Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14078 Differential Revision: https://phabricator.haskell.org/D4533 (cherry picked from commit 10faf44d97095b2f8516b6d449d266f6889dcd70) - - - - - 6c27fd35 by Matthew Pickering at 2019-04-01T06:50:04Z Only build vanilla way in devel2 flavour Fixes #16210 (cherry picked from commit e7e5f4aebec2f095071a1949bb4595744754ee04) - - - - - 9ba0a4bd by Edward Z. Yang at 2019-04-01T06:56:10Z Fix #16219: TemplateHaskell causes indefinite package build error It should work to write an indefinite package using TemplateHaskell, so long as all of the actual TH code lives outside of the package. However, cleverness we had to build TH code even when building with -fno-code meant that we attempted to build object code for modules in an indefinite package, even when the signatures were not instantiated. This patch disables said logic in the event that an indefinite package is being typechecked. Signed-off-by: Edward Z. Yang <ezyang at fb.com> Test Plan: validate Reviewers: simonpj, bgamari Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #16219 Differential Revision: https://phabricator.haskell.org/D5475 (cherry picked from commit d6d735c1114082b9e9cc1ba7da87c49f52891320) - - - - - db5a43a9 by Ryan Scott at 2019-04-02T18:22:28Z Correct duplicate 4.12.0.0 entry in base's changelog See #16490. [ci skip] - - - - - c0661417 by Ben Gamari at 2019-04-02T18:22:48Z Bump transformers to 0.5.6.2 See #16199. - - - - - d90dcd4a by Ryan Scott at 2019-04-02T18:24:17Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - ac12033a by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update binary submodule to latest master branch tip - - - - - 6ac90706 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update text submodule - - - - - 29e38980 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update unix submodule - - - - - 48989b49 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update deepseq submodule - - - - - 9d228b75 by Herbert Valerio Riedel at 2019-04-05T17:10:45Z Update haskeline submodule - - - - - ab97500f by Herbert Valerio Riedel at 2019-04-05T17:11:06Z Update parsec submodule - - - - - 950d45b9 by Herbert Valerio Riedel at 2019-04-05T17:11:35Z Update process submodule - - - - - 47b1a718 by Herbert Valerio Riedel at 2019-04-05T17:12:01Z Update stm submodule - - - - - 8cb38504 by Herbert Valerio Riedel at 2019-04-05T17:12:28Z Update terminfo submodule - - - - - 94a576f3 by Herbert Valerio Riedel at 2019-04-05T17:13:52Z Update hpc submodule - - - - - dd26d493 by Herbert Valerio Riedel at 2019-04-05T17:14:19Z Update filepath submodule - - - - - 460eec60 by Herbert Valerio Riedel at 2019-04-05T17:14:40Z Update directory submodule - - - - - 3ab1b786 by Herbert Valerio Riedel at 2019-04-05T17:16:27Z Update parallel submodule - - - - - 86ce5718 by Herbert Valerio Riedel at 2019-04-05T17:20:02Z base: Remove `Monad(fail)` method and reexport `MonadFail(fail)` instead As per https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail Coauthored-by: Ben Gamari <ben at well-typed.com> - - - - - c64c528f by klebinger.andreas at gmx.at at 2019-04-06T14:09:52Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. (cherry picked from commit 9b131500371a07626e33edc56700c12322364560) - - - - - 61c410f1 by Ben Gamari at 2019-04-06T14:09:53Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. (cherry picked from commit 7b090b53fea065d2cfd967ea919426af9ba8d737) - - - - - 7a70eb74 by Ben Gamari at 2019-04-06T14:09:53Z gitlab-ci: Backport from master - - - - - 25b2de03 by Ben Gamari at 2019-04-06T14:09:53Z gitab - - - - - 30 changed files: - .gitlab-ci.yml - .gitmodules - aclocal.m4 - compiler/ghc.cabal.in - compiler/main/DynFlags.hs - compiler/main/GhcMake.hs - compiler/main/Packages.hs - compiler/prelude/PrelNames.hs - compiler/rename/RnExpr.hs - compiler/rename/RnSource.hs - compiler/simplCore/SimplCore.hs - compiler/typecheck/TcMatches.hs - ghc/GHCi/UI.hs - ghc/ghc-bin.cabal.in - hadrian/src/Settings/Flavours/Development.hs - libraries/array - libraries/base/Control/Monad.hs - libraries/base/Control/Monad/ST/Lazy/Imp.hs - libraries/base/GHC/Base.hs - libraries/base/GHC/Conc/Sync.hs - libraries/base/GHC/TopHandler.hs - libraries/base/Prelude.hs - libraries/base/System/IO.hs - libraries/base/Text/ParserCombinators/ReadP.hs - libraries/base/Text/ParserCombinators/ReadPrec.hs - libraries/base/base.cabal - libraries/base/changelog.md - libraries/binary - libraries/deepseq - libraries/directory The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1e1331a7a27515cb43e4afcfd26a3050533cca43...25b2de0352b4d4d7ac47d8f54f33a85e25a34672 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1e1331a7a27515cb43e4afcfd26a3050533cca43...25b2de0352b4d4d7ac47d8f54f33a85e25a34672 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 14:10:44 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 10:10:44 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] gitlab-ci: Backport from master Message-ID: <5ca8b364e2d1c_62b39d007ec1909572@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: 20364073 by Ben Gamari at 2019-04-06T14:10:35Z gitlab-ci: Backport from master - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -1,17 +1,31 @@ variables: GIT_SSL_NO_VERIFY: "1" + # Commit of ghc/ci-images repository from which to pull Docker images + DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" stages: - - lint - - build - - full-build - - cleanup # See Note [Cleanup on Windows] + - lint # Source linting + - build # A quick smoke-test to weed out broken commits + - full-build # Build all the things + - cleanup # See Note [Cleanup on Windows] + - packaging # Source distribution, etc. + - hackage # head.hackage testing + - deploy # push documentation + +.only-default: &only-default + only: + - master + - /ghc-[0-9]+\.[0-9]+/ + - merge_requests + - tags ############################################################ # Runner Tags @@ -30,44 +44,106 @@ stages: ############################################################ ghc-linters: + allow_failure: true stage: lint - image: ghcci/linters:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" script: - - | - if [ -n "$CI_MERGE_REQUEST_ID" ]; then - base="$(git merge-base $CI_MERGE_REQUEST_BRANCH_NAME HEAD)" - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA - fi + - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME + - base="$(git merge-base FETCH_HEAD $CI_COMMIT_SHA)" + - "echo Merge base $base" + # - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) + - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) + - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + dependencies: [] + tags: + - lint + only: + refs: + - merge_requests + +# We allow the submodule checker to fail when run on merge requests (to +# accomodate, e.g., haddock changes not yet upstream) but not on `master` or +# Marge jobs. +.lint-submods: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" + script: + - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) + dependencies: [] + tags: + - lint + +lint-submods: + extends: .lint-submods + only: + refs: + - master + - /ghc-[0-9]+\.[0-9]+/ + - wip/marge_bot_batch_merge_job + +lint-submods-mr: + extends: .lint-submods + allow_failure: true + only: + refs: + - merge_requests + +.lint-changelogs: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" dependencies: [] tags: - lint + script: + - | + grep TBA libraries/*/changelog.md && ( + echo "Error: Found \"TBA\"s in changelogs." + exit 1 + ) + +lint-changelogs: + extends: .lint-changelogs + allow_failure: true + only: + refs: + - /ghc-[0-9]+\.[0-9]+/ + +lint-release-changelogs: + extends: .lint-changelogs + only: + - tags + ############################################################ # Validation via Pipelines (hadrian) ############################################################ .validate-hadrian: + <<: *only-default allow_failure: true script: + - cabal update - git clean -xdf && git submodule foreach git clean -xdf - bash .circleci/prepare-system.sh - if [[ -d ./cabal-cache ]]; then cp -R ./.cabal-cache ~/.cabal-cache; fi - ./boot - ./configure $CONFIGURE_ARGS - - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` + - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` binary-dist + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz cache: key: hadrian paths: - cabal-cache + artifacts: + when: always + paths: + - ghc.tar.xz validate-x86_64-linux-deb8-hadrian: extends: .validate-hadrian stage: build - image: ghcci/x86_64-linux-deb8:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" before_script: # workaround for docker permissions - sudo chown ghc:ghc -R . @@ -75,14 +151,17 @@ validate-x86_64-linux-deb8-hadrian: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" tags: - x86_64-linux + ############################################################ # Validation via Pipelines (make) ############################################################ .validate: + <<: *only-default variables: TEST_TYPE: test before_script: @@ -92,22 +171,25 @@ validate-x86_64-linux-deb8-hadrian: - ./configure $CONFIGURE_ARGS - | THREADS=`mk/detect-cpu-count.sh` - make V=0 -j$THREADS + make V=0 -j$THREADS WERROR=-Werror - | make binary-dist TAR_COMP_OPTS="-1" - mv ghc-*.tar.xz ghc.tar.xz - | THREADS=`mk/detect-cpu-count.sh` - make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml + make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml METRICS_FILE=$METRICS_FILE dependencies: [] artifacts: reports: junit: junit.xml expire_in: 2 week paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +################################# +# x86_64-darwin +################################# + validate-x86_64-darwin: extends: .validate stage: full-build @@ -115,17 +197,20 @@ validate-x86_64-darwin: - x86_64-darwin variables: GHC_VERSION: 8.6.3 + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-apple-darwin.tar.xz" MACOSX_DEPLOYMENT_TARGET: "10.7" # Only Sierra and onwards supports clock_gettime. See #12858 ac_cv_func_clock_gettime: "no" LANG: "en_US.UTF-8" CONFIGURE_ARGS: --with-intree-gmp + TEST_ENV: "x86_64-darwin" before_script: - git clean -xdf && git submodule foreach git clean -xdf - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/darwin-init.sh - PATH="`pwd`/toolchain/bin:$PATH" @@ -150,6 +235,12 @@ validate-x86_64-darwin: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" + # Build hyperlinked sources for documentation when building releases + - | + if [[ -n "$CI_COMMIT_TAG" ]]; then + echo "EXTRA_HADDOCK_OPTS += --hyperlinked-source --quickjump" >> mk/build.mk + fi - bash .circleci/prepare-system.sh # workaround for docker permissions @@ -162,22 +253,31 @@ validate-x86_64-darwin: - cabal-cache - toolchain -validate-aarch64-linux-deb9: +################################# +# aarch64-linux-deb9 +################################# + +.build-aarch64-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/aarch64-linux-deb9:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/aarch64-linux-deb9:$DOCKER_REV" allow_failure: true - artifacts: - when: always - expire_in: 2 week + variables: + TEST_ENV: "aarch64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-aarch64-linux-deb9.tar.xz" cache: key: linux-aarch64-deb9 tags: - aarch64-linux +validate-aarch64-linux-deb9: + extends: .build-aarch64-linux-deb9 + artifacts: + when: always + expire_in: 2 week + nightly-aarch64-linux-deb9: - extends: validate-aarch64-linux-deb9 - stage: full-build + extends: .build-aarch64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -186,22 +286,28 @@ nightly-aarch64-linux-deb9: variables: - $NIGHTLY -validate-i386-linux-deb9: +################################# +# i386-linux-deb9 +################################# + +.build-i386-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + image: "registry.gitlab.haskell.org/ghc/ci-images/i386-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "i386-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-i386-deb9-linux.tar.xz" + cache: + key: linux-i386-deb9 + +validate-i386-linux-deb9: + extends: .build-i386-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-i386-deb9 nightly-i386-linux-deb9: - extends: .validate-linux - stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + extends: .build-i386-linux-deb9 variables: TEST_TYPE: slowtest artifacts: @@ -210,22 +316,29 @@ nightly-i386-linux-deb9: only: variables: - $NIGHTLY + +################################# +# x86_64-linux-deb9 +################################# + +.build-x86_64-linux-deb9: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux.tar.xz" cache: - key: linux-i386-deb9 + key: linux-x86_64-deb9 validate-x86_64-linux-deb9: - extends: .validate-linux - stage: build - image: ghcci/x86_64-linux-deb9:0.2 + extends: .build-x86_64-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-x86_64-deb9 nightly-x86_64-linux-deb9: - extends: validate-x86_64-linux-deb9 - stage: build + extends: .build-x86_64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -234,70 +347,93 @@ nightly-x86_64-linux-deb9: variables: - $NIGHTLY +# N.B. Has DEBUG assertions enabled in stage2 +validate-x86_64-linux-deb9-debug: + extends: .build-x86_64-linux-deb9 + stage: build + variables: + BUILD_FLAVOUR: validate + TEST_ENV: "x86_64-linux-deb9-debug" + validate-x86_64-linux-deb9-llvm: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build - allow_failure: true - image: ghcci/x86_64-linux-deb9:0.2 variables: BUILD_FLAVOUR: perf-llvm - cache: - key: linux-x86_64-deb9 - -validate-x86_64-linux-deb8: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-deb8:0.1 - cache: - key: linux-x86_64-deb8 - artifacts: - when: always - expire_in: 2 week - -validate-x86_64-linux-fedora27: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-fedora27:0.1 - cache: - key: linux-x86_64-fedora27 - artifacts: - when: always - expire_in: 2 week + TEST_ENV: "x86_64-linux-deb9-llvm" validate-x86_64-linux-deb9-integer-simple: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build variables: INTEGER_LIBRARY: integer-simple - image: ghcci/x86_64-linux-deb9:0.2 - cache: - key: linux-x86_64-deb9 + TEST_ENV: "x86_64-linux-deb9-integer-simple" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux-integer-simple.tar.xz" nightly-x86_64-linux-deb9-integer-simple: - extends: validate-x86_64-linux-deb9-integer-simple + extends: .build-x86_64-linux-deb9 stage: full-build - artifacts: - expire_in: 2 year variables: + INTEGER_LIBRARY: integer-simple + TEST_ENV: "x86_64-linux-deb9-integer-simple" TEST_TYPE: slowtest + artifacts: + expire_in: 2 year only: variables: - $NIGHTLY validate-x86_64-linux-deb9-unreg: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build variables: CONFIGURE_ARGS: --enable-unregisterised - image: ghcci/x86_64-linux-deb9:0.2 + TEST_ENV: "x86_64-linux-deb9-unreg" + + +################################# +# x86_64-linux-deb8 +################################# + +release-x86_64-linux-deb8: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb8-linux.tar.xz" + only: + - tags cache: - key: linux-x86_64-deb9 + key: linux-x86_64-deb8 + artifacts: + when: always + expire_in: 2 week + + +################################# +# x86_64-linux-fedora27 +################################# + +validate-x86_64-linux-fedora27: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-fedora27:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-fedora27" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-fedora27-linux.tar.xz" + cache: + key: linux-x86_64-fedora27 + artifacts: + when: always + expire_in: 2 week ############################################################ # Validation via Pipelines (Windows) ############################################################ -.validate-windows: +.build-windows: + <<: *only-default before_script: - git clean -xdf - git submodule foreach git clean -xdf @@ -314,72 +450,131 @@ validate-x86_64-linux-deb9-unreg: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/win32-init.sh after_script: - rd /s /q tmp - robocopy /np /nfl /ndl /e "%APPDATA%\cabal" cabal-cache - bash -c 'make clean || true' dependencies: [] + variables: + FORCE_SYMLINKS: 1 + LANG: "en_US.UTF-8" cache: paths: - cabal-cache - ghc-8.6.2 - ghc-tarballs -validate-x86_64-windows-hadrian: - extends: .validate-windows +.build-windows-hadrian: + extends: .build-windows stage: full-build variables: GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" script: - | - set MSYSTEM=MINGW64 python boot bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - mkdir -p _build - cp -R inplace/mingw _build/mingw - - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick" - # FIXME: Bindist disabled due to #16073 - #- bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh binary-dist" + - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick --docs=no-sphinx binary-dist" + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz # FIXME: Testsuite disabled due to #16156. - #- bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows + # - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' tags: - x86_64-windows + artifacts: + when: always + paths: + - ghc.tar.xz -validate-x86_64-windows: - extends: .validate-windows +validate-x86_64-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW64 + cache: + key: x86_64-windows-hadrian + +nightly-i386-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW32 + only: + variables: + - $NIGHTLY + cache: + key: i386-windows-hadrian + +.build-windows-make: + extends: .build-windows stage: full-build # due to #16084 allow_failure: true variables: + BUILD_FLAVOUR: "quick" GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-mingw32.tar.xz" script: - | - set MSYSTEM=MINGW64 python boot - bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - - bash -c "echo include mk/flavours/quick.mk > mk/build.mk" + bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex $CONFIGURE_ARGS' + - bash -c "echo include mk/flavours/${BUILD_FLAVOUR}.mk > mk/build.mk" - bash -c "PATH=`pwd`/toolchain/bin:$PATH make -j`mk/detect-cpu-count.sh`" - - | - bash -c "make binary-dist TAR_COMP_OPTS=-1" - mv ghc-*.tar.xz ghc.tar.xz + - bash -c "PATH=`pwd`/toolchain/bin:$PATH make binary-dist TAR_COMP_OPTS=-1" - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows tags: - x86_64-windows artifacts: when: always + expire_in: 2 week reports: junit: junit.xml paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +validate-x86_64-windows: + extends: .build-windows-make + variables: + MSYSTEM: MINGW64 + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + cache: + key: x86_64-windows + +# Normal Windows validate builds are profiled; that won't do for releases. +release-x86_64-windows: + extends: validate-x86_64-windows + variables: + MSYSTEM: MINGW64 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + only: + - tags + +release-i386-windows: + extends: .build-windows-make + only: + - tags + variables: + MSYSTEM: MINGW32 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +nightly-i386-windows: + extends: .build-windows-make + only: + variables: + - $NIGHTLY + variables: + MSYSTEM: MINGW32 + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +############################################################ +# Cleanup +############################################################ + # Note [Cleaning up after shell executor] # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # @@ -393,6 +588,7 @@ validate-x86_64-windows: # See Note [Cleanup after shell executor] cleanup-windows: + <<: *only-default stage: cleanup tags: - x86_64-windows @@ -415,10 +611,12 @@ cleanup-windows: # See Note [Cleanup after shell executor] cleanup-darwin: + <<: *only-default stage: cleanup tags: - x86_64-darwin when: always + dependencies: [] before_script: - echo "Time to clean up" script: @@ -430,3 +628,106 @@ cleanup-darwin: - rm -Rf $BUILD_DIR/* - exit 0 +############################################################ +# Packaging +############################################################ + +doc-tarball: + <<: *only-default + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + dependencies: + - validate-x86_64-linux-deb9 + - validate-x86_64-windows + artifacts: + paths: + - haddock.html.tar.xz + - libraries.html.tar.xz + - users_guide.html.tar.xz + - index.html + - "*.pdf" + script: + - rm -Rf docs + - bash -ex distrib/mkDocs/mkDocs ghc-x86_64-deb9-linux.tar.xz ghc-x86_64-mingw32.tar.xz + - ls -lh + - mv docs/*.tar.xz docs/index.html . + +source-tarball: + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + only: + - tags + artifacts: + paths: + - ghc-*.tar.xz + - version + script: + - mk/get-win32-tarballs.sh download all + - ./boot + - ./configure + - make sdist + - mv sdistprep/*.xz . + - make show VALUE=version > version + + +############################################################ +# Testing via head.hackage +############################################################ + +# Triggering jobs in the ghc/head.hackage project requires that we have a job +# token for that repository. Furthermore the head.hackage CI job must have +# access to an unprivileged access token with the ability to query the ghc/ghc +# project such that it can find the job ID of the fedora27 job for the current +# pipeline. + +.hackage: + <<: *only-default + stage: hackage + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + dependencies: [] + variables: + HEAD_HACKAGE_PROJECT_ID: "78" + script: + - bash .gitlab/start-head.hackage.sh + +hackage: + extends: .hackage + when: manual + +hackage-label: + extends: .hackage + only: + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*user-facing.*/ + +nightly-hackage: + extends: .hackage + only: + variables: + - $NIGHTLY + +pages: + stage: deploy + dependencies: + - doc-tarball + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + script: + - mkdir -p public/doc + - tar -xf haddock.html.tar.xz -C public/doc + - tar -xf libraries.html.tar.xz -C public/doc + - tar -xf users_guide.html.tar.xz -C public/doc + - cp -f index.html public/doc + only: + - master + artifacts: + paths: + - public + View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/20364073caef323eea76332ee0b2c0f4cd539952 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/20364073caef323eea76332ee0b2c0f4cd539952 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:08:10 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Sat, 06 Apr 2019 12:08:10 -0400 Subject: [Git][ghc/ghc][wip/D5082] Make tablesNextToCode a proper dynamic flag (#15548) Message-ID: <5ca8ceea53bc_62b33fa2c50c025419298c2@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 4a4467c3 by Joachim Breitner at 2019-04-06T14:53:54Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. GHCI_TABLES_NEXT_TO_CODE is also removed entirely. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. That said, GHC still is hardcoded to define TABLES_NEXT_TO_CODE based on that default value. This is bad, but neccessary until the remaining uses of TABLES_NEXT_TO_CODE get it from make/Hadrian. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 9 changed files: - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DynFlags.hs - compiler/utils/Util.hs - hadrian/src/Settings/Packages.hs - libraries/ghci/GHCi/InfoTable.hsc - libraries/ghci/GHCi/Message.hs - libraries/ghci/GHCi/Run.hs - testsuite/tests/codeGen/should_compile/jmp_tbl.hs Changes: ===================================== compiler/ghc.mk ===================================== @@ -330,14 +330,6 @@ endif ifeq "$(GhcWithInterpreter)" "YES" compiler_stage2_CONFIGURE_OPTS += --flags=ghci -ifeq "$(GhcEnableTablesNextToCode) $(GhcUnregisterised)" "YES NO" -# Should GHCI be building info tables in the TABLES_NEXT_TO_CODE style -# or not? -# XXX This should logically be a CPP option, but there doesn't seem to -# be a flag for that -compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DGHCI_TABLES_NEXT_TO_CODE -endif - # Should the debugger commands be enabled? ifeq "$(GhciWithDebugger)" "YES" compiler_stage2_CONFIGURE_OPTS += --ghc-option=-DDEBUGGER ===================================== compiler/ghci/ByteCodeItbls.hs ===================================== @@ -71,6 +71,8 @@ make_constr_itbls hsc_env cons = descr = dataConIdentity dcon - r <- iservCmd hsc_env (MkConInfoTable ptrs' nptrs_really + tables_next_to_code = tablesNextToCode dflags + + r <- iservCmd hsc_env (MkConInfoTable tables_next_to_code ptrs' nptrs_really conNo (tagForCon dflags dcon) descr) return (getName dcon, ItblPtr r) ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,7 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, + mkTablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, @@ -876,6 +876,10 @@ data DynFlags = DynFlags { integerLibrary :: IntegerLibrary, -- ^ IntegerGMP or IntegerSimple. Set at configure time, but may be overriden -- by GHC-API users. See Note [The integer library] in PrelNames + tablesNextToCode :: Bool, + -- ^ Determines whether we will be compiling info tables that reside just + -- before the entry code, or with an indirection to the entry code. See + -- TABLES_NEXT_TO_CODE in includes/rts/storage/InfoTables.h. llvmTargets :: LlvmTargets, llvmPasses :: LlvmPasses, verbosity :: Int, -- ^ Verbosity level: see Note [Verbosity levels] @@ -1614,10 +1618,6 @@ defaultObjectTarget platform | cGhcWithNativeCodeGen == "YES" = HscAsm | otherwise = HscLlvm -tablesNextToCode :: DynFlags -> Bool -tablesNextToCode dflags - = mkTablesNextToCode (platformUnregisterised (targetPlatform dflags)) - -- Determines whether we will be compiling -- info tables that reside just before the entry code, or with an -- indirection to the entry code. See TABLES_NEXT_TO_CODE in @@ -1874,6 +1874,9 @@ defaultDynFlags mySettings (myLlvmTargets, myLlvmPasses) = ghcLink = LinkBinary, hscTarget = defaultHscTarget (sTargetPlatform mySettings), integerLibrary = cIntegerLibraryType, + tablesNextToCode = + not (platformUnregisterised $ sTargetPlatform mySettings) && + cGhcEnableTablesNextToCode == "YES", verbosity = 0, optLevel = 0, debugLevel = 0, ===================================== compiler/utils/Util.hs ===================================== @@ -11,7 +11,6 @@ module Util ( -- * Flags dependent on the compiler build ghciSupported, debugIsOn, ncgDebugIsOn, - ghciTablesNextToCode, isWindowsHost, isDarwinHost, -- * General list processing @@ -208,13 +207,6 @@ ncgDebugIsOn = True ncgDebugIsOn = False #endif -ghciTablesNextToCode :: Bool -#if defined(GHCI_TABLES_NEXT_TO_CODE) -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - isWindowsHost :: Bool #if defined(mingw32_HOST_OS) isWindowsHost = True ===================================== hadrian/src/Settings/Packages.hs ===================================== @@ -57,10 +57,6 @@ packageArgs = do , notM ghcWithSMP ? arg "--ghc-option=-optc-DNOSMP" , (any (wayUnit Threaded) rtsWays) ? notStage0 ? arg "--ghc-option=-optc-DTHREADED_RTS" - , ghcWithInterpreter ? - ghcEnableTablesNextToCode ? - notM (flag GhcUnregisterised) ? - notStage0 ? arg "--ghc-option=-DGHCI_TABLES_NEXT_TO_CODE" , ghcWithInterpreter ? ghciWithDebugger <$> flavour ? notStage0 ? arg "--ghc-option=-DDEBUGGER" ===================================== libraries/ghci/GHCi/InfoTable.hsc ===================================== @@ -26,19 +26,13 @@ import Data.ByteString (ByteString) import qualified Data.ByteString as BS #endif -ghciTablesNextToCode :: Bool -#ifdef TABLES_NEXT_TO_CODE -ghciTablesNextToCode = True -#else -ghciTablesNextToCode = False -#endif - #ifdef GHCI /* To end */ -- NOTE: Must return a pointer acceptable for use in the header of a closure. -- If tables_next_to_code is enabled, then it must point the the 'code' field. -- Otherwise, it should point to the start of the StgInfoTable. mkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -47,23 +41,23 @@ mkConInfoTable -- resulting info table is allocated with allocateExec(), and -- should be freed with freeExec(). -mkConInfoTable ptr_words nonptr_words tag ptrtag con_desc = - castFunPtrToPtr <$> newExecConItbl itbl con_desc - where - entry_addr = interpConstrEntry !! ptrtag - code' = mkJumpToAddr entry_addr +mkConInfoTable tables_next_to_code ptr_words nonptr_words tag ptrtag con_desc = do + let entry_addr = interpConstrEntry !! ptrtag + code' <- if tables_next_to_code + then Just <$> mkJumpToAddr entry_addr + else pure Nothing + let itbl = StgInfoTable { - entry = if ghciTablesNextToCode + entry = if tables_next_to_code then Nothing else Just entry_addr, ptrs = fromIntegral ptr_words, nptrs = fromIntegral nonptr_words, tipe = CONSTR, srtlen = fromIntegral tag, - code = if ghciTablesNextToCode - then Just code' - else Nothing + code = code' } + castFunPtrToPtr <$> newExecConItbl tables_next_to_code itbl con_desc -- ----------------------------------------------------------------------------- @@ -81,39 +75,46 @@ data Arch = ArchSPARC | ArchARM64 | ArchPPC64 | ArchPPC64LE - | ArchUnknown deriving Show -platform :: Arch -platform = +mkJumpToAddr :: MonadFail m => EntryFunPtr-> m ItblCodes +mkJumpToAddr ptr = do + arch <- case mArch of + Just a -> pure a + Nothing -> + -- This code must not be called. You either need to add your + -- architecture as a distinct case to 'Arch' and 'mArch', or use + -- non-TABLES_NEXT_TO_CODE mode. + fail "mkJumpToAddr: Unknown obscure arch is not supported with TABLES_NEXT_TO_CODE" + pure $ mkJumpToAddr' arch ptr + +-- | 'Just' if it's a known OS, or 'Nothing' otherwise. +mArch :: Maybe Arch +mArch = #if defined(sparc_HOST_ARCH) - ArchSPARC + Just ArchSPARC #elif defined(powerpc_HOST_ARCH) - ArchPPC + Just ArchPPC #elif defined(i386_HOST_ARCH) - ArchX86 + Just ArchX86 #elif defined(x86_64_HOST_ARCH) - ArchX86_64 + Just ArchX86_64 #elif defined(alpha_HOST_ARCH) - ArchAlpha + Just ArchAlpha #elif defined(arm_HOST_ARCH) - ArchARM + Just ArchARM #elif defined(aarch64_HOST_ARCH) - ArchARM64 + Just ArchARM64 #elif defined(powerpc64_HOST_ARCH) - ArchPPC64 + Just ArchPPC64 #elif defined(powerpc64le_HOST_ARCH) - ArchPPC64LE + Just ArchPPC64LE #else -# if defined(TABLES_NEXT_TO_CODE) -# error Unimplemented architecture -# else - ArchUnknown -# endif + Nothing #endif -mkJumpToAddr :: EntryFunPtr -> ItblCodes -mkJumpToAddr a = case platform of +mkJumpToAddr' :: Arch -> EntryFunPtr -> ItblCodes +mkJumpToAddr' platform a = case platform of ArchSPARC -> -- After some consideration, we'll try this, where -- 0x55555555 stands in for the address to jump to. @@ -273,11 +274,6 @@ mkJumpToAddr a = case platform of 0x618C0000 .|. lo16 w32, 0x7D8903A6, 0x4E800420 ] - -- This code must not be called. You either need to - -- add your architecture as a distinct case or - -- use non-TABLES_NEXT_TO_CODE mode - ArchUnknown -> error "mkJumpToAddr: ArchUnknown is unsupported" - byte0 :: (Integral w) => w -> Word8 byte0 w = fromIntegral w @@ -321,38 +317,40 @@ data StgConInfoTable = StgConInfoTable { pokeConItbl - :: Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable + :: Bool -> Ptr StgConInfoTable -> Ptr StgConInfoTable -> StgConInfoTable -> IO () -pokeConItbl wr_ptr _ex_ptr itbl = do -#if defined(TABLES_NEXT_TO_CODE) - -- Write the offset to the con_desc from the end of the standard InfoTable - -- at the first byte. - let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) - (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset -#else - -- Write the con_desc address after the end of the info table. - -- Use itblSize because CPP will not pick up PROFILING when calculating - -- the offset. - pokeByteOff wr_ptr itblSize (conDesc itbl) -#endif +pokeConItbl tables_next_to_code wr_ptr _ex_ptr itbl = do + if tables_next_to_code + then do + -- Write the offset to the con_desc from the end of the standard InfoTable + -- at the first byte. + let con_desc_offset = conDesc itbl `minusPtr` (_ex_ptr `plusPtr` conInfoTableSizeB) + (#poke StgConInfoTable, con_desc) wr_ptr con_desc_offset + else do + -- Write the con_desc address after the end of the info table. + -- Use itblSize because CPP will not pick up PROFILING when calculating + -- the offset. + pokeByteOff wr_ptr itblSize (conDesc itbl) pokeItbl (wr_ptr `plusPtr` (#offset StgConInfoTable, i)) (infoTable itbl) -sizeOfEntryCode :: Int -sizeOfEntryCode - | not ghciTablesNextToCode = 0 - | otherwise = - case mkJumpToAddr undefined of +sizeOfEntryCode :: MonadFail m => Bool -> m Int +sizeOfEntryCode tables_next_to_code + | not tables_next_to_code = pure 0 + | otherwise = do + code' <- mkJumpToAddr undefined + pure $ case code' of Left xs -> sizeOf (head xs) * length xs Right xs -> sizeOf (head xs) * length xs -- Note: Must return proper pointer for use in a closure -newExecConItbl :: StgInfoTable -> ByteString -> IO (FunPtr ()) -newExecConItbl obj con_desc +newExecConItbl :: Bool -> StgInfoTable -> ByteString -> IO (FunPtr ()) +newExecConItbl tables_next_to_code obj con_desc = alloca $ \pcode -> do + sz0 <- sizeOfEntryCode tables_next_to_code let lcon_desc = BS.length con_desc + 1{- null terminator -} -- SCARY -- This size represents the number of bytes in an StgConInfoTable. - sz = fromIntegral (conInfoTableSizeB + sizeOfEntryCode) + sz = fromIntegral $ conInfoTableSizeB + sz0 -- Note: we need to allocate the conDesc string next to the info -- table, because on a 64-bit platform we reference this string -- with a 32-bit offset relative to the info table, so if we @@ -361,17 +359,13 @@ newExecConItbl obj con_desc ex_ptr <- peek pcode let cinfo = StgConInfoTable { conDesc = ex_ptr `plusPtr` fromIntegral sz , infoTable = obj } - pokeConItbl wr_ptr ex_ptr cinfo + pokeConItbl tables_next_to_code wr_ptr ex_ptr cinfo BS.useAsCStringLen con_desc $ \(src, len) -> copyBytes (castPtr wr_ptr `plusPtr` fromIntegral sz) src len - let null_off = fromIntegral sz + fromIntegral (BS.length con_desc) - poke (castPtr wr_ptr `plusPtr` null_off) (0 :: Word8) _flushExec sz ex_ptr -- Cache flush (if needed) -#if defined(TABLES_NEXT_TO_CODE) - return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) -#else - return (castPtrToFunPtr ex_ptr) -#endif + if tables_next_to_code + then return (castPtrToFunPtr (ex_ptr `plusPtr` conInfoTableSizeB)) + else return (castPtrToFunPtr ex_ptr) foreign import ccall unsafe "allocateExec" _allocateExec :: CUInt -> Ptr (Ptr a) -> IO (Ptr a) ===================================== libraries/ghci/GHCi/Message.hs ===================================== @@ -104,7 +104,8 @@ data Message a where -- | Create an info table for a constructor MkConInfoTable - :: Int -- ptr words + :: Bool -- TABLES_NEXT_TO_CODE + -> Int -- ptr words -> Int -- non-ptr words -> Int -- constr tag -> Int -- pointer tag @@ -468,7 +469,7 @@ getMessage = do 15 -> Msg <$> MallocStrings <$> get 16 -> Msg <$> (PrepFFI <$> get <*> get <*> get) 17 -> Msg <$> FreeFFI <$> get - 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get) + 18 -> Msg <$> (MkConInfoTable <$> get <*> get <*> get <*> get <*> get <*> get) 19 -> Msg <$> (EvalStmt <$> get <*> get) 20 -> Msg <$> (ResumeStmt <$> get <*> get) 21 -> Msg <$> (AbandonStmt <$> get) @@ -510,7 +511,7 @@ putMessage m = case m of MallocStrings bss -> putWord8 15 >> put bss PrepFFI conv args res -> putWord8 16 >> put conv >> put args >> put res FreeFFI p -> putWord8 17 >> put p - MkConInfoTable p n t pt d -> putWord8 18 >> put p >> put n >> put t >> put pt >> put d + MkConInfoTable tc p n t pt d -> putWord8 18 >> put tc >> put p >> put n >> put t >> put pt >> put d EvalStmt opts val -> putWord8 19 >> put opts >> put val ResumeStmt opts val -> putWord8 20 >> put opts >> put val AbandonStmt val -> putWord8 21 >> put val ===================================== libraries/ghci/GHCi/Run.hs ===================================== @@ -89,8 +89,8 @@ run m = case m of MallocStrings bss -> mapM mkString0 bss PrepFFI conv args res -> toRemotePtr <$> prepForeignCall conv args res FreeFFI p -> freeForeignCallInfo (fromRemotePtr p) - MkConInfoTable ptrs nptrs tag ptrtag desc -> - toRemotePtr <$> mkConInfoTable ptrs nptrs tag ptrtag desc + MkConInfoTable tc ptrs nptrs tag ptrtag desc -> + toRemotePtr <$> mkConInfoTable tc ptrs nptrs tag ptrtag desc StartTH -> startTH GetClosure ref -> do clos <- getClosureData =<< localRef ref ===================================== testsuite/tests/codeGen/should_compile/jmp_tbl.hs ===================================== @@ -4,7 +4,7 @@ This funny module was reduced from a failing build of stage2 using the new code generator and the linear register allocator, with this bug: -"inplace/bin/ghc-stage1" -fPIC -dynamic -H32m -O -Wall -H64m -O0 -package-name ghc-7.1.20110414 -hide-all-packages -i -icompiler/basicTypes -icompiler/cmm -icompiler/codeGen -icompiler/coreSyn -icompiler/deSugar -icompiler/ghci -icompiler/hsSyn -icompiler/iface -icompiler/llvmGen -icompiler/main -icompiler/nativeGen -icompiler/parser -icompiler/prelude -icompiler/profiling -icompiler/rename -icompiler/simplCore -icompiler/simplStg -icompiler/specialise -icompiler/stgSyn -icompiler/stranal -icompiler/typecheck -icompiler/types -icompiler/utils -icompiler/vectorise -icompiler/stage2/build -icompiler/stage2/build/autogen -Icompiler/stage2/build -Icompiler/stage2/build/autogen -Icompiler/../libffi/build/include -Icompiler/stage2 -Icompiler/../libraries/base/cbits -Icompiler/../libraries/base/include -Icompiler/. -Icompiler/parser -Icompiler/utils -optP-DGHCI -optP-include -optPcompiler/stage2/build/autogen/cabal_macros.h -package Cabal-1.11.0 -package array-0.3.0.2 -package base-4.3.1.0 -package ghc-boot-0.0.0.0 -package bytestring-0.9.1.10 -package containers-0.4.0.0 -package directory-1.1.0.0 -package filepath-1.2.0.0 -package hoopl-3.8.7.0 -package hpc-0.5.0.6 -package old-time-1.0.0.6 -package process-1.0.1.4 -package template-haskell-2.5.0.0 -package unix-2.4.1.0 -Wall -fno-warn-name-shadowing -fno-warn-orphans -XHaskell98 -XNondecreasingIndentation -XCPP -XMagicHash -XUnboxedTuples -XPatternGuards -XForeignFunctionInterface -XEmptyDataDecls -XTypeSynonymInstances -XMultiParamTypeClasses -XFlexibleInstances -XRank2Types -XScopedTypeVariables -XDeriveDataTypeable -DGHCI_TABLES_NEXT_TO_CODE -DSTAGE=2 -O2 -O -DGHC_DEFAULT_NEW_CODEGEN -no-user-package-db -rtsopts -odir compiler/stage2/build -hidir compiler/stage2/build -stubdir compiler/stage2/build -hisuf dyn_hi -osuf dyn_o -hcsuf dyn_hc -c compiler/main/DriverPipeline.hs -o compiler/stage2/build/DriverPipeline.dyn_o -fforce-recomp -dno-debug-output -fno-warn-unused-binds +"inplace/bin/ghc-stage1" -fPIC -dynamic -H32m -O -Wall -H64m -O0 -package-name ghc-7.1.20110414 -hide-all-packages -i -icompiler/basicTypes -icompiler/cmm -icompiler/codeGen -icompiler/coreSyn -icompiler/deSugar -icompiler/ghci -icompiler/hsSyn -icompiler/iface -icompiler/llvmGen -icompiler/main -icompiler/nativeGen -icompiler/parser -icompiler/prelude -icompiler/profiling -icompiler/rename -icompiler/simplCore -icompiler/simplStg -icompiler/specialise -icompiler/stgSyn -icompiler/stranal -icompiler/typecheck -icompiler/types -icompiler/utils -icompiler/vectorise -icompiler/stage2/build -icompiler/stage2/build/autogen -Icompiler/stage2/build -Icompiler/stage2/build/autogen -Icompiler/../libffi/build/include -Icompiler/stage2 -Icompiler/../libraries/base/cbits -Icompiler/../libraries/base/include -Icompiler/. -Icompiler/parser -Icompiler/utils -optP-DGHCI -optP-include -optPcompiler/stage2/build/autogen/cabal_macros.h -package Cabal-1.11.0 -package array-0.3.0.2 -package base-4.3.1.0 -package ghc-boot-0.0.0.0 -package bytestring-0.9.1.10 -package containers-0.4.0.0 -package directory-1.1.0.0 -package filepath-1.2.0.0 -package hoopl-3.8.7.0 -package hpc-0.5.0.6 -package old-time-1.0.0.6 -package process-1.0.1.4 -package template-haskell-2.5.0.0 -package unix-2.4.1.0 -Wall -fno-warn-name-shadowing -fno-warn-orphans -XHaskell98 -XNondecreasingIndentation -XCPP -XMagicHash -XUnboxedTuples -XPatternGuards -XForeignFunctionInterface -XEmptyDataDecls -XTypeSynonymInstances -XMultiParamTypeClasses -XFlexibleInstances -XRank2Types -XScopedTypeVariables -XDeriveDataTypeable -DSTAGE=2 -O2 -O -DGHC_DEFAULT_NEW_CODEGEN -no-user-package-db -rtsopts -odir compiler/stage2/build -hidir compiler/stage2/build -stubdir compiler/stage2/build -hisuf dyn_hi -osuf dyn_o -hcsuf dyn_hc -c compiler/main/DriverPipeline.hs -o compiler/stage2/build/DriverPipeline.dyn_o -fforce-recomp -dno-debug-output -fno-warn-unused-binds ghc-stage1: panic! (the 'impossible' happened) (GHC version 7.1.20110414 for x86_64-unknown-linux): View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4a4467c3ffcbc931731044453df6a9f2db53406f -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4a4467c3ffcbc931731044453df6a9f2db53406f You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:12:31 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:12:31 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 2 commits: testsuite: Skip T7919 in ghci way Message-ID: <5ca8cfef94867_62b33fa2eb42933419307ee@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 4cd2f07c by Ben Gamari at 2019-04-06T16:12:23Z testsuite: Skip T7919 in ghci way It times out pretty reliably. It's not clear that much is gained by running this test in the ghci way anyways. - - - - - 1df9027e by Ben Gamari at 2019-04-06T16:12:23Z testsuite: Fix fragile_for test modifier - - - - - 2 changed files: - testsuite/driver/testlib.py - testsuite/tests/rts/all.T Changes: ===================================== testsuite/driver/testlib.py ===================================== @@ -258,14 +258,14 @@ def fragile( bug ): return helper -def fragile_for( name, opts, bug, ways ): +def fragile_for( bug, ways ): """ Indicates that the test should be skipped due to fragility in the given test ways as documented in the given ticket. """ def helper( name, opts, bug=bug, ways=ways ): record_broken(name, opts, bug) - opts.omit_ways = ways + opts.omit_ways += ways return helper @@ -275,7 +275,7 @@ def omit_ways( ways ): return lambda name, opts, w=ways: _omit_ways( name, opts, w ) def _omit_ways( name, opts, ways ): - opts.omit_ways = ways + opts.omit_ways += ways # ----- ===================================== testsuite/tests/rts/all.T ===================================== @@ -272,7 +272,8 @@ test('T7815', [ multi_cpu_race, test('ffishutdown', [ignore_stderr, only_ways(['threaded1','threaded2'])], compile_and_run, ['']) -test('T7919', [when(fast(), skip), omit_ways(prof_ways)], compile_and_run, +# Times out in ghci way +test('T7919', [when(fast(), skip), omit_ways(['ghci'] + prof_ways)], compile_and_run, [config.ghc_th_way_flags]) test('T8035', normal, compile_and_run, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/4ee82fe1c4fbdef48b7abc5c830223d9e7d38e52...1df9027ede31ff1e591c5002914dd4f7f01ba72c -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/4ee82fe1c4fbdef48b7abc5c830223d9e7d38e52...1df9027ede31ff1e591c5002914dd4f7f01ba72c You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:16:40 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:16:40 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] 3 commits: Add werror function to Flavour.hs Message-ID: <5ca8d0e8cbfa1_62b3ca9d074193184e@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: a80de887 by Matthew Pickering at 2019-04-06T16:16:29Z Add werror function to Flavour.hs This function makes it easy to turn on `-Werror` in the correct manner to mimic how CI turns on -Werror. (cherry picked from commit 8dcd00cef7782c64b5484b106f4fd77c8c87e40a) - - - - - e4b537b2 by Alp Mestanogullari at 2019-04-06T16:16:29Z Hadrian: introduce ways to skip some documentation targets The initial motivation for this is to have a chance to run the binary distribution rules in our Windows CI without having to install sphinx-build and xelatex there, while retaining the ability to generate haddocks. I just ended up extending this idea a little bit so as to have control over whether we build haddocks, (sphinx) HTML manuals, (sphinx) PDF manuals and (sphinx) manpages. (cherry picked from commit 8442103aa575dc1cd25cb3231e729c6365dc1b5c) - - - - - 04bfc6e9 by Ben Gamari at 2019-04-06T16:16:29Z gitlab-ci: Backport from master - - - - - 8 changed files: - .gitlab-ci.yml - hadrian/doc/make.md - hadrian/doc/user-settings.md - hadrian/src/CommandLine.hs - hadrian/src/Flavour.hs - hadrian/src/Rules/Documentation.hs - hadrian/src/Settings/Default.hs - hadrian/src/UserSettings.hs Changes: ===================================== .gitlab-ci.yml ===================================== @@ -1,17 +1,31 @@ variables: GIT_SSL_NO_VERIFY: "1" + # Commit of ghc/ci-images repository from which to pull Docker images + DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" stages: - - lint - - build - - full-build - - cleanup # See Note [Cleanup on Windows] + - lint # Source linting + - build # A quick smoke-test to weed out broken commits + - full-build # Build all the things + - cleanup # See Note [Cleanup on Windows] + - packaging # Source distribution, etc. + - hackage # head.hackage testing + - deploy # push documentation + +.only-default: &only-default + only: + - master + - /ghc-[0-9]+\.[0-9]+/ + - merge_requests + - tags ############################################################ # Runner Tags @@ -30,44 +44,106 @@ stages: ############################################################ ghc-linters: + allow_failure: true stage: lint - image: ghcci/linters:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" script: - - | - if [ -n "$CI_MERGE_REQUEST_ID" ]; then - base="$(git merge-base $CI_MERGE_REQUEST_BRANCH_NAME HEAD)" - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA - fi + - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME + - base="$(git merge-base FETCH_HEAD $CI_COMMIT_SHA)" + - "echo Merge base $base" + # - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) + - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) + - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + dependencies: [] + tags: + - lint + only: + refs: + - merge_requests + +# We allow the submodule checker to fail when run on merge requests (to +# accomodate, e.g., haddock changes not yet upstream) but not on `master` or +# Marge jobs. +.lint-submods: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" + script: + - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) + dependencies: [] + tags: + - lint + +lint-submods: + extends: .lint-submods + only: + refs: + - master + - /ghc-[0-9]+\.[0-9]+/ + - wip/marge_bot_batch_merge_job + +lint-submods-mr: + extends: .lint-submods + allow_failure: true + only: + refs: + - merge_requests + +.lint-changelogs: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" dependencies: [] tags: - lint + script: + - | + grep TBA libraries/*/changelog.md && ( + echo "Error: Found \"TBA\"s in changelogs." + exit 1 + ) + +lint-changelogs: + extends: .lint-changelogs + allow_failure: true + only: + refs: + - /ghc-[0-9]+\.[0-9]+/ + +lint-release-changelogs: + extends: .lint-changelogs + only: + - tags + ############################################################ # Validation via Pipelines (hadrian) ############################################################ .validate-hadrian: + <<: *only-default allow_failure: true script: + - cabal update - git clean -xdf && git submodule foreach git clean -xdf - bash .circleci/prepare-system.sh - if [[ -d ./cabal-cache ]]; then cp -R ./.cabal-cache ~/.cabal-cache; fi - ./boot - ./configure $CONFIGURE_ARGS - - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` + - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --docs=no-sphinx binary-dist + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz cache: key: hadrian paths: - cabal-cache + artifacts: + when: always + paths: + - ghc.tar.xz validate-x86_64-linux-deb8-hadrian: extends: .validate-hadrian stage: build - image: ghcci/x86_64-linux-deb8:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" before_script: # workaround for docker permissions - sudo chown ghc:ghc -R . @@ -75,14 +151,17 @@ validate-x86_64-linux-deb8-hadrian: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" tags: - x86_64-linux + ############################################################ # Validation via Pipelines (make) ############################################################ .validate: + <<: *only-default variables: TEST_TYPE: test before_script: @@ -92,22 +171,25 @@ validate-x86_64-linux-deb8-hadrian: - ./configure $CONFIGURE_ARGS - | THREADS=`mk/detect-cpu-count.sh` - make V=0 -j$THREADS + make V=0 -j$THREADS WERROR=-Werror - | make binary-dist TAR_COMP_OPTS="-1" - mv ghc-*.tar.xz ghc.tar.xz - | THREADS=`mk/detect-cpu-count.sh` - make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml + make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml METRICS_FILE=$METRICS_FILE dependencies: [] artifacts: reports: junit: junit.xml expire_in: 2 week paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +################################# +# x86_64-darwin +################################# + validate-x86_64-darwin: extends: .validate stage: full-build @@ -115,17 +197,20 @@ validate-x86_64-darwin: - x86_64-darwin variables: GHC_VERSION: 8.6.3 + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-apple-darwin.tar.xz" MACOSX_DEPLOYMENT_TARGET: "10.7" # Only Sierra and onwards supports clock_gettime. See #12858 ac_cv_func_clock_gettime: "no" LANG: "en_US.UTF-8" CONFIGURE_ARGS: --with-intree-gmp + TEST_ENV: "x86_64-darwin" before_script: - git clean -xdf && git submodule foreach git clean -xdf - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/darwin-init.sh - PATH="`pwd`/toolchain/bin:$PATH" @@ -150,6 +235,12 @@ validate-x86_64-darwin: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" + # Build hyperlinked sources for documentation when building releases + - | + if [[ -n "$CI_COMMIT_TAG" ]]; then + echo "EXTRA_HADDOCK_OPTS += --hyperlinked-source --quickjump" >> mk/build.mk + fi - bash .circleci/prepare-system.sh # workaround for docker permissions @@ -162,22 +253,31 @@ validate-x86_64-darwin: - cabal-cache - toolchain -validate-aarch64-linux-deb9: +################################# +# aarch64-linux-deb9 +################################# + +.build-aarch64-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/aarch64-linux-deb9:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/aarch64-linux-deb9:$DOCKER_REV" allow_failure: true - artifacts: - when: always - expire_in: 2 week + variables: + TEST_ENV: "aarch64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-aarch64-linux-deb9.tar.xz" cache: key: linux-aarch64-deb9 tags: - aarch64-linux +validate-aarch64-linux-deb9: + extends: .build-aarch64-linux-deb9 + artifacts: + when: always + expire_in: 2 week + nightly-aarch64-linux-deb9: - extends: validate-aarch64-linux-deb9 - stage: full-build + extends: .build-aarch64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -186,22 +286,28 @@ nightly-aarch64-linux-deb9: variables: - $NIGHTLY -validate-i386-linux-deb9: +################################# +# i386-linux-deb9 +################################# + +.build-i386-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + image: "registry.gitlab.haskell.org/ghc/ci-images/i386-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "i386-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-i386-deb9-linux.tar.xz" + cache: + key: linux-i386-deb9 + +validate-i386-linux-deb9: + extends: .build-i386-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-i386-deb9 nightly-i386-linux-deb9: - extends: .validate-linux - stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + extends: .build-i386-linux-deb9 variables: TEST_TYPE: slowtest artifacts: @@ -210,22 +316,29 @@ nightly-i386-linux-deb9: only: variables: - $NIGHTLY + +################################# +# x86_64-linux-deb9 +################################# + +.build-x86_64-linux-deb9: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux.tar.xz" cache: - key: linux-i386-deb9 + key: linux-x86_64-deb9 validate-x86_64-linux-deb9: - extends: .validate-linux - stage: build - image: ghcci/x86_64-linux-deb9:0.2 + extends: .build-x86_64-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-x86_64-deb9 nightly-x86_64-linux-deb9: - extends: validate-x86_64-linux-deb9 - stage: build + extends: .build-x86_64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -234,70 +347,93 @@ nightly-x86_64-linux-deb9: variables: - $NIGHTLY +# N.B. Has DEBUG assertions enabled in stage2 +validate-x86_64-linux-deb9-debug: + extends: .build-x86_64-linux-deb9 + stage: build + variables: + BUILD_FLAVOUR: validate + TEST_ENV: "x86_64-linux-deb9-debug" + validate-x86_64-linux-deb9-llvm: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build - allow_failure: true - image: ghcci/x86_64-linux-deb9:0.2 variables: BUILD_FLAVOUR: perf-llvm - cache: - key: linux-x86_64-deb9 - -validate-x86_64-linux-deb8: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-deb8:0.1 - cache: - key: linux-x86_64-deb8 - artifacts: - when: always - expire_in: 2 week - -validate-x86_64-linux-fedora27: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-fedora27:0.1 - cache: - key: linux-x86_64-fedora27 - artifacts: - when: always - expire_in: 2 week + TEST_ENV: "x86_64-linux-deb9-llvm" validate-x86_64-linux-deb9-integer-simple: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build variables: INTEGER_LIBRARY: integer-simple - image: ghcci/x86_64-linux-deb9:0.2 - cache: - key: linux-x86_64-deb9 + TEST_ENV: "x86_64-linux-deb9-integer-simple" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux-integer-simple.tar.xz" nightly-x86_64-linux-deb9-integer-simple: - extends: validate-x86_64-linux-deb9-integer-simple + extends: .build-x86_64-linux-deb9 stage: full-build - artifacts: - expire_in: 2 year variables: + INTEGER_LIBRARY: integer-simple + TEST_ENV: "x86_64-linux-deb9-integer-simple" TEST_TYPE: slowtest + artifacts: + expire_in: 2 year only: variables: - $NIGHTLY validate-x86_64-linux-deb9-unreg: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build variables: CONFIGURE_ARGS: --enable-unregisterised - image: ghcci/x86_64-linux-deb9:0.2 + TEST_ENV: "x86_64-linux-deb9-unreg" + + +################################# +# x86_64-linux-deb8 +################################# + +release-x86_64-linux-deb8: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb8-linux.tar.xz" + only: + - tags cache: - key: linux-x86_64-deb9 + key: linux-x86_64-deb8 + artifacts: + when: always + expire_in: 2 week + + +################################# +# x86_64-linux-fedora27 +################################# + +validate-x86_64-linux-fedora27: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-fedora27:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-fedora27" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-fedora27-linux.tar.xz" + cache: + key: linux-x86_64-fedora27 + artifacts: + when: always + expire_in: 2 week ############################################################ # Validation via Pipelines (Windows) ############################################################ -.validate-windows: +.build-windows: + <<: *only-default before_script: - git clean -xdf - git submodule foreach git clean -xdf @@ -314,72 +450,131 @@ validate-x86_64-linux-deb9-unreg: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/win32-init.sh after_script: - rd /s /q tmp - robocopy /np /nfl /ndl /e "%APPDATA%\cabal" cabal-cache - bash -c 'make clean || true' dependencies: [] + variables: + FORCE_SYMLINKS: 1 + LANG: "en_US.UTF-8" cache: paths: - cabal-cache - ghc-8.6.2 - ghc-tarballs -validate-x86_64-windows-hadrian: - extends: .validate-windows +.build-windows-hadrian: + extends: .build-windows stage: full-build variables: GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" script: - | - set MSYSTEM=MINGW64 python boot bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - mkdir -p _build - cp -R inplace/mingw _build/mingw - - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick" - # FIXME: Bindist disabled due to #16073 - #- bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh binary-dist" + - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick --docs=no-sphinx binary-dist" + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz # FIXME: Testsuite disabled due to #16156. - #- bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows + # - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' tags: - x86_64-windows + artifacts: + when: always + paths: + - ghc.tar.xz -validate-x86_64-windows: - extends: .validate-windows +validate-x86_64-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW64 + cache: + key: x86_64-windows-hadrian + +nightly-i386-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW32 + only: + variables: + - $NIGHTLY + cache: + key: i386-windows-hadrian + +.build-windows-make: + extends: .build-windows stage: full-build # due to #16084 allow_failure: true variables: + BUILD_FLAVOUR: "quick" GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-mingw32.tar.xz" script: - | - set MSYSTEM=MINGW64 python boot - bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - - bash -c "echo include mk/flavours/quick.mk > mk/build.mk" + bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex $CONFIGURE_ARGS' + - bash -c "echo include mk/flavours/${BUILD_FLAVOUR}.mk > mk/build.mk" - bash -c "PATH=`pwd`/toolchain/bin:$PATH make -j`mk/detect-cpu-count.sh`" - - | - bash -c "make binary-dist TAR_COMP_OPTS=-1" - mv ghc-*.tar.xz ghc.tar.xz + - bash -c "PATH=`pwd`/toolchain/bin:$PATH make binary-dist TAR_COMP_OPTS=-1" - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows tags: - x86_64-windows artifacts: when: always + expire_in: 2 week reports: junit: junit.xml paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +validate-x86_64-windows: + extends: .build-windows-make + variables: + MSYSTEM: MINGW64 + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + cache: + key: x86_64-windows + +# Normal Windows validate builds are profiled; that won't do for releases. +release-x86_64-windows: + extends: validate-x86_64-windows + variables: + MSYSTEM: MINGW64 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + only: + - tags + +release-i386-windows: + extends: .build-windows-make + only: + - tags + variables: + MSYSTEM: MINGW32 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +nightly-i386-windows: + extends: .build-windows-make + only: + variables: + - $NIGHTLY + variables: + MSYSTEM: MINGW32 + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +############################################################ +# Cleanup +############################################################ + # Note [Cleaning up after shell executor] # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # @@ -393,6 +588,7 @@ validate-x86_64-windows: # See Note [Cleanup after shell executor] cleanup-windows: + <<: *only-default stage: cleanup tags: - x86_64-windows @@ -415,10 +611,12 @@ cleanup-windows: # See Note [Cleanup after shell executor] cleanup-darwin: + <<: *only-default stage: cleanup tags: - x86_64-darwin when: always + dependencies: [] before_script: - echo "Time to clean up" script: @@ -430,3 +628,106 @@ cleanup-darwin: - rm -Rf $BUILD_DIR/* - exit 0 +############################################################ +# Packaging +############################################################ + +doc-tarball: + <<: *only-default + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + dependencies: + - validate-x86_64-linux-deb9 + - validate-x86_64-windows + artifacts: + paths: + - haddock.html.tar.xz + - libraries.html.tar.xz + - users_guide.html.tar.xz + - index.html + - "*.pdf" + script: + - rm -Rf docs + - bash -ex distrib/mkDocs/mkDocs ghc-x86_64-deb9-linux.tar.xz ghc-x86_64-mingw32.tar.xz + - ls -lh + - mv docs/*.tar.xz docs/index.html . + +source-tarball: + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + only: + - tags + artifacts: + paths: + - ghc-*.tar.xz + - version + script: + - mk/get-win32-tarballs.sh download all + - ./boot + - ./configure + - make sdist + - mv sdistprep/*.xz . + - make show VALUE=version > version + + +############################################################ +# Testing via head.hackage +############################################################ + +# Triggering jobs in the ghc/head.hackage project requires that we have a job +# token for that repository. Furthermore the head.hackage CI job must have +# access to an unprivileged access token with the ability to query the ghc/ghc +# project such that it can find the job ID of the fedora27 job for the current +# pipeline. + +.hackage: + <<: *only-default + stage: hackage + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + dependencies: [] + variables: + HEAD_HACKAGE_PROJECT_ID: "78" + script: + - bash .gitlab/start-head.hackage.sh + +hackage: + extends: .hackage + when: manual + +hackage-label: + extends: .hackage + only: + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*user-facing.*/ + +nightly-hackage: + extends: .hackage + only: + variables: + - $NIGHTLY + +pages: + stage: deploy + dependencies: + - doc-tarball + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + script: + - mkdir -p public/doc + - tar -xf haddock.html.tar.xz -C public/doc + - tar -xf libraries.html.tar.xz -C public/doc + - tar -xf users_guide.html.tar.xz -C public/doc + - cp -f index.html public/doc + only: + - master + artifacts: + paths: + - public + ===================================== hadrian/doc/make.md ===================================== @@ -174,6 +174,22 @@ time you fire up a build. This is not possible with the Make build system. build docs ``` +- Build documentation, but without haddocks (resp. without HTML or PDF manuals) + + ``` sh + # Make + echo 'HADDOCKS_DOCS = NO' > mk/build.mk + # For HTML manuals: BUILD_SPHINX_HTML = NO + # For PDF manuals: BUILD_SPHINX_PDF = NO + make + + # Hadrian + build docs --docs=no-haddocks + # Append --docs=no-sphinx-pdfs, --docs=no-sphinx-html or + # --docs=no-sphinx-man (or --docs=no-sphinx to encompass them all) + # to further reduce or even completely disable documentation targets. + ``` + - Running nofib ``` sh ===================================== hadrian/doc/user-settings.md ===================================== @@ -32,7 +32,10 @@ data Flavour = Flavour { -- | Build profiled GHC. ghcProfiled :: Bool, -- | Build GHC with debug information. - ghcDebugged :: Bool } + ghcDebugged :: Bool + -- | Whether to build docs and which ones + -- (haddocks, user manual, haddock manual) + ghcDocs :: Action DocTargets } ``` Hadrian provides several built-in flavours (`default`, `quick`, and a few others; see `hadrian/doc/flavours.md`), which can be activated from the command line, @@ -102,6 +105,17 @@ patterns such as `"//Prelude.*"` can be used when matching input and output file where `//` matches an arbitrary number of path components and `*` matches an entire path component, excluding any separators. +### Enabling -Werror + +It is useful to enable `-Werror` when building GHC as this setting is +used in the CI to ensure a warning free build. The `werror` function can be +used to easily modify a flavour to turn this setting on. + +``` +devel2WerrorFlavour :: Flavour +devel2WerrorFlavour = werror (developmentFlavour Stage2) +``` + ## Packages Users can add and remove packages from particular build stages. As an example, @@ -216,6 +230,45 @@ verboseCommand = output "//rts/sm/*" &&^ way threaded verboseCommand = return True ``` +## Documentation + +`Flavour`'s `ghcDocs :: Action DocTargets` field lets you +customize the "groups" of documentation targets that should +run when running `build docs` (or, transitively, +`build binary-dist`). + +```haskell +type DocTargets = Set DocTarget +data DocTarget = Haddocks | SphinxHTML | SphinxPDFs | SphinxMan +``` + +By default, `ghcDocs` contains all of them and `build docs` would +therefore attempt to build all the haddocks, manuals and manpages. +If, for some reason (e.g no easy way to install `sphinx-build` or +`xelatex` on your system), you're just interested in building the +haddocks, you could define a custom flavour as follows: + +```haskell +justHaddocksFlavour :: Flavour +justHaddocksFlavour = defaultFlavour + { name = "default-haddocks" + , ghcDocs = Set.singleton Haddocks } +``` + +and then run `build --flavour=default-haddocks`. Alternatively, +you can use the `--docs` CLI flag to selectively disable some or +all of the documentation targets: + +- `--docs=none`: don't build any docs +- `--docs=no-haddocks`: don't build haddocks +- `--docs=no-sphinx`: don't build any user manual or manpage +- `--docs=no-sphinx-html`: don't build HTML versions of manuals +- `--docs=no-sphinx-pdfs`: don't build PDF versions of manuals +- `--docs=no-sphinx-man`: don't build the manpage + +You can pass several `--docs=...` flags, Hadrian will combine +their effects. + ## Miscellaneous To change the default behaviour of Hadrian with respect to building split ===================================== hadrian/src/CommandLine.hs ===================================== @@ -1,17 +1,20 @@ module CommandLine ( optDescrs, cmdLineArgsMap, cmdFlavour, lookupFreeze1, cmdIntegerSimple, cmdProgressColour, cmdProgressInfo, cmdConfigure, cmdSplitObjects, - lookupBuildRoot, TestArgs(..), TestSpeed(..), defaultTestArgs + cmdDocsArgs, lookupBuildRoot, TestArgs(..), TestSpeed(..), defaultTestArgs ) where import Data.Either import qualified Data.HashMap.Strict as Map import Data.List.Extra import Development.Shake hiding (Normal) +import Flavour (DocTargets, DocTarget(..)) import Hadrian.Utilities hiding (buildRoot) import System.Console.GetOpt import System.Environment +import qualified Data.Set as Set + data TestSpeed = Slow | Average | Fast deriving (Show, Eq) -- | All arguments that can be passed to Hadrian via the command line. @@ -24,7 +27,8 @@ data CommandLineArgs = CommandLineArgs , progressInfo :: ProgressInfo , splitObjects :: Bool , buildRoot :: BuildRoot - , testArgs :: TestArgs } + , testArgs :: TestArgs + , docTargets :: DocTargets } deriving (Eq, Show) -- | Default values for 'CommandLineArgs'. @@ -38,7 +42,8 @@ defaultCommandLineArgs = CommandLineArgs , progressInfo = Brief , splitObjects = False , buildRoot = BuildRoot "_build" - , testArgs = defaultTestArgs } + , testArgs = defaultTestArgs + , docTargets = Set.fromList [minBound..maxBound] } -- | These arguments are used by the `test` target. data TestArgs = TestArgs @@ -179,6 +184,25 @@ readTestWay way = let newWays = way : testWays (testArgs flags) in flags { testArgs = (testArgs flags) {testWays = newWays} } +readDocsArg :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) +readDocsArg ms = maybe (Left "Cannot parse docs argument") (Right . set) (go =<< ms) + + where + go :: String -> Maybe (DocTargets -> DocTargets) + go "none" = Just (const Set.empty) + go "no-haddocks" = Just (Set.delete Haddocks) + go "no-sphinx-html" = Just (Set.delete SphinxHTML) + go "no-sphinx-pdfs" = Just (Set.delete SphinxPDFs) + go "no-sphinx-man" = Just (Set.delete SphinxMan) + go "no-sphinx" = Just (Set.delete SphinxHTML + . Set.delete SphinxPDFs + . Set.delete SphinxMan) + go _ = Nothing + + set :: (DocTargets -> DocTargets) -> CommandLineArgs -> CommandLineArgs + set tweakTargets flags = flags + { docTargets = tweakTargets (docTargets flags) } + -- | Standard 'OptDescr' descriptions of Hadrian's command line arguments. optDescrs :: [OptDescr (Either String (CommandLineArgs -> CommandLineArgs))] optDescrs = @@ -198,6 +222,8 @@ optDescrs = "Progress info style (None, Brief, Normal or Unicorn)." , Option [] ["split-objects"] (NoArg readSplitObjects) "Generate split objects (requires a full clean rebuild)." + , Option [] ["docs"] (OptArg readDocsArg "TARGET") + "Strip down docs targets (none, no-haddocks, no-sphinx[-{html, pdfs, man}]." , Option [] ["test-compiler"] (OptArg readTestCompiler "TEST_COMPILER") "Use given compiler [Default=stage2]." , Option [] ["test-config-file"] (OptArg readTestConfigFile "CONFIG_FILE") @@ -259,3 +285,6 @@ cmdProgressInfo = progressInfo <$> cmdLineArgs cmdSplitObjects :: Action Bool cmdSplitObjects = splitObjects <$> cmdLineArgs + +cmdDocsArgs :: Action DocTargets +cmdDocsArgs = docTargets <$> cmdLineArgs ===================================== hadrian/src/Flavour.hs ===================================== @@ -1,6 +1,10 @@ -module Flavour (Flavour (..)) where +module Flavour + ( Flavour (..), werror + , DocTargets, DocTarget(..) + ) where import Expression +import Data.Set (Set) -- Please update doc/{flavours.md, user-settings.md} when changing this file. -- | 'Flavour' is a collection of build settings that fully define a GHC build. @@ -31,4 +35,33 @@ data Flavour = Flavour { -- | Build profiled GHC. ghcProfiled :: Bool, -- | Build GHC with debug information. - ghcDebugged :: Bool } + ghcDebugged :: Bool, + -- | Whether to build docs and which ones + -- (haddocks, user manual, haddock manual) + ghcDocs :: Action DocTargets } + +-- | A set of documentation targets +type DocTargets = Set DocTarget + +-- | Documentation targets +-- +-- While we can't reasonably expose settings or CLI options +-- to selectively disable, say, base's haddocks, we can offer +-- a less fine-grained choice: +-- +-- - haddocks for libraries +-- - non-haddock html pages (e.g GHC's user manual) +-- - PDF documents (e.g haddock's manual) +-- - man pages (GHC's) +-- +-- The main goal being to have easy ways to do away with the need +-- for e.g @sphinx-build@ or @xelatex@ and associated packages +-- while still being able to build a(n almost) complete binary +-- distribution. +data DocTarget = Haddocks | SphinxHTML | SphinxPDFs | SphinxMan + deriving (Eq, Ord, Show, Bounded, Enum) + +-- | Turn on -Werror for packages built with the stage1 compiler. +-- It mimics the CI settings so is useful to turn on when developing. +werror :: Flavour -> Flavour +werror fl = fl { args = args fl <> (builder Ghc ? notStage0 ? arg "-Werror") } ===================================== hadrian/src/Rules/Documentation.hs ===================================== @@ -22,6 +22,7 @@ import Target import Utilities import Data.List (union) +import qualified Data.Set as Set import qualified Text.Parsec as Parsec docRoot :: FilePath @@ -79,10 +80,35 @@ documentationRules = do -- Haddock's manual, and builds man pages "docs" ~> do root <- buildRoot + doctargets <- ghcDocs =<< flavour let html = htmlRoot -/- "index.html" -- also implies "docs-haddock" archives = map pathArchive docPaths pdfs = map pathPdf $ docPaths \\ ["libraries"] - need $ map (root -/-) $ [html] ++ archives ++ pdfs ++ [manPageBuildPath] + + targets = -- include PDFs unless --docs=no-sphinx[-pdf] is + -- passed. + concat [ pdfs | SphinxPDFs `Set.member` doctargets ] + + -- include manpage unless --docs=no-sphinx[-man] is given. + ++ [ manPageBuildPath | SphinxMan `Set.member` doctargets ] + + -- include toplevel html target uness we neither want + -- haddocks nor html pages produced by sphinx. + ++ [ html | Set.size (doctargets `Set.intersection` + Set.fromList [Haddocks, SphinxHTML] + ) > 0 ] + + -- include archives for whatever targets remain from + -- the --docs arguments we got. + ++ [ ar + | (ar, doc) <- zip archives docPaths + , archiveTarget doc `Set.member` doctargets ] + + need $ map (root -/-) targets + + where archiveTarget "libraries" = Haddocks + archiveTarget _ = SphinxHTML + ------------------------------------- HTML ------------------------------------- @@ -94,7 +120,16 @@ buildHtmlDocumentation = do root <- buildRootRules root -/- htmlRoot -/- "index.html" %> \file -> do - need $ map ((root -/-) . pathIndex) docPaths + doctargets <- ghcDocs =<< flavour + + -- We include the HTML output of haddock for libraries unless + -- told not to (e.g with --docs=no-haddocks). Likewise for + -- the HTML version of the users guide or the Haddock manual. + let targets = [ "libraries" | Haddocks `Set.member` doctargets ] + ++ concat [ ["users_guide", "Haddock"] + | SphinxHTML `Set.member` doctargets ] + need $ map ((root -/-) . pathIndex) targets + copyFileUntracked "docs/index.html" file -- | Compile a Sphinx ReStructured Text package to HTML. ===================================== hadrian/src/Settings/Default.hs ===================================== @@ -222,7 +222,8 @@ defaultFlavour = Flavour , dynamicGhcPrograms = defaultDynamicGhcPrograms , ghciWithDebugger = False , ghcProfiled = False - , ghcDebugged = False } + , ghcDebugged = False + , ghcDocs = cmdDocsArgs } -- | Default logic for determining whether to build -- dynamic GHC programs. ===================================== hadrian/src/UserSettings.hs ===================================== @@ -2,6 +2,10 @@ -- hadrian/src/UserSettings.hs to hadrian/UserSettings.hs and edit your copy. -- If you don't copy the file your changes will be tracked by git and you can -- accidentally commit them. +-- +-- See doc/user-settings.md for instructions, and src/Flavour.hs for auxiliary +-- functions for manipulating flavours. +-- Please update doc/user-settings.md when committing changes to this file. module UserSettings ( userFlavours, userPackages, userDefaultFlavour, verboseCommand, buildProgressColour, successColour, finalStage View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/20364073caef323eea76332ee0b2c0f4cd539952...04bfc6e9475cde08d9756f3191cc99999e8a9d18 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/20364073caef323eea76332ee0b2c0f4cd539952...04bfc6e9475cde08d9756f3191cc99999e8a9d18 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:20:38 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:20:38 -0400 Subject: [Git][ghc/ghc] Pushed new tag ghc-8.6.5-rc1 Message-ID: <5ca8d1d617efc_62b3e1da548193259d@gitlab.haskell.org.mail> Ben Gamari pushed new tag ghc-8.6.5-rc1 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/ghc-8.6.5-rc1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:27:57 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:27:57 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] users-guide: Add pretty to package list Message-ID: <5ca8d38df23eb_62b33fa2f24d845419357a2@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: 4c8519b6 by Ben Gamari at 2019-04-06T16:26:38Z users-guide: Add pretty to package list - - - - - 1 changed file: - docs/users_guide/8.8.1-notes.rst Changes: ===================================== docs/users_guide/8.8.1-notes.rst ===================================== @@ -206,6 +206,7 @@ for further change information. libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library libraries/process/process.cabal: Dependency of ``ghc`` library + libraries/pretty/pretty.cabal: Dependency of ``ghc`` library libraries/stm/stm.cabal: Dependency of ``haskeline`` library libraries/template-haskell/template-haskell.cabal: Core library libraries/terminfo/terminfo.cabal: Dependency of ``haskeline`` library View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4c8519b6a5c1dc99c1051098fa91e8450406d9f1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4c8519b6a5c1dc99c1051098fa91e8450406d9f1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:38:37 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:38:37 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/T16544 Message-ID: <5ca8d60d5a3e5_62b3e1da5481937824@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/T16544 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/T16544 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:40:12 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:40:12 -0400 Subject: [Git][ghc/ghc][ghc-8.6.5-rc1] 2 commits: Add release notes for 8.6.5 Message-ID: <5ca8d66c317db_62b33fa2c50c025419405da@gitlab.haskell.org.mail> Ben Gamari pushed to tag ghc-8.6.5-rc1 at Glasgow Haskell Compiler / GHC Commits: 7d3040ac by Ben Gamari at 2019-04-06T16:32:53Z Add release notes for 8.6.5 - - - - - f6cd3ae8 by Ben Gamari at 2019-04-06T16:35:41Z users-guide: Add missing libraries to release notes library list - - - - - 2 changed files: - + 8.6.5-notes.rst - docs/users_guide/index.rst Changes: ===================================== 8.6.5-notes.rst ===================================== @@ -0,0 +1,73 @@ +.. _release-8-6-4: + +Release notes for version 8.6.5 +=============================== + +GHC 8.6.5 is a bug-fix release, fixing a few regressions found in 8.6.4. + + +Highlights +---------- + +The highlights, since the 8.6.5 release, are: + +- Binary distributions once again ship with Haddock documentation including + syntax highlighted source of core libraries (:ghc-ticket:`16445`) + +- A build system issue where use of GCC with ``-flto`` broke ``configure`` + was fixed (:ghc-ticket:`16440`) + +- Several packaging issues with the Windows binary distributions have been resolved. + (:ghc-ticket:`16408`, :ghc-ticket:`16398`, :ghc-ticket:`16516`). + +Known issues +------------ + +Note that the LLVM code generator (:ghc-flag:`-fllvm`) in GHC 8.6, as well as +all earlier releases, are affected by :ghc-ticket:`14251`, which can result in +miscompilation of some programs calling functions with unboxed floating-point +arguments. While originally scheduled to be fixed for this release, the fix +ended up being more difficult than anticipated and, given that issue is not a +strict regression from 8.4, we decided to proceed with the release. + +Included libraries +------------------ + +The package database provided with this distribution also contains a number of +packages other than GHC itself. See the changelogs provided with these packages +for further change information. + +.. ghc-package-list:: + + libraries/array/array.cabal: Dependency of ``ghc`` library + libraries/base/base.cabal: Core library + libraries/binary/binary.cabal: Dependency of ``ghc`` library + libraries/bytestring/bytestring.cabal: Deppendency of ``ghc`` library + libraries/Cabal/Cabal/Cabal.cabal: Dependency of ``ghc-pkg`` utility + libraries/containers/containers.cabal: Dependency of ``ghc`` library + libraries/deepseq/deepseq.cabal: Dependency of ``ghc`` library + libraries/directory/directory.cabal: Dependency of ``ghc`` library + libraries/filepath/filepath.cabal: Dependency of ``ghc`` library + compiler/ghc.cabal: The compiler itself + libraries/ghci/ghci.cabal: The REPL interface + libraries/ghc-boot/ghc-boot.cabal: Internal compiler library + libraries/ghc-boot-th/ghc-boot-th.cabal: Internal compiler library + libraries/ghc-compact/ghc-compact.cabal: Core library + libraries/ghc-heap/ghc-heap.cabal: GHC heap-walking library + libraries/ghc-prim/ghc-prim.cabal: Core library + libraries/haskeline/haskeline.cabal: Dependency of ``ghci`` executable + libraries/hpc/hpc.cabal: Dependency of ``hpc`` executable + libraries/integer-gmp/integer-gmp.cabal: Core library + libraries/libiserv/libiserv.cabal: Internal compiler library + libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library + libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library + libraries/process/process.cabal: Dependency of ``ghc`` library + libraries/stm/stm.cabal: Dependency of ``haskeline`` library + libraries/template-haskell/template-haskell.cabal: Core library + libraries/terminfo/terminfo.cabal: Dependency of ``haskeline`` library + libraries/text/text.cabal: Dependency of ``Cabal`` library + libraries/time/time.cabal: Dependency of ``ghc`` library + libraries/transformers/transformers.cabal: Dependency of ``ghc`` library + libraries/unix/unix.cabal: Dependency of ``ghc`` library + libraries/Win32/Win32.cabal: Dependency of ``ghc`` library + libraries/xhtml/xhtml.cabal: Dependency of ``haddock`` executable ===================================== docs/users_guide/index.rst ===================================== @@ -16,6 +16,7 @@ Contents: 8.6.2-notes 8.6.3-notes 8.6.4-notes + 8.6.5-notes ghci runghc usage View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d2a284ab461681919cadaed394adebe42c4cc7bb...f6cd3ae8306497176589fd06b623f7ed113ccaa1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d2a284ab461681919cadaed394adebe42c4cc7bb...f6cd3ae8306497176589fd06b623f7ed113ccaa1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:41:57 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:41:57 -0400 Subject: [Git][ghc/ghc] Deleted branch wip/marge_bot_batch_merge_job Message-ID: <5ca8d6d51c20_62b33fa2f1f4fcbc194838a@gitlab.haskell.org.mail> Ben Gamari deleted branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC -- You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:41:59 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 12:41:59 -0400 Subject: [Git][ghc/ghc][master] 3 commits: testsuite: Add testcase for #16111 Message-ID: <5ca8d6d79fe02_62b33fa2f24d84541948534@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 9 changed files: - compiler/typecheck/TcTyClsDecls.hs - + libraries/base/tests/T16111.hs - + libraries/base/tests/T16111.stderr - libraries/base/tests/all.T - rts/StgCRun.c - testsuite/tests/indexed-types/should_compile/T11361a.stderr - + testsuite/tests/indexed-types/should_fail/T13971.hs - + testsuite/tests/indexed-types/should_fail/T13971.stderr - testsuite/tests/indexed-types/should_fail/all.T Changes: ===================================== compiler/typecheck/TcTyClsDecls.hs ===================================== @@ -1497,7 +1497,7 @@ tcDefaultAssocDecl fam_tc [dL->L loc (FamEqn { feqn_tycon = L _ tc_name = -- See Note [Type-checking default assoc decls] setSrcSpan loc $ tcAddFamInstCtxt (text "default type instance") tc_name $ - do { traceTc "tcDefaultAssocDecl" (ppr tc_name) + do { traceTc "tcDefaultAssocDecl 1" (ppr tc_name) ; let fam_tc_name = tyConName fam_tc fam_arity = length (tyConVisibleTyVars fam_tc) @@ -1524,14 +1524,46 @@ tcDefaultAssocDecl fam_tc [dL->L loc (FamEqn { feqn_tycon = L _ tc_name imp_vars exp_vars hs_pats hs_rhs_ty - -- See Note [Type-checking default assoc decls] - ; traceTc "tcDefault" (vcat [ppr (tyConTyVars fam_tc), ppr qtvs, ppr pats]) - ; case tcMatchTys pats (mkTyVarTys (tyConTyVars fam_tc)) of - Just subst -> return (Just (substTyUnchecked subst rhs_ty, loc) ) - Nothing -> failWithTc (defaultAssocKindErr fam_tc) - -- We check for well-formedness and validity later, - -- in checkValidClass + ; let fam_tvs = tyConTyVars fam_tc + ; traceTc "tcDefaultAssocDecl 2" (vcat + [ text "fam_tvs" <+> ppr fam_tvs + , text "qtvs" <+> ppr qtvs + , text "pats" <+> ppr pats + , text "rhs_ty" <+> ppr rhs_ty + ]) + ; pat_tvs <- traverse (extract_tv pats rhs_ty) pats + ; let subst = zipTvSubst pat_tvs (mkTyVarTys fam_tvs) + ; pure $ Just (substTyUnchecked subst rhs_ty, loc) + -- We also perform other checks for well-formedness and validity + -- later, in checkValidClass } + where + -- Checks that a pattern on the LHS of a default is a type + -- variable. If so, return the underlying type variable, and if + -- not, throw an error. + -- See Note [Type-checking default assoc decls] + extract_tv :: [Type] -- All default instance type patterns + -- (only used for error message purposes) + -> Type -- The default instance's right-hand side type + -- (only used for error message purposes) + -> Type -- The particular type pattern from which to extract + -- its underlying type variable + -> TcM TyVar + extract_tv pats rhs_ty pat = + case getTyVar_maybe pat of + Just tv -> pure tv + Nothing -> + -- Per Note [Type-checking default assoc decls], we already + -- know by this point that if any arguments in the default + -- instance aren't type variables, then they must be + -- invisible kind arguments. Therefore, always display the + -- error message with -fprint-explicit-kinds enabled. + failWithTc $ pprWithExplicitKindsWhen True $ + hang (text "Illegal argument" <+> quotes (ppr pat) <+> text "in:") + 2 (vcat [ quotes (text "type" <+> ppr (mkTyConApp fam_tc pats) + <+> equals <+> ppr rhs_ty) + , text "The arguments to" <+> quotes (ppr fam_tc) + <+> text "must all be type variables" ]) tcDefaultAssocDecl _ [dL->L _ (XFamEqn _)] = panic "tcDefaultAssocDecl" tcDefaultAssocDecl _ [dL->L _ (FamEqn _ _ _ (XLHsQTyVars _) _ _)] = panic "tcDefaultAssocDecl" @@ -1544,8 +1576,8 @@ tcDefaultAssocDecl _ [_] Consider this default declaration for an associated type class C a where - type F (a :: k) b :: * - type F x y = Proxy x -> y + type F (a :: k) b :: Type + type F (x :: j) y = Proxy x -> y Note that the class variable 'a' doesn't scope over the default assoc decl (rather oddly I think), and (less oddly) neither does the second @@ -1555,17 +1587,26 @@ instance. However we store the default rhs (Proxy x -> y) in F's TyCon, using F's own type variables, so we need to convert it to (Proxy a -> b). -We do this by calling tcMatchTys to match them up. This also ensures -that x's kind matches a's and similarly for y and b. The error -message isn't great, mind you. (#11361 was caused by not doing a -proper tcMatchTys here.) +We do this by creating a substitution [j |-> k, x |-> a, b |-> y] and +applying this substitution to the RHS. + +In order to create this substitution, we must first ensure that all of +the arguments in the default instance consist of type variables. The parser +already checks this to a certain degree (see RdrHsSyn.checkTyVars), but +we must be wary of kind arguments being instantiated, which the parser cannot +catch so easily. Consider this erroneous program (inspired by #11361): -Recall also that the left-hand side of an associated type family -default is always just variables -- no tycons here. Accordingly, -the patterns used in the tcMatchTys won't actually be knot-tied, -even though we're in the knot. This is too delicate for my taste, -but it works. + class C a where + type F (a :: k) b :: Type + type F x b = x +If you squint, you'll notice that the kind of `x` is actually Type. However, +we cannot substitute from [Type |-> k], so we reject this default. + +Since the LHS of an associated type family default is always just variables, +it won't contain any tycons. Accordingly, the patterns used in the substitution +won't actually be knot-tied, even though we're in the knot. This is too +delicate for my taste, but it works. -} {- ********************************************************************* @@ -3849,11 +3890,6 @@ wrongNumberOfParmsErr max_args = text "Number of parameters must match family declaration; expected" <+> ppr max_args -defaultAssocKindErr :: TyCon -> SDoc -defaultAssocKindErr fam_tc - = text "Kind mis-match on LHS of default declaration for" - <+> quotes (ppr fam_tc) - badRoleAnnot :: Name -> Role -> Role -> SDoc badRoleAnnot var annot inferred = hang (text "Role mismatch on variable" <+> ppr var <> colon) ===================================== libraries/base/tests/T16111.hs ===================================== @@ -0,0 +1,13 @@ +module Main (main) where + +import Data.Bits +import Data.Word + +main :: IO () +main = print $ toInteger (shiftL 1 hm :: Word64) + == toInteger (shiftL 1 hm :: Word64) + +hm :: Int +hm = -1 +{-# NOINLINE hm #-} + ===================================== libraries/base/tests/T16111.stderr ===================================== @@ -0,0 +1,2 @@ +T16111: arithmetic overflow + ===================================== libraries/base/tests/all.T ===================================== @@ -235,3 +235,4 @@ test('T10412', normal, compile_and_run, ['']) test('T13896', normal, compile_and_run, ['']) test('T13167', normal, compile_and_run, ['']) test('T15349', [exit_code(1), expect_broken_for(15349, 'ghci')], compile_and_run, ['']) +test('T16111', exit_code(1), compile_and_run, ['']) ===================================== rts/StgCRun.c ===================================== @@ -494,15 +494,15 @@ StgRunIsImplementedInAssembler(void) "movq 48(%%rsp),%%rdi\n\t" "movq 56(%%rsp),%%rsi\n\t" "movq 64(%%rsp),%%xmm6\n\t" - "movq 72(%%rax),%%xmm7\n\t" - "movq 80(%%rax),%%xmm8\n\t" - "movq 88(%%rax),%%xmm9\n\t" - "movq 96(%%rax),%%xmm10\n\t" - "movq 104(%%rax),%%xmm11\n\t" - "movq 112(%%rax),%%xmm12\n\t" - "movq 120(%%rax),%%xmm13\n\t" - "movq 128(%%rax),%%xmm14\n\t" - "movq 136(%%rax),%%xmm15\n\t" + "movq 72(%%rsp),%%xmm7\n\t" + "movq 80(%%rsp),%%xmm8\n\t" + "movq 88(%%rsp),%%xmm9\n\t" + "movq 96(%%rsp),%%xmm10\n\t" + "movq 104(%%rsp),%%xmm11\n\t" + "movq 112(%%rsp),%%xmm12\n\t" + "movq 120(%%rsp),%%xmm13\n\t" + "movq 128(%%rsp),%%xmm14\n\t" + "movq 136(%%rsp),%%xmm15\n\t" #endif "addq %1, %%rsp\n\t" "retq" ===================================== testsuite/tests/indexed-types/should_compile/T11361a.stderr ===================================== @@ -1,5 +1,7 @@ T11361a.hs:7:3: error: - • Kind mis-match on LHS of default declaration for ‘F’ + • Illegal argument ‘*’ in: + ‘type F @* x = x’ + The arguments to ‘F’ must all be type variables • In the default type instance declaration for ‘F’ In the class declaration for ‘C’ ===================================== testsuite/tests/indexed-types/should_fail/T13971.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE TypeFamilies #-} +module T13971 where + +class C a where + type T a :: k + type T a = Int ===================================== testsuite/tests/indexed-types/should_fail/T13971.stderr ===================================== @@ -0,0 +1,7 @@ + +T13971.hs:7:3: error: + • Illegal argument ‘*’ in: + ‘type T @{k} @* a = Int’ + The arguments to ‘T’ must all be type variables + • In the default type instance declaration for ‘T’ + In the class declaration for ‘C’ ===================================== testsuite/tests/indexed-types/should_fail/all.T ===================================== @@ -136,6 +136,7 @@ test('T13271', normal, compile_fail, ['']) test('T13674', normal, compile_fail, ['']) test('T13784', normal, compile_fail, ['']) test('T13877', normal, compile_fail, ['']) +test('T13971', normal, compile_fail, ['']) test('T13972', normal, compile, ['']) test('T14033', normal, compile_fail, ['']) test('T14045a', normal, compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/51fd357119b357c52e990ccce9059c423cc49406...33b0a291898b6a35d822fde59864c5c94a53d039 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/51fd357119b357c52e990ccce9059c423cc49406...33b0a291898b6a35d822fde59864c5c94a53d039 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 16:43:09 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sat, 06 Apr 2019 12:43:09 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/marge_bot_batch_merge_job Message-ID: <5ca8d71db89da_62b3e1da54819492c4@gitlab.haskell.org.mail> Marge Bot pushed new branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/marge_bot_batch_merge_job You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 6 18:47:07 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 14:47:07 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 2 commits: users-guide: Add pretty to package list Message-ID: <5ca8f42beca94_62b33fa2c45f9d342044412@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 0b309e98 by Ben Gamari at 2019-04-06T16:34:40Z users-guide: Add pretty to package list - - - - - e0c8a532 by Ben Gamari at 2019-04-06T18:45:52Z Bump unix submodule Marks posix002 as fragile in threaded2 way due to #16550. - - - - - 2 changed files: - docs/users_guide/8.8.1-notes.rst - libraries/unix Changes: ===================================== docs/users_guide/8.8.1-notes.rst ===================================== @@ -212,6 +212,7 @@ for further change information. libraries/libiserv/libiserv.cabal: Internal compiler library libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library + libraries/pretty/pretty.cabal: Dependency of ``ghc`` library libraries/process/process.cabal: Dependency of ``ghc`` library libraries/stm/stm.cabal: Dependency of ``haskeline`` library libraries/template-haskell/template-haskell.cabal: Core library ===================================== libraries/unix ===================================== @@ -1 +1 @@ -Subproject commit a102df29c107e8f853129dd40fbbb487e1818149 +Subproject commit a67b5e9e70a2a0c438d4f186fda9f38409ff1435 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1df9027ede31ff1e591c5002914dd4f7f01ba72c...e0c8a5324abe85e40962355119057dcd64fefe3b -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1df9027ede31ff1e591c5002914dd4f7f01ba72c...e0c8a5324abe85e40962355119057dcd64fefe3b You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 01:52:50 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 21:52:50 -0400 Subject: [Git][ghc/ghc][wip/slowtest] testsuite: Fix omit_ways usage for stack003 and integerGmpInternals Message-ID: <5ca957f22a352_62b33fa2c45f9d34207366c@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 7e415170 by Ben Gamari at 2019-04-07T01:52:38Z testsuite: Fix omit_ways usage for stack003 and integerGmpInternals - - - - - 3 changed files: - testsuite/driver/testlib.py - testsuite/tests/lib/integer/all.T - testsuite/tests/rts/all.T Changes: ===================================== testsuite/driver/testlib.py ===================================== @@ -275,6 +275,7 @@ def omit_ways( ways ): return lambda name, opts, w=ways: _omit_ways( name, opts, w ) def _omit_ways( name, opts, ways ): + assert ways.__class__ is list opts.omit_ways += ways # ----- ===================================== testsuite/tests/lib/integer/all.T ===================================== @@ -1,7 +1,7 @@ test('integerBits', normal, compile_and_run, ['']) test('integerConversions', normal, compile_and_run, ['']) # skip ghci as it doesn't support unboxed tuples -test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways('ghci')], compile_and_run, ['']) +test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways(['ghci'])], compile_and_run, ['']) test('plusMinusInteger', [omit_ways('ghci')], compile_and_run, ['']) test('integerConstantFolding', normal, makefile_test, ['integerConstantFolding']) test('fromToInteger', [], makefile_test, ['fromToInteger']) ===================================== testsuite/tests/rts/all.T ===================================== @@ -93,7 +93,7 @@ test('stack002', [extra_files(['stack001.hs']), # run this test with very small stack chunks to exercise the stack # overflow/underflow machinery. -test('stack003', [ omit_ways('ghci'), # uses unboxed tuples +test('stack003', [ omit_ways(['ghci']), # uses unboxed tuples extra_run_opts('500000 +RTS -kc1k -kb100 -K96m -RTS') ], compile_and_run, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/7e415170a012bb3f9dc8de849396598d1769d881 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/7e415170a012bb3f9dc8de849396598d1769d881 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 01:55:52 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 06 Apr 2019 21:55:52 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] gitlab-ci: Allow failing build jobs to fail Message-ID: <5ca958a84ca4c_62b33fa2ee6b8d90207439@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: d84a4c00 by Ben Gamari at 2019-04-07T01:55:37Z gitlab-ci: Allow failing build jobs to fail - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -358,6 +358,7 @@ validate-x86_64-linux-deb9-debug: validate-x86_64-linux-deb9-llvm: extends: .build-x86_64-linux-deb9 stage: full-build + allow_failure: true variables: BUILD_FLAVOUR: perf-llvm TEST_ENV: "x86_64-linux-deb9-llvm" @@ -365,6 +366,7 @@ validate-x86_64-linux-deb9-llvm: validate-x86_64-linux-deb9-integer-simple: extends: .build-x86_64-linux-deb9 stage: full-build + allow_failure: true variables: INTEGER_LIBRARY: integer-simple TEST_ENV: "x86_64-linux-deb9-integer-simple" @@ -469,6 +471,7 @@ validate-x86_64-linux-fedora27: .build-windows-hadrian: extends: .build-windows stage: full-build + allow_failure: true variables: GHC_VERSION: "8.6.2" script: View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d84a4c004fddc1e12c9c09503c7147460f8b9202 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d84a4c004fddc1e12c9c09503c7147460f8b9202 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 04:07:41 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 00:07:41 -0400 Subject: [Git][ghc/ghc][wip/slowtest] testsuite: Fix omit_ways usage Message-ID: <5ca9778ddaf3f_62b3e28435420867a1@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 93eb36bb by Ben Gamari at 2019-04-07T04:07:10Z testsuite: Fix omit_ways usage omit_ways expects a list but this was broken in several cases. - - - - - 7 changed files: - testsuite/driver/testlib.py - testsuite/tests/concurrent/should_run/all.T - testsuite/tests/lib/integer/all.T - testsuite/tests/programs/barton-mangler-bug/test.T - testsuite/tests/rts/all.T - testsuite/tests/typecheck/should_run/all.T - testsuite/tests/utils/should_run/all.T Changes: ===================================== testsuite/driver/testlib.py ===================================== @@ -275,6 +275,7 @@ def omit_ways( ways ): return lambda name, opts, w=ways: _omit_ways( name, opts, w ) def _omit_ways( name, opts, ways ): + assert ways.__class__ is list opts.omit_ways += ways # ----- ===================================== testsuite/tests/concurrent/should_run/all.T ===================================== @@ -239,7 +239,7 @@ test('conc067', ignore_stdout, compile_and_run, ['']) # omit threaded2, the behaviour of this test is non-deterministic with more # than one CPU. -test('conc068', [ omit_ways('threaded2'), exit_code(1) ], compile_and_run, ['']) +test('conc068', [ omit_ways(['threaded2']), exit_code(1) ], compile_and_run, ['']) test('setnumcapabilities001', [ only_ways(['threaded1','threaded2']), ===================================== testsuite/tests/lib/integer/all.T ===================================== @@ -1,8 +1,8 @@ test('integerBits', normal, compile_and_run, ['']) test('integerConversions', normal, compile_and_run, ['']) # skip ghci as it doesn't support unboxed tuples -test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways('ghci')], compile_and_run, ['']) -test('plusMinusInteger', [omit_ways('ghci')], compile_and_run, ['']) +test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways(['ghci'])], compile_and_run, ['']) +test('plusMinusInteger', [omit_ways(['ghci'])], compile_and_run, ['']) test('integerConstantFolding', normal, makefile_test, ['integerConstantFolding']) test('fromToInteger', [], makefile_test, ['fromToInteger']) test('IntegerConversionRules', [], makefile_test, ['IntegerConversionRules']) ===================================== testsuite/tests/programs/barton-mangler-bug/test.T ===================================== @@ -8,7 +8,7 @@ test('barton-mangler-bug', 'Plot.hi', 'Plot.o', 'PlotExample.hi', 'PlotExample.o', 'TypesettingTricks.hi', 'TypesettingTricks.o']), - omit_ways('debug') # Fails for debug way due to annotation linting timeout + omit_ways(['debug']) # Fails for debug way due to annotation linting timeout ], multimod_compile_and_run, ['Main', '']) ===================================== testsuite/tests/rts/all.T ===================================== @@ -93,7 +93,7 @@ test('stack002', [extra_files(['stack001.hs']), # run this test with very small stack chunks to exercise the stack # overflow/underflow machinery. -test('stack003', [ omit_ways('ghci'), # uses unboxed tuples +test('stack003', [ omit_ways(['ghci']), # uses unboxed tuples extra_run_opts('500000 +RTS -kc1k -kb100 -K96m -RTS') ], compile_and_run, ['']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -96,7 +96,7 @@ test('T6117', normal, compile_and_run, ['']) test('T5751', normal, compile_and_run, ['']) test('T5913', normal, compile_and_run, ['']) test('T7748', normal, compile_and_run, ['']) -test('T7861', [omit_ways('debug'), exit_code(1)], compile_and_run, ['']) +test('T7861', [omit_ways(['debug']), exit_code(1)], compile_and_run, ['']) test('TcTypeNatSimpleRun', normal, compile_and_run, ['']) test('TcTypeSymbolSimpleRun', normal, compile_and_run, ['']) test('T8119', normal, ghci_script, ['T8119.script']) ===================================== testsuite/tests/utils/should_run/all.T ===================================== @@ -1,6 +1,6 @@ test('T14854', [only_ways(threaded_ways), - omit_ways('ghci'), + omit_ways(['ghci']), reqlib('random'), ignore_stderr], compile_and_run, View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/93eb36bb102541c93f6d8c7d71d0b075bc83eb5a -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/93eb36bb102541c93f6d8c7d71d0b075bc83eb5a You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 16:52:07 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 12:52:07 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/T16546 Message-ID: <5caa2ab746b6e_62b3de95e0021305c@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/T16546 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/T16546 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 17:20:18 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 13:20:18 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/T16551 Message-ID: <5caa3152a7f74_62b33fa2ef3eecbc2140932@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/T16551 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/T16551 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 19:13:59 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 15:13:59 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/lint-check-version-number Message-ID: <5caa4bf7e1f8c_62b33fa2ef3eecbc215902e@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/lint-check-version-number at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/lint-check-version-number You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 19:19:24 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 15:19:24 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 22 commits: testsuite: Add testcase for #16111 Message-ID: <5caa4d3c9b3ec_62b33fa2cdca397421637d4@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - b85fb5f4 by Ben Gamari at 2019-04-07T19:19:03Z gitlab-ci: Test using slowtest in deb9-debug job - - - - - 6432a164 by Ben Gamari at 2019-04-07T19:19:03Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - a91497c4 by Ömer Sinan Ağacan at 2019-04-07T19:19:03Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - dffe695c by Ben Gamari at 2019-04-07T19:19:03Z testsuite: Make closureSize less sensitive to optimisation - - - - - 19cc39cc by Ben Gamari at 2019-04-07T19:19:04Z process: Bump submodule * Skip process005 in ghci way * Mark process002 as fragile in threaded2 - - - - - 071c5dc7 by Ben Gamari at 2019-04-07T19:19:04Z testsuite: Mark T13167 as fragile in threaded2 As noted in #16536. - - - - - c787b7f5 by Ben Gamari at 2019-04-07T19:19:04Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - 4e53a2a3 by Ben Gamari at 2019-04-07T19:19:04Z testsuite: Mark T14272 as broken in optasm - - - - - 8dd0d706 by Ben Gamari at 2019-04-07T19:19:04Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - 750a84d1 by Ben Gamari at 2019-04-07T19:19:04Z testsuite: Mark T16180 as broken in ghci and ext-interp ways As noted in #16541. - - - - - 4d32266f by Ben Gamari at 2019-04-07T19:19:13Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - 2834490b by Ben Gamari at 2019-04-07T19:19:13Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - d1f8bacf by Ben Gamari at 2019-04-07T19:19:13Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - 84f16bd1 by Ben Gamari at 2019-04-07T19:19:14Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - fa5271f8 by Ben Gamari at 2019-04-07T19:19:14Z testsuite: Skip T7919 in ghci way It times out pretty reliably. It's not clear that much is gained by running this test in the ghci way anyways. - - - - - 144153d6 by Ben Gamari at 2019-04-07T19:19:14Z testsuite: Fix fragile_for test modifier - - - - - 07f0d7b5 by Ben Gamari at 2019-04-07T19:19:14Z users-guide: Add pretty to package list - - - - - 55cfc721 by Ben Gamari at 2019-04-07T19:19:14Z Bump unix submodule Marks posix002 as fragile in threaded2 way due to #16550. - - - - - 1ac79339 by Ben Gamari at 2019-04-07T19:19:14Z testsuite: Fix omit_ways usage omit_ways expects a list but this was broken in several cases. - - - - - 28 changed files: - .gitlab-ci.yml - compiler/typecheck/TcTyClsDecls.hs - docs/users_guide/8.8.1-notes.rst - + libraries/base/tests/T16111.hs - + libraries/base/tests/T16111.stderr - libraries/base/tests/all.T - libraries/ghc-heap/tests/all.T - libraries/ghc-heap/tests/closure_size.hs - libraries/process - libraries/unix - rts/StgCRun.c - testsuite/driver/testlib.py - testsuite/tests/concurrent/should_run/all.T - testsuite/tests/dependent/should_compile/all.T - testsuite/tests/indexed-types/should_compile/T11361a.stderr - + testsuite/tests/indexed-types/should_fail/T13971.hs - + testsuite/tests/indexed-types/should_fail/T13971.stderr - testsuite/tests/indexed-types/should_fail/all.T - testsuite/tests/lib/integer/all.T - testsuite/tests/primops/should_run/all.T - testsuite/tests/programs/barton-mangler-bug/test.T - testsuite/tests/rts/all.T - testsuite/tests/simplCore/should_compile/all.T - testsuite/tests/th/all.T - testsuite/tests/typecheck/should_compile/all.T - testsuite/tests/typecheck/should_run/all.T - testsuite/tests/utils/should_run/all.T - testsuite/tests/warnings/should_compile/all.T Changes: ===================================== .gitlab-ci.yml ===================================== @@ -387,6 +387,7 @@ validate-x86_64-linux-deb9-debug: stage: build variables: BUILD_FLAVOUR: validate + TEST_TYPE: slowtest TEST_ENV: "x86_64-linux-deb9-debug" validate-x86_64-linux-deb9-llvm: ===================================== compiler/typecheck/TcTyClsDecls.hs ===================================== @@ -1497,7 +1497,7 @@ tcDefaultAssocDecl fam_tc [dL->L loc (FamEqn { feqn_tycon = L _ tc_name = -- See Note [Type-checking default assoc decls] setSrcSpan loc $ tcAddFamInstCtxt (text "default type instance") tc_name $ - do { traceTc "tcDefaultAssocDecl" (ppr tc_name) + do { traceTc "tcDefaultAssocDecl 1" (ppr tc_name) ; let fam_tc_name = tyConName fam_tc fam_arity = length (tyConVisibleTyVars fam_tc) @@ -1524,14 +1524,46 @@ tcDefaultAssocDecl fam_tc [dL->L loc (FamEqn { feqn_tycon = L _ tc_name imp_vars exp_vars hs_pats hs_rhs_ty - -- See Note [Type-checking default assoc decls] - ; traceTc "tcDefault" (vcat [ppr (tyConTyVars fam_tc), ppr qtvs, ppr pats]) - ; case tcMatchTys pats (mkTyVarTys (tyConTyVars fam_tc)) of - Just subst -> return (Just (substTyUnchecked subst rhs_ty, loc) ) - Nothing -> failWithTc (defaultAssocKindErr fam_tc) - -- We check for well-formedness and validity later, - -- in checkValidClass + ; let fam_tvs = tyConTyVars fam_tc + ; traceTc "tcDefaultAssocDecl 2" (vcat + [ text "fam_tvs" <+> ppr fam_tvs + , text "qtvs" <+> ppr qtvs + , text "pats" <+> ppr pats + , text "rhs_ty" <+> ppr rhs_ty + ]) + ; pat_tvs <- traverse (extract_tv pats rhs_ty) pats + ; let subst = zipTvSubst pat_tvs (mkTyVarTys fam_tvs) + ; pure $ Just (substTyUnchecked subst rhs_ty, loc) + -- We also perform other checks for well-formedness and validity + -- later, in checkValidClass } + where + -- Checks that a pattern on the LHS of a default is a type + -- variable. If so, return the underlying type variable, and if + -- not, throw an error. + -- See Note [Type-checking default assoc decls] + extract_tv :: [Type] -- All default instance type patterns + -- (only used for error message purposes) + -> Type -- The default instance's right-hand side type + -- (only used for error message purposes) + -> Type -- The particular type pattern from which to extract + -- its underlying type variable + -> TcM TyVar + extract_tv pats rhs_ty pat = + case getTyVar_maybe pat of + Just tv -> pure tv + Nothing -> + -- Per Note [Type-checking default assoc decls], we already + -- know by this point that if any arguments in the default + -- instance aren't type variables, then they must be + -- invisible kind arguments. Therefore, always display the + -- error message with -fprint-explicit-kinds enabled. + failWithTc $ pprWithExplicitKindsWhen True $ + hang (text "Illegal argument" <+> quotes (ppr pat) <+> text "in:") + 2 (vcat [ quotes (text "type" <+> ppr (mkTyConApp fam_tc pats) + <+> equals <+> ppr rhs_ty) + , text "The arguments to" <+> quotes (ppr fam_tc) + <+> text "must all be type variables" ]) tcDefaultAssocDecl _ [dL->L _ (XFamEqn _)] = panic "tcDefaultAssocDecl" tcDefaultAssocDecl _ [dL->L _ (FamEqn _ _ _ (XLHsQTyVars _) _ _)] = panic "tcDefaultAssocDecl" @@ -1544,8 +1576,8 @@ tcDefaultAssocDecl _ [_] Consider this default declaration for an associated type class C a where - type F (a :: k) b :: * - type F x y = Proxy x -> y + type F (a :: k) b :: Type + type F (x :: j) y = Proxy x -> y Note that the class variable 'a' doesn't scope over the default assoc decl (rather oddly I think), and (less oddly) neither does the second @@ -1555,17 +1587,26 @@ instance. However we store the default rhs (Proxy x -> y) in F's TyCon, using F's own type variables, so we need to convert it to (Proxy a -> b). -We do this by calling tcMatchTys to match them up. This also ensures -that x's kind matches a's and similarly for y and b. The error -message isn't great, mind you. (#11361 was caused by not doing a -proper tcMatchTys here.) +We do this by creating a substitution [j |-> k, x |-> a, b |-> y] and +applying this substitution to the RHS. + +In order to create this substitution, we must first ensure that all of +the arguments in the default instance consist of type variables. The parser +already checks this to a certain degree (see RdrHsSyn.checkTyVars), but +we must be wary of kind arguments being instantiated, which the parser cannot +catch so easily. Consider this erroneous program (inspired by #11361): -Recall also that the left-hand side of an associated type family -default is always just variables -- no tycons here. Accordingly, -the patterns used in the tcMatchTys won't actually be knot-tied, -even though we're in the knot. This is too delicate for my taste, -but it works. + class C a where + type F (a :: k) b :: Type + type F x b = x +If you squint, you'll notice that the kind of `x` is actually Type. However, +we cannot substitute from [Type |-> k], so we reject this default. + +Since the LHS of an associated type family default is always just variables, +it won't contain any tycons. Accordingly, the patterns used in the substitution +won't actually be knot-tied, even though we're in the knot. This is too +delicate for my taste, but it works. -} {- ********************************************************************* @@ -3849,11 +3890,6 @@ wrongNumberOfParmsErr max_args = text "Number of parameters must match family declaration; expected" <+> ppr max_args -defaultAssocKindErr :: TyCon -> SDoc -defaultAssocKindErr fam_tc - = text "Kind mis-match on LHS of default declaration for" - <+> quotes (ppr fam_tc) - badRoleAnnot :: Name -> Role -> Role -> SDoc badRoleAnnot var annot inferred = hang (text "Role mismatch on variable" <+> ppr var <> colon) ===================================== docs/users_guide/8.8.1-notes.rst ===================================== @@ -212,6 +212,7 @@ for further change information. libraries/libiserv/libiserv.cabal: Internal compiler library libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library + libraries/pretty/pretty.cabal: Dependency of ``ghc`` library libraries/process/process.cabal: Dependency of ``ghc`` library libraries/stm/stm.cabal: Dependency of ``haskeline`` library libraries/template-haskell/template-haskell.cabal: Core library ===================================== libraries/base/tests/T16111.hs ===================================== @@ -0,0 +1,13 @@ +module Main (main) where + +import Data.Bits +import Data.Word + +main :: IO () +main = print $ toInteger (shiftL 1 hm :: Word64) + == toInteger (shiftL 1 hm :: Word64) + +hm :: Int +hm = -1 +{-# NOINLINE hm #-} + ===================================== libraries/base/tests/T16111.stderr ===================================== @@ -0,0 +1,2 @@ +T16111: arithmetic overflow + ===================================== libraries/base/tests/all.T ===================================== @@ -202,7 +202,7 @@ test('T8089', compile_and_run, ['']) test('hWaitForInput-accurate-socket', reqlib('unix'), compile_and_run, ['']) test('T8684', expect_broken(8684), compile_and_run, ['']) -test('hWaitForInput-accurate-stdin', normal, compile_and_run, ['']) +test('hWaitForInput-accurate-stdin', [expect_broken_for(16535, ['threaded1', 'threaded2']), omit_ways(['ghci'])], compile_and_run, ['']) test('hWaitForInput-accurate-pipe', reqlib('unix'), compile_and_run, ['']) test('T9826',normal, compile_and_run,['']) test('T9848', @@ -233,5 +233,6 @@ test('T3474', test('T14425', normal, compile_and_run, ['']) test('T10412', normal, compile_and_run, ['']) test('T13896', normal, compile_and_run, ['']) -test('T13167', normal, compile_and_run, ['']) +test('T13167', fragile_for(16536, ['threaded2']), compile_and_run, ['']) test('T15349', [exit_code(1), expect_broken_for(15349, 'ghci')], compile_and_run, ['']) +test('T16111', exit_code(1), compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/all.T ===================================== @@ -7,5 +7,5 @@ test('heap_all', ], compile_and_run, ['']) test('closure_size', - omit_ways(['ghci', 'hpc', 'prof']), + [omit_ways(['ghci', 'hpc', 'prof']), expect_broken_for(16544, ['dyn', 'optasm', 'threaded2'])], compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/closure_size.hs ===================================== @@ -1,6 +1,7 @@ {-# LANGUAGE BangPatterns #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE MagicHash #-} import Control.Monad import Type.Reflection @@ -17,12 +18,17 @@ assertSize !x expected = do putStrLn $ prettyCallStack callStack {-# NOINLINE assertSize #-} -pap :: Int -> Char -> Int +pap :: Int -> Maybe Char -> Int pap x _ = x {-# NOINLINE pap #-} main :: IO () main = do + -- Ensure that GHC can't turn PAP into a FUN (see #16531) + let x :: Int + x = 42 + {-# NOINLINE x #-} + assertSize 'a' 2 assertSize (Just ()) 2 assertSize (Nothing :: Maybe ()) 2 @@ -30,5 +36,5 @@ main = do assertSize ((1,2,3) :: (Int,Int,Int)) 4 assertSize (id :: Int -> Int) 1 assertSize (fst :: (Int,Int) -> Int) 1 - assertSize (pap 1) 2 + assertSize (pap x) 2 ===================================== libraries/process ===================================== @@ -1 +1 @@ -Subproject commit d860209e53c1b40b7c251fc8378886bbcb394402 +Subproject commit 5c81524694ceaf8523a1846718a7a7c3f402124f ===================================== libraries/unix ===================================== @@ -1 +1 @@ -Subproject commit a102df29c107e8f853129dd40fbbb487e1818149 +Subproject commit a67b5e9e70a2a0c438d4f186fda9f38409ff1435 ===================================== rts/StgCRun.c ===================================== @@ -494,15 +494,15 @@ StgRunIsImplementedInAssembler(void) "movq 48(%%rsp),%%rdi\n\t" "movq 56(%%rsp),%%rsi\n\t" "movq 64(%%rsp),%%xmm6\n\t" - "movq 72(%%rax),%%xmm7\n\t" - "movq 80(%%rax),%%xmm8\n\t" - "movq 88(%%rax),%%xmm9\n\t" - "movq 96(%%rax),%%xmm10\n\t" - "movq 104(%%rax),%%xmm11\n\t" - "movq 112(%%rax),%%xmm12\n\t" - "movq 120(%%rax),%%xmm13\n\t" - "movq 128(%%rax),%%xmm14\n\t" - "movq 136(%%rax),%%xmm15\n\t" + "movq 72(%%rsp),%%xmm7\n\t" + "movq 80(%%rsp),%%xmm8\n\t" + "movq 88(%%rsp),%%xmm9\n\t" + "movq 96(%%rsp),%%xmm10\n\t" + "movq 104(%%rsp),%%xmm11\n\t" + "movq 112(%%rsp),%%xmm12\n\t" + "movq 120(%%rsp),%%xmm13\n\t" + "movq 128(%%rsp),%%xmm14\n\t" + "movq 136(%%rsp),%%xmm15\n\t" #endif "addq %1, %%rsp\n\t" "retq" ===================================== testsuite/driver/testlib.py ===================================== @@ -258,14 +258,14 @@ def fragile( bug ): return helper -def fragile_for( name, opts, bug, ways ): +def fragile_for( bug, ways ): """ Indicates that the test should be skipped due to fragility in the given test ways as documented in the given ticket. """ def helper( name, opts, bug=bug, ways=ways ): record_broken(name, opts, bug) - opts.omit_ways = ways + opts.omit_ways += ways return helper @@ -275,7 +275,8 @@ def omit_ways( ways ): return lambda name, opts, w=ways: _omit_ways( name, opts, w ) def _omit_ways( name, opts, ways ): - opts.omit_ways = ways + assert ways.__class__ is list + opts.omit_ways += ways # ----- @@ -1408,7 +1409,6 @@ def simple_run(name, way, prog, extra_run_opts): return failBecause('bad stderr') if not (opts.ignore_stdout or stdout_ok(name, way)): return failBecause('bad stdout') - check_hp = '-h' in my_rts_flags and opts.check_hp check_prof = '-p' in my_rts_flags ===================================== testsuite/tests/concurrent/should_run/all.T ===================================== @@ -239,7 +239,7 @@ test('conc067', ignore_stdout, compile_and_run, ['']) # omit threaded2, the behaviour of this test is non-deterministic with more # than one CPU. -test('conc068', [ omit_ways('threaded2'), exit_code(1) ], compile_and_run, ['']) +test('conc068', [ omit_ways(['threaded2']), exit_code(1) ], compile_and_run, ['']) test('setnumcapabilities001', [ only_ways(['threaded1','threaded2']), ===================================== testsuite/tests/dependent/should_compile/all.T ===================================== @@ -40,7 +40,7 @@ test('T12742', normal, compile, ['']) # (1) Use -fexternal-interpreter, or # (2) Build the program twice: once with -dynamic, and then # with -prof using -osuf to set a different object file suffix. -test('T13910', omit_ways(['profasm']), compile, ['']) +test('T13910', [expect_broken_for(16537, ['optasm']), omit_ways(['profasm'])], compile, ['']) test('T13938', [extra_files(['T13938a.hs'])], makefile_test, ['T13938']) test('T14556', normal, compile, ['']) test('T14720', normal, compile, ['']) ===================================== testsuite/tests/indexed-types/should_compile/T11361a.stderr ===================================== @@ -1,5 +1,7 @@ T11361a.hs:7:3: error: - • Kind mis-match on LHS of default declaration for ‘F’ + • Illegal argument ‘*’ in: + ‘type F @* x = x’ + The arguments to ‘F’ must all be type variables • In the default type instance declaration for ‘F’ In the class declaration for ‘C’ ===================================== testsuite/tests/indexed-types/should_fail/T13971.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE PolyKinds #-} +{-# LANGUAGE TypeFamilies #-} +module T13971 where + +class C a where + type T a :: k + type T a = Int ===================================== testsuite/tests/indexed-types/should_fail/T13971.stderr ===================================== @@ -0,0 +1,7 @@ + +T13971.hs:7:3: error: + • Illegal argument ‘*’ in: + ‘type T @{k} @* a = Int’ + The arguments to ‘T’ must all be type variables + • In the default type instance declaration for ‘T’ + In the class declaration for ‘C’ ===================================== testsuite/tests/indexed-types/should_fail/all.T ===================================== @@ -136,6 +136,7 @@ test('T13271', normal, compile_fail, ['']) test('T13674', normal, compile_fail, ['']) test('T13784', normal, compile_fail, ['']) test('T13877', normal, compile_fail, ['']) +test('T13971', normal, compile_fail, ['']) test('T13972', normal, compile, ['']) test('T14033', normal, compile_fail, ['']) test('T14045a', normal, compile, ['']) ===================================== testsuite/tests/lib/integer/all.T ===================================== @@ -1,8 +1,8 @@ test('integerBits', normal, compile_and_run, ['']) test('integerConversions', normal, compile_and_run, ['']) # skip ghci as it doesn't support unboxed tuples -test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways('ghci')], compile_and_run, ['']) -test('plusMinusInteger', [omit_ways('ghci')], compile_and_run, ['']) +test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways(['ghci'])], compile_and_run, ['']) +test('plusMinusInteger', [omit_ways(['ghci'])], compile_and_run, ['']) test('integerConstantFolding', normal, makefile_test, ['integerConstantFolding']) test('fromToInteger', [], makefile_test, ['fromToInteger']) test('IntegerConversionRules', [], makefile_test, ['IntegerConversionRules']) ===================================== testsuite/tests/primops/should_run/all.T ===================================== @@ -20,7 +20,9 @@ test('CmpInt8', normal, compile_and_run, ['']) test('CmpWord8', normal, compile_and_run, ['']) test('ShowPrim', normal, compile_and_run, ['']) -test('ArithInt16', normal, compile_and_run, ['']) -test('ArithWord16', normal, compile_and_run, ['']) +# These two tests use unboxed tuples, which GHCi doesn't support +test('ArithInt16', omit_ways(['ghci']), compile_and_run, ['']) +test('ArithWord16', omit_ways(['ghci']), compile_and_run, ['']) + test('CmpInt16', normal, compile_and_run, ['']) -test('CmpWord16', normal, compile_and_run, ['']) \ No newline at end of file +test('CmpWord16', normal, compile_and_run, ['']) ===================================== testsuite/tests/programs/barton-mangler-bug/test.T ===================================== @@ -8,7 +8,7 @@ test('barton-mangler-bug', 'Plot.hi', 'Plot.o', 'PlotExample.hi', 'PlotExample.o', 'TypesettingTricks.hi', 'TypesettingTricks.o']), - omit_ways('debug') # Fails for debug way due to annotation linting timeout + omit_ways(['debug']) # Fails for debug way due to annotation linting timeout ], multimod_compile_and_run, ['Main', '']) ===================================== testsuite/tests/rts/all.T ===================================== @@ -73,7 +73,7 @@ test('T2047', [ignore_stdout, extra_run_opts('+RTS -c -RTS')], # Blackhole-detection test. # Skip GHCi due to #2786 test('T2783', [ omit_ways(['ghci']), exit_code(1) - , expect_broken_for(2783, ['threaded1']) + , fragile_for(2783, ['threaded1']) ], compile_and_run, ['']) # Test the work-stealing deque implementation. We run this test in @@ -93,7 +93,7 @@ test('stack002', [extra_files(['stack001.hs']), # run this test with very small stack chunks to exercise the stack # overflow/underflow machinery. -test('stack003', [ omit_ways('ghci'), # uses unboxed tuples +test('stack003', [ omit_ways(['ghci']), # uses unboxed tuples extra_run_opts('500000 +RTS -kc1k -kb100 -K96m -RTS') ], compile_and_run, ['']) @@ -272,7 +272,8 @@ test('T7815', [ multi_cpu_race, test('ffishutdown', [ignore_stderr, only_ways(['threaded1','threaded2'])], compile_and_run, ['']) -test('T7919', [when(fast(), skip), omit_ways(prof_ways)], compile_and_run, +# Times out in ghci way +test('T7919', [when(fast(), skip), omit_ways(['ghci'] + prof_ways)], compile_and_run, [config.ghc_th_way_flags]) test('T8035', normal, compile_and_run, ['']) ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -275,7 +275,7 @@ test('T14140', normal, makefile_test, ['T14140']) -test('T14272', normal, compile, ['']) +test('T14272', expect_broken_for(16539, ['optasm']), compile, ['']) test('T14270a', normal, compile, ['']) test('T14152', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile, ['-ddump-simpl']) test('T14152a', [extra_files(['T14152.hs']), pre_cmd('cp T14152.hs T14152a.hs'), ===================================== testsuite/tests/th/all.T ===================================== @@ -467,7 +467,7 @@ test('T15437', expect_broken(15437), multimod_compile, test('T15985', normal, compile, ['']) test('T16133', normal, compile_fail, ['']) test('T15471', normal, multimod_compile, ['T15471.hs', '-v0']) -test('T16180', normal, compile_and_run, ['-package ghc']) +test('T16180', expect_broken_for(16541, ['ext-interp', 'ghci']), compile_and_run, ['-package ghc']) test('T16183', normal, compile, ['-v0 -ddump-splices -dsuppress-uniques']) test('T16195', normal, multimod_compile, ['T16195.hs', '-v0']) test('T16293b', normal, compile, ['']) ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -659,7 +659,7 @@ test('T15586', normal, compile, ['']) test('T15368', normal, compile, ['-fdefer-type-errors']) test('T15645', normal, compile, ['-Wwarn=missing-monadfail-instances']) test('T15778', normal, compile, ['']) -test('T14761c', normal, compile, ['']) +test('T14761c', expect_broken_for(16540, ['hpc', 'optasm']), compile, ['']) test('T16008', normal, compile, ['']) test('T16033', normal, compile, ['']) test('T16141', normal, compile, ['-O']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -35,7 +35,7 @@ test('tcrun018', normal, compile_and_run, ['']) test('tcrun019', normal, compile_and_run, ['']) test('tcrun020', normal, compile_and_run, ['']) test('tcrun021', normal, compile_and_run, ['-package containers']) -test('tcrun022', omit_ways(['ghci']), compile_and_run, ['-O']) +test('tcrun022', omit_ways(['hpc', 'ghci']), compile_and_run, ['-O']) test('tcrun023', [when(doing_ghci(), extra_hc_opts('-fobject-code'))], compile_and_run, ['-O']) test('tcrun024', normal, compile_and_run, ['-O']) @@ -96,7 +96,7 @@ test('T6117', normal, compile_and_run, ['']) test('T5751', normal, compile_and_run, ['']) test('T5913', normal, compile_and_run, ['']) test('T7748', normal, compile_and_run, ['']) -test('T7861', [omit_ways('debug'), exit_code(1)], compile_and_run, ['']) +test('T7861', [omit_ways(['debug']), exit_code(1)], compile_and_run, ['']) test('TcTypeNatSimpleRun', normal, compile_and_run, ['']) test('TcTypeSymbolSimpleRun', normal, compile_and_run, ['']) test('T8119', normal, ghci_script, ['T8119.script']) ===================================== testsuite/tests/utils/should_run/all.T ===================================== @@ -1,6 +1,6 @@ test('T14854', [only_ways(threaded_ways), - omit_ways('ghci'), + omit_ways(['ghci']), reqlib('random'), ignore_stderr], compile_and_run, ===================================== testsuite/tests/warnings/should_compile/all.T ===================================== @@ -25,4 +25,4 @@ test('MissingMod', normal, multimod_compile, ['MissingMod', '-Wmissing-home-modu test('StarBinder', normal, compile, ['']) -test('Overflow', normal, compile, ['']) +test('Overflow', expect_broken_for(16543, ['hpc']), compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/93eb36bb102541c93f6d8c7d71d0b075bc83eb5a...1ac79339a782c850a5affdd75d56b192d1565414 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/93eb36bb102541c93f6d8c7d71d0b075bc83eb5a...1ac79339a782c850a5affdd75d56b192d1565414 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 19:22:04 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 15:22:04 -0400 Subject: [Git][ghc/ghc][master] Fix #16282. Message-ID: <5caa4ddca7b64_62b33fa2eafe01c421674d3@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 5 changed files: - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - + testsuite/tests/warnings/should_compile/T16282/T16282.hs - + testsuite/tests/warnings/should_compile/T16282/T16282.stderr - + testsuite/tests/warnings/should_compile/T16282/all.T Changes: ===================================== compiler/simplCore/CoreMonad.hs ===================================== @@ -778,8 +778,8 @@ we aren't using annotations heavily. ************************************************************************ -} -msg :: Severity -> SDoc -> CoreM () -msg sev doc +msg :: Severity -> WarnReason -> SDoc -> CoreM () +msg sev reason doc = do { dflags <- getDynFlags ; loc <- getSrcSpanM ; unqual <- getPrintUnqualified @@ -791,7 +791,7 @@ msg sev doc err_sty = mkErrStyle dflags unqual user_sty = mkUserStyle dflags unqual AllTheWay dump_sty = mkDumpStyle dflags unqual - ; liftIO $ putLogMsg dflags NoReason sev loc sty doc } + ; liftIO $ putLogMsg dflags reason sev loc sty doc } -- | Output a String message to the screen putMsgS :: String -> CoreM () @@ -799,7 +799,7 @@ putMsgS = putMsg . text -- | Output a message to the screen putMsg :: SDoc -> CoreM () -putMsg = msg SevInfo +putMsg = msg SevInfo NoReason -- | Output an error to the screen. Does not cause the compiler to die. errorMsgS :: String -> CoreM () @@ -807,9 +807,9 @@ errorMsgS = errorMsg . text -- | Output an error to the screen. Does not cause the compiler to die. errorMsg :: SDoc -> CoreM () -errorMsg = msg SevError +errorMsg = msg SevError NoReason -warnMsg :: SDoc -> CoreM () +warnMsg :: WarnReason -> SDoc -> CoreM () warnMsg = msg SevWarning -- | Output a fatal error to the screen. Does not cause the compiler to die. @@ -818,7 +818,7 @@ fatalErrorMsgS = fatalErrorMsg . text -- | Output a fatal error to the screen. Does not cause the compiler to die. fatalErrorMsg :: SDoc -> CoreM () -fatalErrorMsg = msg SevFatal +fatalErrorMsg = msg SevFatal NoReason -- | Output a string debugging message at verbosity level of @-v@ or higher debugTraceMsgS :: String -> CoreM () @@ -826,7 +826,7 @@ debugTraceMsgS = debugTraceMsg . text -- | Outputs a debugging message at verbosity level of @-v@ or higher debugTraceMsg :: SDoc -> CoreM () -debugTraceMsg = msg SevDump +debugTraceMsg = msg SevDump NoReason -- | Show some labelled 'SDoc' if a particular flag is set or at a verbosity level of @-v -ddump-most@ or higher dumpIfSet_dyn :: DumpFlag -> String -> SDoc -> CoreM () ===================================== compiler/specialise/Specialise.hs ===================================== @@ -730,28 +730,35 @@ specImport dflags this_mod top_env done callers rb fn calls_for_fn ; return (rules2 ++ rules1, final_binds) } - | warnMissingSpecs dflags callers - = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn)) - 2 (vcat [ text "when specialising" <+> quotes (ppr caller) - | caller <- callers]) - , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn)) - , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ]) - ; return ([], []) } + | otherwise = do { tryWarnMissingSpecs dflags callers fn calls_for_fn + ; return ([], [])} - | otherwise - = return ([], []) where unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers -warnMissingSpecs :: DynFlags -> [Id] -> Bool +-- | Returns whether or not to show a missed-spec warning. +-- If -Wall-missed-specializations is on, show the warning. +-- Otherwise, if -Wmissed-specializations is on, only show a warning +-- if there is at least one imported function being specialized, +-- and if all imported functions are marked with an inline pragma +-- Use the most specific warning as the reason. +tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM () -- See Note [Warning about missed specialisations] -warnMissingSpecs dflags callers - | wopt Opt_WarnAllMissedSpecs dflags = True - | not (wopt Opt_WarnMissedSpecs dflags) = False - | null callers = False - | otherwise = all has_inline_prag callers +tryWarnMissingSpecs dflags callers fn calls_for_fn + | wopt Opt_WarnMissedSpecs dflags + && not (null callers) + && allCallersInlined = doWarn $ Reason Opt_WarnMissedSpecs + | wopt Opt_WarnAllMissedSpecs dflags = doWarn $ Reason Opt_WarnAllMissedSpecs + | otherwise = return () where - has_inline_prag id = isAnyInlinePragma (idInlinePragma id) + allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers + doWarn reason = + warnMsg reason + (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn)) + 2 (vcat [ text "when specialising" <+> quotes (ppr caller) + | caller <- callers]) + , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn)) + , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ]) wantSpecImport :: DynFlags -> Unfolding -> Bool -- See Note [Specialise imported INLINABLE things] ===================================== testsuite/tests/warnings/should_compile/T16282/T16282.hs ===================================== @@ -0,0 +1,14 @@ +import Data.Map + +-- If someone improves the specializer so that +-- GHC no longer misses the specialization below, +-- then this test will fail, as it expects a warning +-- to be issued. +-- Another reason this could fail is due to spelling: +-- the test checks for the "specialisation" spelling, +-- but due to changes in how the warnings are listed in DynFalgs.hs +-- the compiler may spit out the "specialization" spelling. +main :: IO () +main = do + let m = [] :: [Map String Bool] + mapM_ print m ===================================== testsuite/tests/warnings/should_compile/T16282/T16282.stderr ===================================== @@ -0,0 +1,5 @@ + +T16282.hs: warning: [-Wall-missed-specialisations] + Could not specialise imported function ‘Data.Map.Internal.$w$cshowsPrec’ + when specialising ‘Data.Map.Internal.$fShowMap_$cshowsPrec’ + Probable fix: add INLINABLE pragma on ‘Data.Map.Internal.$w$cshowsPrec’ ===================================== testsuite/tests/warnings/should_compile/T16282/all.T ===================================== @@ -0,0 +1 @@ +test('T16282', normal, compile, ['-O2 -Wall-missed-specialisations']) \ No newline at end of file View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/3a38ea4487173f0f8e3693a75d1c5c7d33f12f05 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/3a38ea4487173f0f8e3693a75d1c5c7d33f12f05 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 19:24:58 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 15:24:58 -0400 Subject: [Git][ghc/ghc][wip/lint-check-version-number] gitlab-ci: Ensure that version number has three components Message-ID: <5caa4e8a6fe87_62b33fa2c84a275c21695a0@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/lint-check-version-number at Glasgow Haskell Compiler / GHC Commits: a3ef3478 by Ben Gamari at 2019-04-07T19:24:50Z gitlab-ci: Ensure that version number has three components - - - - - 2 changed files: - .gitlab-ci.yml - configure.ac Changes: ===================================== .gitlab-ci.yml ===================================== @@ -55,6 +55,9 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - | + egrep -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) dependencies: [] tags: - lint ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} @@ -386,6 +386,20 @@ then fp_prog_ar="${mingwbin}ar.exe" AC_PATH_PROG([Genlib],[genlib]) + + # NB. Download the perl binaries if required + if ! test -d inplace/perl || + test inplace/perl -ot ghc-tarballs/perl/ghc-perl*.tar.gz + then + AC_MSG_NOTICE([Making in-tree perl tree]) + rm -rf inplace/perl + mkdir inplace/perl + ( + cd inplace/perl && + tar -zxf ../../ghc-tarballs/perl/ghc-perl*.tar.gz + ) + AC_MSG_NOTICE([In-tree perl tree created]) + fi fi # We don't want to bundle a MinGW-w64 toolchain @@ -529,7 +543,6 @@ FP_SET_CFLAGS_C99([CC],[CONF_CC_OPTS_STAGE2],[CONF_CPP_OPTS_STAGE2]) dnl ** Which ld to use dnl -------------------------------------------------------------- -AC_ARG_VAR(LD,[Use as the path to ld. See also --disable-ld-override.]) FIND_LD([$target],[GccUseLdOpt]) CONF_GCC_LINKER_OPTS_STAGE1="$CONF_GCC_LINKER_OPTS_STAGE1 $GccUseLdOpt" CONF_GCC_LINKER_OPTS_STAGE2="$CONF_GCC_LINKER_OPTS_STAGE2 $GccUseLdOpt" @@ -559,14 +572,14 @@ if test "$TargetOS_CPP" = "darwin" then AC_MSG_CHECKING(whether nm program is broken) # Some versions of XCode ship a broken version of `nm`. Detect and work - # around this issue. See : https://gitlab.haskell.org/ghc/ghc/issues/11744 + # around this issue. See : https://ghc.haskell.org/trac/ghc/ticket/11744 nmver=$(${NM} --version | grep version | sed 's/ //g') case "$nmver" in LLVMversion7.3.0|LLVMversion7.3.1) AC_MSG_RESULT(yes) echo "The detected nm program is broken." echo - echo "See: https://gitlab.haskell.org/ghc/ghc/issues/11744" + echo "See: https://ghc.haskell.org/trac/ghc/ticket/11744" echo echo "Try re-running configure with:" echo @@ -635,21 +648,18 @@ AC_DEFINE_UNQUOTED([sUPPORTED_LLVM_VERSION], ${sUPPORTED_LLVM_VERSION}, [The sup dnl ** Which LLVM clang to use? dnl -------------------------------------------------------------- -AC_ARG_VAR(CLANG,[Use as the path to clang [default=autodetect]]) AC_CHECK_TARGET_TOOL([CLANG], [clang]) ClangCmd="$CLANG" AC_SUBST([ClangCmd]) dnl ** Which LLVM llc to use? dnl -------------------------------------------------------------- -AC_ARG_VAR(LLC,[Use as the path to LLVM's llc [default=autodetect]]) FIND_LLVM_PROG([LLC], [llc], [$LlvmVersion]) LlcCmd="$LLC" AC_SUBST([LlcCmd]) dnl ** Which LLVM opt to use? dnl -------------------------------------------------------------- -AC_ARG_VAR(OPT,[Use as the path to LLVM's opt [default=autodetect]]) FIND_LLVM_PROG([OPT], [opt], [$LlvmVersion]) OptCmd="$OPT" AC_SUBST([OptCmd]) @@ -683,6 +693,36 @@ AC_SUBST([GhcLibsWithUnix]) dnl ** does #! work? AC_SYS_INTERPRETER() +# Check for split-objs +SplitObjsBroken=NO +dnl ** look for `perl' +case $HostOS_CPP in +cygwin32|mingw32) + if test "$EnableDistroToolchain" = "NO"; then + PerlCmd=$hardtop/inplace/perl/perl + else + AC_PATH_PROG([PerlCmd],[perl]) + fi + # because of Trac #15051 SplitObjs is useless on Windows. It regresses + # build times to days for a build, and this effect is also there for end users + # of GHC. So unfortunately we have to disable it, even without having + # split-sections. Though the compile time hit for split-sections should be + # tiny compared to this so maybe we should enable it for x86_64. + SplitObjsBroken=YES + ;; +*) + AC_PATH_PROG([PerlCmd],[perl]) + if test -z "$PerlCmd" + then + AC_MSG_WARN([No Perl on PATH, disabling split object support]) + SplitObjsBroken=YES + else + FPTOOLS_CHECK_PERL_VERSION + fi + ;; +esac +AC_SUBST([SplitObjsBroken]) + dnl ** look for GCC and find out which version dnl Figure out which C compiler to use. Gcc is preferred. dnl If gcc, make sure it's at least 3.0 @@ -774,9 +814,6 @@ dnl ** check for patch dnl if GNU patch is named gpatch, look for it first AC_PATH_PROGS(PatchCmd,gpatch patch, patch) -dnl ** check for autoreconf -AC_PATH_PROG(AutoreconfCmd, autoreconf, autoreconf) - dnl ** check for dtrace (currently only implemented for Mac OS X) AC_ARG_ENABLE(dtrace, [AC_HELP_STRING([--enable-dtrace], @@ -937,7 +974,7 @@ dnl ** check for math library dnl Keep that check as early as possible. dnl as we need to know whether we need libm dnl for math functions or not -dnl (see https://gitlab.haskell.org/ghc/ghc/issues/3730) +dnl (see http://ghc.haskell.org/trac/ghc/ticket/3730) AC_CHECK_LIB(m, atan, HaveLibM=YES, HaveLibM=NO) if test $HaveLibM = YES then @@ -1292,7 +1329,7 @@ checkMake380() { checkMake380 make checkMake380 gmake -AC_CONFIG_FILES([mk/config.mk mk/install.mk mk/project.mk rts/rts.cabal compiler/ghc.cabal ghc/ghc-bin.cabal utils/iserv/iserv.cabal utils/iserv-proxy/iserv-proxy.cabal utils/remote-iserv/remote-iserv.cabal utils/runghc/runghc.cabal utils/gen-dll/gen-dll.cabal libraries/ghc-boot/ghc-boot.cabal libraries/ghc-boot-th/ghc-boot-th.cabal libraries/ghci/ghci.cabal libraries/ghc-heap/ghc-heap.cabal libraries/libiserv/libiserv.cabal libraries/template-haskell/template-haskell.cabal settings docs/users_guide/ghc_config.py docs/index.html libraries/prologue.txt distrib/configure.ac]) +AC_CONFIG_FILES([mk/config.mk mk/install.mk mk/project.mk rts/rts.cabal compiler/ghc.cabal ghc/ghc-bin.cabal utils/iserv/iserv.cabal utils/iserv-proxy/iserv-proxy.cabal utils/runghc/runghc.cabal utils/gen-dll/gen-dll.cabal libraries/ghc-boot/ghc-boot.cabal libraries/ghc-boot-th/ghc-boot-th.cabal libraries/ghci/ghci.cabal libraries/ghc-heap/ghc-heap.cabal libraries/libiserv/libiserv.cabal settings docs/users_guide/ghc_config.py docs/index.html libraries/prologue.txt distrib/configure.ac]) AC_OUTPUT [ if test "$print_make_warning" = "true"; then @@ -1348,6 +1385,7 @@ echo "\ genlib : $GenlibCmd Happy : $HappyCmd ($HappyVersion) Alex : $AlexCmd ($AlexVersion) + Perl : $PerlCmd sphinx-build : $SPHINXBUILD xelatex : $XELATEX @@ -1380,5 +1418,5 @@ To make changes to the default build configuration, copy the file mk/build.mk.sample to mk/build.mk, and edit the settings in there. For more information on how to configure your GHC build, see - https://gitlab.haskell.org/ghc/ghc/wikis/building + http://ghc.haskell.org/trac/ghc/wiki/Building "] View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a3ef3478189170c08e8371c600b328cf370d84d0 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a3ef3478189170c08e8371c600b328cf370d84d0 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 19:40:23 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 15:40:23 -0400 Subject: [Git][ghc/ghc][ghc-8.6] 4 commits: Add release notes for 8.6.5 Message-ID: <5caa522710446_62b33fa2ec39dd4c2176992@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.6 at Glasgow Haskell Compiler / GHC Commits: 7d3040ac by Ben Gamari at 2019-04-06T16:32:53Z Add release notes for 8.6.5 - - - - - f6cd3ae8 by Ben Gamari at 2019-04-06T16:35:41Z users-guide: Add missing libraries to release notes library list - - - - - d6c93748 by Ben Gamari at 2019-04-07T19:35:51Z users-guide: Fix version number reference - - - - - b9001408 by Ben Gamari at 2019-04-07T19:39:32Z users-guide: Mention fix to #16514 - - - - - 2 changed files: - + 8.6.5-notes.rst - docs/users_guide/index.rst Changes: ===================================== 8.6.5-notes.rst ===================================== @@ -0,0 +1,76 @@ +.. _release-8-6-4: + +Release notes for version 8.6.5 +=============================== + +GHC 8.6.5 is a bug-fix release, fixing a few regressions found in 8.6.4. + + +Highlights +---------- + +The highlights, since the 8.6.4 release, are: + +- Binary distributions once again ship with Haddock documentation including + syntax highlighted source of core libraries (:ghc-ticket:`16445`) + +- A build system issue where use of GCC with ``-flto`` broke ``configure`` + was fixed (:ghc-ticket:`16440`) + +- An bug affecting Windows platforms wherein XMM register values could be + mangled when entering STG has been fixed (:ghc-ticket:`16514`) + +- Several packaging issues with the Windows binary distributions have been resolved. + (:ghc-ticket:`16408`, :ghc-ticket:`16398`, :ghc-ticket:`16516`). + +Known issues +------------ + +Note that the LLVM code generator (:ghc-flag:`-fllvm`) in GHC 8.6, as well as +all earlier releases, are affected by :ghc-ticket:`14251`, which can result in +miscompilation of some programs calling functions with unboxed floating-point +arguments. While originally scheduled to be fixed for this release, the fix +ended up being more difficult than anticipated and, given that issue is not a +strict regression from 8.4, we decided to proceed with the release. + +Included libraries +------------------ + +The package database provided with this distribution also contains a number of +packages other than GHC itself. See the changelogs provided with these packages +for further change information. + +.. ghc-package-list:: + + libraries/array/array.cabal: Dependency of ``ghc`` library + libraries/base/base.cabal: Core library + libraries/binary/binary.cabal: Dependency of ``ghc`` library + libraries/bytestring/bytestring.cabal: Deppendency of ``ghc`` library + libraries/Cabal/Cabal/Cabal.cabal: Dependency of ``ghc-pkg`` utility + libraries/containers/containers.cabal: Dependency of ``ghc`` library + libraries/deepseq/deepseq.cabal: Dependency of ``ghc`` library + libraries/directory/directory.cabal: Dependency of ``ghc`` library + libraries/filepath/filepath.cabal: Dependency of ``ghc`` library + compiler/ghc.cabal: The compiler itself + libraries/ghci/ghci.cabal: The REPL interface + libraries/ghc-boot/ghc-boot.cabal: Internal compiler library + libraries/ghc-boot-th/ghc-boot-th.cabal: Internal compiler library + libraries/ghc-compact/ghc-compact.cabal: Core library + libraries/ghc-heap/ghc-heap.cabal: GHC heap-walking library + libraries/ghc-prim/ghc-prim.cabal: Core library + libraries/haskeline/haskeline.cabal: Dependency of ``ghci`` executable + libraries/hpc/hpc.cabal: Dependency of ``hpc`` executable + libraries/integer-gmp/integer-gmp.cabal: Core library + libraries/libiserv/libiserv.cabal: Internal compiler library + libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library + libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library + libraries/process/process.cabal: Dependency of ``ghc`` library + libraries/stm/stm.cabal: Dependency of ``haskeline`` library + libraries/template-haskell/template-haskell.cabal: Core library + libraries/terminfo/terminfo.cabal: Dependency of ``haskeline`` library + libraries/text/text.cabal: Dependency of ``Cabal`` library + libraries/time/time.cabal: Dependency of ``ghc`` library + libraries/transformers/transformers.cabal: Dependency of ``ghc`` library + libraries/unix/unix.cabal: Dependency of ``ghc`` library + libraries/Win32/Win32.cabal: Dependency of ``ghc`` library + libraries/xhtml/xhtml.cabal: Dependency of ``haddock`` executable ===================================== docs/users_guide/index.rst ===================================== @@ -16,6 +16,7 @@ Contents: 8.6.2-notes 8.6.3-notes 8.6.4-notes + 8.6.5-notes ghci runghc usage View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d2a284ab461681919cadaed394adebe42c4cc7bb...b9001408d5ef2564a9fcb847d0ec9159b09e68c1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d2a284ab461681919cadaed394adebe42c4cc7bb...b9001408d5ef2564a9fcb847d0ec9159b09e68c1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 20:45:13 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 16:45:13 -0400 Subject: [Git][ghc/ghc][wip/lint-check-version-number] gitlab-ci: Ensure that version number has three components Message-ID: <5caa61597d66b_62b33fa2eef0a9942186086@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/lint-check-version-number at Glasgow Haskell Compiler / GHC Commits: 1dd72c10 by Ben Gamari at 2019-04-07T20:45:01Z gitlab-ci: Ensure that version number has three components - - - - - 2 changed files: - .gitlab-ci.yml - configure.ac Changes: ===================================== .gitlab-ci.yml ===================================== @@ -55,6 +55,9 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - | + egrep -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) dependencies: [] tags: - lint ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1dd72c10df80c6ea7b68f69d4c4d2ff030331fdb -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1dd72c10df80c6ea7b68f69d4c4d2ff030331fdb You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 7 20:48:06 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 16:48:06 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 20 commits: Fix #16282. Message-ID: <5caa6206ad22f_62b33fa2f02336c42187328@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 0444686f by Ben Gamari at 2019-04-07T20:47:56Z gitlab-ci: Test using slowtest in deb9-debug job - - - - - 5c6bc7e6 by Ben Gamari at 2019-04-07T20:47:56Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - 659fda69 by Ömer Sinan Ağacan at 2019-04-07T20:47:56Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 5848c290 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Make closureSize less sensitive to optimisation - - - - - d94924fa by Ben Gamari at 2019-04-07T20:47:57Z process: Bump submodule * Skip process005 in ghci way * Mark process002 as fragile in threaded2 - - - - - 80d08448 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark T13167 as fragile in threaded2 As noted in #16536. - - - - - 2954a91e by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - ca847286 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark T14272 as broken in optasm - - - - - a36acf15 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - 2b546d80 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark T16180 as broken in ghci and ext-interp ways As noted in #16541. - - - - - 4eac3fd9 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - a52b8edf by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - b18ceb7e by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - c9a1412c by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - 109fd2b3 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Skip T7919 in ghci way It times out pretty reliably. It's not clear that much is gained by running this test in the ghci way anyways. - - - - - 6c7dc902 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Fix fragile_for test modifier - - - - - e3ed757a by Ben Gamari at 2019-04-07T20:47:57Z users-guide: Add pretty to package list - - - - - 9a685b82 by Ben Gamari at 2019-04-07T20:47:57Z Bump unix submodule Marks posix002 as fragile in threaded2 way due to #16550. - - - - - d7106e24 by Ben Gamari at 2019-04-07T20:47:57Z testsuite: Fix omit_ways usage omit_ways expects a list but this was broken in several cases. - - - - - 25 changed files: - .gitlab-ci.yml - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - docs/users_guide/8.8.1-notes.rst - libraries/base/tests/all.T - libraries/ghc-heap/tests/all.T - libraries/ghc-heap/tests/closure_size.hs - libraries/process - libraries/unix - testsuite/driver/testlib.py - testsuite/tests/concurrent/should_run/all.T - testsuite/tests/dependent/should_compile/all.T - testsuite/tests/lib/integer/all.T - testsuite/tests/primops/should_run/all.T - testsuite/tests/programs/barton-mangler-bug/test.T - testsuite/tests/rts/all.T - testsuite/tests/simplCore/should_compile/all.T - testsuite/tests/th/all.T - testsuite/tests/typecheck/should_compile/all.T - testsuite/tests/typecheck/should_run/all.T - testsuite/tests/utils/should_run/all.T - + testsuite/tests/warnings/should_compile/T16282/T16282.hs - + testsuite/tests/warnings/should_compile/T16282/T16282.stderr - + testsuite/tests/warnings/should_compile/T16282/all.T - testsuite/tests/warnings/should_compile/all.T Changes: ===================================== .gitlab-ci.yml ===================================== @@ -387,6 +387,7 @@ validate-x86_64-linux-deb9-debug: stage: build variables: BUILD_FLAVOUR: validate + TEST_TYPE: slowtest TEST_ENV: "x86_64-linux-deb9-debug" validate-x86_64-linux-deb9-llvm: ===================================== compiler/simplCore/CoreMonad.hs ===================================== @@ -778,8 +778,8 @@ we aren't using annotations heavily. ************************************************************************ -} -msg :: Severity -> SDoc -> CoreM () -msg sev doc +msg :: Severity -> WarnReason -> SDoc -> CoreM () +msg sev reason doc = do { dflags <- getDynFlags ; loc <- getSrcSpanM ; unqual <- getPrintUnqualified @@ -791,7 +791,7 @@ msg sev doc err_sty = mkErrStyle dflags unqual user_sty = mkUserStyle dflags unqual AllTheWay dump_sty = mkDumpStyle dflags unqual - ; liftIO $ putLogMsg dflags NoReason sev loc sty doc } + ; liftIO $ putLogMsg dflags reason sev loc sty doc } -- | Output a String message to the screen putMsgS :: String -> CoreM () @@ -799,7 +799,7 @@ putMsgS = putMsg . text -- | Output a message to the screen putMsg :: SDoc -> CoreM () -putMsg = msg SevInfo +putMsg = msg SevInfo NoReason -- | Output an error to the screen. Does not cause the compiler to die. errorMsgS :: String -> CoreM () @@ -807,9 +807,9 @@ errorMsgS = errorMsg . text -- | Output an error to the screen. Does not cause the compiler to die. errorMsg :: SDoc -> CoreM () -errorMsg = msg SevError +errorMsg = msg SevError NoReason -warnMsg :: SDoc -> CoreM () +warnMsg :: WarnReason -> SDoc -> CoreM () warnMsg = msg SevWarning -- | Output a fatal error to the screen. Does not cause the compiler to die. @@ -818,7 +818,7 @@ fatalErrorMsgS = fatalErrorMsg . text -- | Output a fatal error to the screen. Does not cause the compiler to die. fatalErrorMsg :: SDoc -> CoreM () -fatalErrorMsg = msg SevFatal +fatalErrorMsg = msg SevFatal NoReason -- | Output a string debugging message at verbosity level of @-v@ or higher debugTraceMsgS :: String -> CoreM () @@ -826,7 +826,7 @@ debugTraceMsgS = debugTraceMsg . text -- | Outputs a debugging message at verbosity level of @-v@ or higher debugTraceMsg :: SDoc -> CoreM () -debugTraceMsg = msg SevDump +debugTraceMsg = msg SevDump NoReason -- | Show some labelled 'SDoc' if a particular flag is set or at a verbosity level of @-v -ddump-most@ or higher dumpIfSet_dyn :: DumpFlag -> String -> SDoc -> CoreM () ===================================== compiler/specialise/Specialise.hs ===================================== @@ -730,28 +730,35 @@ specImport dflags this_mod top_env done callers rb fn calls_for_fn ; return (rules2 ++ rules1, final_binds) } - | warnMissingSpecs dflags callers - = do { warnMsg (vcat [ hang (text "Could not specialise imported function" <+> quotes (ppr fn)) - 2 (vcat [ text "when specialising" <+> quotes (ppr caller) - | caller <- callers]) - , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn)) - , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ]) - ; return ([], []) } + | otherwise = do { tryWarnMissingSpecs dflags callers fn calls_for_fn + ; return ([], [])} - | otherwise - = return ([], []) where unfolding = realIdUnfolding fn -- We want to see the unfolding even for loop breakers -warnMissingSpecs :: DynFlags -> [Id] -> Bool +-- | Returns whether or not to show a missed-spec warning. +-- If -Wall-missed-specializations is on, show the warning. +-- Otherwise, if -Wmissed-specializations is on, only show a warning +-- if there is at least one imported function being specialized, +-- and if all imported functions are marked with an inline pragma +-- Use the most specific warning as the reason. +tryWarnMissingSpecs :: DynFlags -> [Id] -> Id -> [CallInfo] -> CoreM () -- See Note [Warning about missed specialisations] -warnMissingSpecs dflags callers - | wopt Opt_WarnAllMissedSpecs dflags = True - | not (wopt Opt_WarnMissedSpecs dflags) = False - | null callers = False - | otherwise = all has_inline_prag callers +tryWarnMissingSpecs dflags callers fn calls_for_fn + | wopt Opt_WarnMissedSpecs dflags + && not (null callers) + && allCallersInlined = doWarn $ Reason Opt_WarnMissedSpecs + | wopt Opt_WarnAllMissedSpecs dflags = doWarn $ Reason Opt_WarnAllMissedSpecs + | otherwise = return () where - has_inline_prag id = isAnyInlinePragma (idInlinePragma id) + allCallersInlined = all (isAnyInlinePragma . idInlinePragma) callers + doWarn reason = + warnMsg reason + (vcat [ hang (text ("Could not specialise imported function") <+> quotes (ppr fn)) + 2 (vcat [ text "when specialising" <+> quotes (ppr caller) + | caller <- callers]) + , whenPprDebug (text "calls:" <+> vcat (map (pprCallInfo fn) calls_for_fn)) + , text "Probable fix: add INLINABLE pragma on" <+> quotes (ppr fn) ]) wantSpecImport :: DynFlags -> Unfolding -> Bool -- See Note [Specialise imported INLINABLE things] ===================================== docs/users_guide/8.8.1-notes.rst ===================================== @@ -212,6 +212,7 @@ for further change information. libraries/libiserv/libiserv.cabal: Internal compiler library libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library + libraries/pretty/pretty.cabal: Dependency of ``ghc`` library libraries/process/process.cabal: Dependency of ``ghc`` library libraries/stm/stm.cabal: Dependency of ``haskeline`` library libraries/template-haskell/template-haskell.cabal: Core library ===================================== libraries/base/tests/all.T ===================================== @@ -202,7 +202,7 @@ test('T8089', compile_and_run, ['']) test('hWaitForInput-accurate-socket', reqlib('unix'), compile_and_run, ['']) test('T8684', expect_broken(8684), compile_and_run, ['']) -test('hWaitForInput-accurate-stdin', normal, compile_and_run, ['']) +test('hWaitForInput-accurate-stdin', [expect_broken_for(16535, ['threaded1', 'threaded2']), omit_ways(['ghci'])], compile_and_run, ['']) test('hWaitForInput-accurate-pipe', reqlib('unix'), compile_and_run, ['']) test('T9826',normal, compile_and_run,['']) test('T9848', @@ -233,6 +233,6 @@ test('T3474', test('T14425', normal, compile_and_run, ['']) test('T10412', normal, compile_and_run, ['']) test('T13896', normal, compile_and_run, ['']) -test('T13167', normal, compile_and_run, ['']) +test('T13167', fragile_for(16536, ['threaded2']), compile_and_run, ['']) test('T15349', [exit_code(1), expect_broken_for(15349, 'ghci')], compile_and_run, ['']) test('T16111', exit_code(1), compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/all.T ===================================== @@ -7,5 +7,5 @@ test('heap_all', ], compile_and_run, ['']) test('closure_size', - omit_ways(['ghci', 'hpc', 'prof']), + [omit_ways(['ghci', 'hpc', 'prof']), expect_broken_for(16544, ['dyn', 'optasm', 'threaded2'])], compile_and_run, ['']) ===================================== libraries/ghc-heap/tests/closure_size.hs ===================================== @@ -1,6 +1,7 @@ {-# LANGUAGE BangPatterns #-} {-# LANGUAGE TypeApplications #-} {-# LANGUAGE ScopedTypeVariables #-} +{-# LANGUAGE MagicHash #-} import Control.Monad import Type.Reflection @@ -17,12 +18,17 @@ assertSize !x expected = do putStrLn $ prettyCallStack callStack {-# NOINLINE assertSize #-} -pap :: Int -> Char -> Int +pap :: Int -> Maybe Char -> Int pap x _ = x {-# NOINLINE pap #-} main :: IO () main = do + -- Ensure that GHC can't turn PAP into a FUN (see #16531) + let x :: Int + x = 42 + {-# NOINLINE x #-} + assertSize 'a' 2 assertSize (Just ()) 2 assertSize (Nothing :: Maybe ()) 2 @@ -30,5 +36,5 @@ main = do assertSize ((1,2,3) :: (Int,Int,Int)) 4 assertSize (id :: Int -> Int) 1 assertSize (fst :: (Int,Int) -> Int) 1 - assertSize (pap 1) 2 + assertSize (pap x) 2 ===================================== libraries/process ===================================== @@ -1 +1 @@ -Subproject commit d860209e53c1b40b7c251fc8378886bbcb394402 +Subproject commit 5c81524694ceaf8523a1846718a7a7c3f402124f ===================================== libraries/unix ===================================== @@ -1 +1 @@ -Subproject commit a102df29c107e8f853129dd40fbbb487e1818149 +Subproject commit a67b5e9e70a2a0c438d4f186fda9f38409ff1435 ===================================== testsuite/driver/testlib.py ===================================== @@ -258,14 +258,14 @@ def fragile( bug ): return helper -def fragile_for( name, opts, bug, ways ): +def fragile_for( bug, ways ): """ Indicates that the test should be skipped due to fragility in the given test ways as documented in the given ticket. """ def helper( name, opts, bug=bug, ways=ways ): record_broken(name, opts, bug) - opts.omit_ways = ways + opts.omit_ways += ways return helper @@ -275,7 +275,8 @@ def omit_ways( ways ): return lambda name, opts, w=ways: _omit_ways( name, opts, w ) def _omit_ways( name, opts, ways ): - opts.omit_ways = ways + assert ways.__class__ is list + opts.omit_ways += ways # ----- @@ -1408,7 +1409,6 @@ def simple_run(name, way, prog, extra_run_opts): return failBecause('bad stderr') if not (opts.ignore_stdout or stdout_ok(name, way)): return failBecause('bad stdout') - check_hp = '-h' in my_rts_flags and opts.check_hp check_prof = '-p' in my_rts_flags ===================================== testsuite/tests/concurrent/should_run/all.T ===================================== @@ -239,7 +239,7 @@ test('conc067', ignore_stdout, compile_and_run, ['']) # omit threaded2, the behaviour of this test is non-deterministic with more # than one CPU. -test('conc068', [ omit_ways('threaded2'), exit_code(1) ], compile_and_run, ['']) +test('conc068', [ omit_ways(['threaded2']), exit_code(1) ], compile_and_run, ['']) test('setnumcapabilities001', [ only_ways(['threaded1','threaded2']), ===================================== testsuite/tests/dependent/should_compile/all.T ===================================== @@ -40,7 +40,7 @@ test('T12742', normal, compile, ['']) # (1) Use -fexternal-interpreter, or # (2) Build the program twice: once with -dynamic, and then # with -prof using -osuf to set a different object file suffix. -test('T13910', omit_ways(['profasm']), compile, ['']) +test('T13910', [expect_broken_for(16537, ['optasm']), omit_ways(['profasm'])], compile, ['']) test('T13938', [extra_files(['T13938a.hs'])], makefile_test, ['T13938']) test('T14556', normal, compile, ['']) test('T14720', normal, compile, ['']) ===================================== testsuite/tests/lib/integer/all.T ===================================== @@ -1,8 +1,8 @@ test('integerBits', normal, compile_and_run, ['']) test('integerConversions', normal, compile_and_run, ['']) # skip ghci as it doesn't support unboxed tuples -test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways('ghci')], compile_and_run, ['']) -test('plusMinusInteger', [omit_ways('ghci')], compile_and_run, ['']) +test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways(['ghci'])], compile_and_run, ['']) +test('plusMinusInteger', [omit_ways(['ghci'])], compile_and_run, ['']) test('integerConstantFolding', normal, makefile_test, ['integerConstantFolding']) test('fromToInteger', [], makefile_test, ['fromToInteger']) test('IntegerConversionRules', [], makefile_test, ['IntegerConversionRules']) ===================================== testsuite/tests/primops/should_run/all.T ===================================== @@ -20,7 +20,9 @@ test('CmpInt8', normal, compile_and_run, ['']) test('CmpWord8', normal, compile_and_run, ['']) test('ShowPrim', normal, compile_and_run, ['']) -test('ArithInt16', normal, compile_and_run, ['']) -test('ArithWord16', normal, compile_and_run, ['']) +# These two tests use unboxed tuples, which GHCi doesn't support +test('ArithInt16', omit_ways(['ghci']), compile_and_run, ['']) +test('ArithWord16', omit_ways(['ghci']), compile_and_run, ['']) + test('CmpInt16', normal, compile_and_run, ['']) -test('CmpWord16', normal, compile_and_run, ['']) \ No newline at end of file +test('CmpWord16', normal, compile_and_run, ['']) ===================================== testsuite/tests/programs/barton-mangler-bug/test.T ===================================== @@ -8,7 +8,7 @@ test('barton-mangler-bug', 'Plot.hi', 'Plot.o', 'PlotExample.hi', 'PlotExample.o', 'TypesettingTricks.hi', 'TypesettingTricks.o']), - omit_ways('debug') # Fails for debug way due to annotation linting timeout + omit_ways(['debug']) # Fails for debug way due to annotation linting timeout ], multimod_compile_and_run, ['Main', '']) ===================================== testsuite/tests/rts/all.T ===================================== @@ -73,7 +73,7 @@ test('T2047', [ignore_stdout, extra_run_opts('+RTS -c -RTS')], # Blackhole-detection test. # Skip GHCi due to #2786 test('T2783', [ omit_ways(['ghci']), exit_code(1) - , expect_broken_for(2783, ['threaded1']) + , fragile_for(2783, ['threaded1']) ], compile_and_run, ['']) # Test the work-stealing deque implementation. We run this test in @@ -93,7 +93,7 @@ test('stack002', [extra_files(['stack001.hs']), # run this test with very small stack chunks to exercise the stack # overflow/underflow machinery. -test('stack003', [ omit_ways('ghci'), # uses unboxed tuples +test('stack003', [ omit_ways(['ghci']), # uses unboxed tuples extra_run_opts('500000 +RTS -kc1k -kb100 -K96m -RTS') ], compile_and_run, ['']) @@ -272,7 +272,8 @@ test('T7815', [ multi_cpu_race, test('ffishutdown', [ignore_stderr, only_ways(['threaded1','threaded2'])], compile_and_run, ['']) -test('T7919', [when(fast(), skip), omit_ways(prof_ways)], compile_and_run, +# Times out in ghci way +test('T7919', [when(fast(), skip), omit_ways(['ghci'] + prof_ways)], compile_and_run, [config.ghc_th_way_flags]) test('T8035', normal, compile_and_run, ['']) ===================================== testsuite/tests/simplCore/should_compile/all.T ===================================== @@ -275,7 +275,7 @@ test('T14140', normal, makefile_test, ['T14140']) -test('T14272', normal, compile, ['']) +test('T14272', expect_broken_for(16539, ['optasm']), compile, ['']) test('T14270a', normal, compile, ['']) test('T14152', [ only_ways(['optasm']), check_errmsg(r'dead code') ], compile, ['-ddump-simpl']) test('T14152a', [extra_files(['T14152.hs']), pre_cmd('cp T14152.hs T14152a.hs'), ===================================== testsuite/tests/th/all.T ===================================== @@ -467,7 +467,10 @@ test('T15437', expect_broken(15437), multimod_compile, test('T15985', normal, compile, ['']) test('T16133', normal, compile_fail, ['']) test('T15471', normal, multimod_compile, ['T15471.hs', '-v0']) -test('T16180', normal, compile_and_run, ['-package ghc']) +test('T16180', + [when(llvm_build(), expect_broken_for(16541, ['ext-interp']), + expect_broken_for(16541, ['ghci'])], + compile_and_run, ['-package ghc']) test('T16183', normal, compile, ['-v0 -ddump-splices -dsuppress-uniques']) test('T16195', normal, multimod_compile, ['T16195.hs', '-v0']) test('T16293b', normal, compile, ['']) ===================================== testsuite/tests/typecheck/should_compile/all.T ===================================== @@ -659,7 +659,7 @@ test('T15586', normal, compile, ['']) test('T15368', normal, compile, ['-fdefer-type-errors']) test('T15645', normal, compile, ['-Wwarn=missing-monadfail-instances']) test('T15778', normal, compile, ['']) -test('T14761c', normal, compile, ['']) +test('T14761c', expect_broken_for(16540, ['hpc', 'optasm']), compile, ['']) test('T16008', normal, compile, ['']) test('T16033', normal, compile, ['']) test('T16141', normal, compile, ['-O']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -35,7 +35,7 @@ test('tcrun018', normal, compile_and_run, ['']) test('tcrun019', normal, compile_and_run, ['']) test('tcrun020', normal, compile_and_run, ['']) test('tcrun021', normal, compile_and_run, ['-package containers']) -test('tcrun022', omit_ways(['ghci']), compile_and_run, ['-O']) +test('tcrun022', omit_ways(['hpc', 'ghci']), compile_and_run, ['-O']) test('tcrun023', [when(doing_ghci(), extra_hc_opts('-fobject-code'))], compile_and_run, ['-O']) test('tcrun024', normal, compile_and_run, ['-O']) @@ -96,7 +96,7 @@ test('T6117', normal, compile_and_run, ['']) test('T5751', normal, compile_and_run, ['']) test('T5913', normal, compile_and_run, ['']) test('T7748', normal, compile_and_run, ['']) -test('T7861', [omit_ways('debug'), exit_code(1)], compile_and_run, ['']) +test('T7861', [omit_ways(['debug']), exit_code(1)], compile_and_run, ['']) test('TcTypeNatSimpleRun', normal, compile_and_run, ['']) test('TcTypeSymbolSimpleRun', normal, compile_and_run, ['']) test('T8119', normal, ghci_script, ['T8119.script']) ===================================== testsuite/tests/utils/should_run/all.T ===================================== @@ -1,6 +1,6 @@ test('T14854', [only_ways(threaded_ways), - omit_ways('ghci'), + omit_ways(['ghci']), reqlib('random'), ignore_stderr], compile_and_run, ===================================== testsuite/tests/warnings/should_compile/T16282/T16282.hs ===================================== @@ -0,0 +1,14 @@ +import Data.Map + +-- If someone improves the specializer so that +-- GHC no longer misses the specialization below, +-- then this test will fail, as it expects a warning +-- to be issued. +-- Another reason this could fail is due to spelling: +-- the test checks for the "specialisation" spelling, +-- but due to changes in how the warnings are listed in DynFalgs.hs +-- the compiler may spit out the "specialization" spelling. +main :: IO () +main = do + let m = [] :: [Map String Bool] + mapM_ print m ===================================== testsuite/tests/warnings/should_compile/T16282/T16282.stderr ===================================== @@ -0,0 +1,5 @@ + +T16282.hs: warning: [-Wall-missed-specialisations] + Could not specialise imported function ‘Data.Map.Internal.$w$cshowsPrec’ + when specialising ‘Data.Map.Internal.$fShowMap_$cshowsPrec’ + Probable fix: add INLINABLE pragma on ‘Data.Map.Internal.$w$cshowsPrec’ ===================================== testsuite/tests/warnings/should_compile/T16282/all.T ===================================== @@ -0,0 +1 @@ +test('T16282', normal, compile, ['-O2 -Wall-missed-specialisations']) \ No newline at end of file ===================================== testsuite/tests/warnings/should_compile/all.T ===================================== @@ -25,4 +25,4 @@ test('MissingMod', normal, multimod_compile, ['MissingMod', '-Wmissing-home-modu test('StarBinder', normal, compile, ['']) -test('Overflow', normal, compile, ['']) +test('Overflow', expect_broken_for(16543, ['hpc']), compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1ac79339a782c850a5affdd75d56b192d1565414...d7106e24e223cb38baf6dde90545a4208250591d -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1ac79339a782c850a5affdd75d56b192d1565414...d7106e24e223cb38baf6dde90545a4208250591d You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 00:53:26 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 07 Apr 2019 20:53:26 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 10 commits: testsuite: Mark T16180 as broken in ghci and ext-interp ways Message-ID: <5caa9b86f066e_62b33fa2cd1d76082203990@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: e211572f by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Mark T16180 as broken in ghci and ext-interp ways As noted in #16541. - - - - - 913d2f00 by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - f9bc88fe by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - 2ed06d21 by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - 20919605 by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - e7a18a49 by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Skip T7919 in ghci way It times out pretty reliably. It's not clear that much is gained by running this test in the ghci way anyways. - - - - - 3a67d460 by Ben Gamari at 2019-04-08T00:53:13Z testsuite: Fix fragile_for test modifier - - - - - c1391f20 by Ben Gamari at 2019-04-08T00:53:13Z users-guide: Add pretty to package list - - - - - 2e8a4874 by Ben Gamari at 2019-04-08T00:53:14Z Bump unix submodule Marks posix002 as fragile in threaded2 way due to #16550. - - - - - 1202085b by Ben Gamari at 2019-04-08T00:53:14Z testsuite: Fix omit_ways usage omit_ways expects a list but this was broken in several cases. - - - - - 12 changed files: - docs/users_guide/8.8.1-notes.rst - libraries/ghc-heap/tests/all.T - libraries/unix - testsuite/driver/testlib.py - testsuite/tests/concurrent/should_run/all.T - testsuite/tests/lib/integer/all.T - testsuite/tests/programs/barton-mangler-bug/test.T - testsuite/tests/rts/all.T - testsuite/tests/th/all.T - testsuite/tests/typecheck/should_run/all.T - testsuite/tests/utils/should_run/all.T - testsuite/tests/warnings/should_compile/all.T Changes: ===================================== docs/users_guide/8.8.1-notes.rst ===================================== @@ -212,6 +212,7 @@ for further change information. libraries/libiserv/libiserv.cabal: Internal compiler library libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library + libraries/pretty/pretty.cabal: Dependency of ``ghc`` library libraries/process/process.cabal: Dependency of ``ghc`` library libraries/stm/stm.cabal: Dependency of ``haskeline`` library libraries/template-haskell/template-haskell.cabal: Core library ===================================== libraries/ghc-heap/tests/all.T ===================================== @@ -7,5 +7,5 @@ test('heap_all', ], compile_and_run, ['']) test('closure_size', - omit_ways(['ghci', 'hpc', 'prof']), + [omit_ways(['ghci', 'hpc', 'prof']), expect_broken_for(16544, ['dyn', 'optasm', 'threaded2'])], compile_and_run, ['']) ===================================== libraries/unix ===================================== @@ -1 +1 @@ -Subproject commit a102df29c107e8f853129dd40fbbb487e1818149 +Subproject commit a67b5e9e70a2a0c438d4f186fda9f38409ff1435 ===================================== testsuite/driver/testlib.py ===================================== @@ -258,14 +258,14 @@ def fragile( bug ): return helper -def fragile_for( name, opts, bug, ways ): +def fragile_for( bug, ways ): """ Indicates that the test should be skipped due to fragility in the given test ways as documented in the given ticket. """ def helper( name, opts, bug=bug, ways=ways ): record_broken(name, opts, bug) - opts.omit_ways = ways + opts.omit_ways += ways return helper @@ -275,7 +275,8 @@ def omit_ways( ways ): return lambda name, opts, w=ways: _omit_ways( name, opts, w ) def _omit_ways( name, opts, ways ): - opts.omit_ways = ways + assert ways.__class__ is list + opts.omit_ways += ways # ----- ===================================== testsuite/tests/concurrent/should_run/all.T ===================================== @@ -239,7 +239,7 @@ test('conc067', ignore_stdout, compile_and_run, ['']) # omit threaded2, the behaviour of this test is non-deterministic with more # than one CPU. -test('conc068', [ omit_ways('threaded2'), exit_code(1) ], compile_and_run, ['']) +test('conc068', [ omit_ways(['threaded2']), exit_code(1) ], compile_and_run, ['']) test('setnumcapabilities001', [ only_ways(['threaded1','threaded2']), ===================================== testsuite/tests/lib/integer/all.T ===================================== @@ -1,8 +1,8 @@ test('integerBits', normal, compile_and_run, ['']) test('integerConversions', normal, compile_and_run, ['']) # skip ghci as it doesn't support unboxed tuples -test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways('ghci')], compile_and_run, ['']) -test('plusMinusInteger', [omit_ways('ghci')], compile_and_run, ['']) +test('integerGmpInternals', [reqlib('integer-gmp'), omit_ways(['ghci'])], compile_and_run, ['']) +test('plusMinusInteger', [omit_ways(['ghci'])], compile_and_run, ['']) test('integerConstantFolding', normal, makefile_test, ['integerConstantFolding']) test('fromToInteger', [], makefile_test, ['fromToInteger']) test('IntegerConversionRules', [], makefile_test, ['IntegerConversionRules']) ===================================== testsuite/tests/programs/barton-mangler-bug/test.T ===================================== @@ -8,7 +8,7 @@ test('barton-mangler-bug', 'Plot.hi', 'Plot.o', 'PlotExample.hi', 'PlotExample.o', 'TypesettingTricks.hi', 'TypesettingTricks.o']), - omit_ways('debug') # Fails for debug way due to annotation linting timeout + omit_ways(['debug']) # Fails for debug way due to annotation linting timeout ], multimod_compile_and_run, ['Main', '']) ===================================== testsuite/tests/rts/all.T ===================================== @@ -73,7 +73,7 @@ test('T2047', [ignore_stdout, extra_run_opts('+RTS -c -RTS')], # Blackhole-detection test. # Skip GHCi due to #2786 test('T2783', [ omit_ways(['ghci']), exit_code(1) - , expect_broken_for(2783, ['threaded1']) + , fragile_for(2783, ['threaded1']) ], compile_and_run, ['']) # Test the work-stealing deque implementation. We run this test in @@ -93,7 +93,7 @@ test('stack002', [extra_files(['stack001.hs']), # run this test with very small stack chunks to exercise the stack # overflow/underflow machinery. -test('stack003', [ omit_ways('ghci'), # uses unboxed tuples +test('stack003', [ omit_ways(['ghci']), # uses unboxed tuples extra_run_opts('500000 +RTS -kc1k -kb100 -K96m -RTS') ], compile_and_run, ['']) @@ -272,7 +272,8 @@ test('T7815', [ multi_cpu_race, test('ffishutdown', [ignore_stderr, only_ways(['threaded1','threaded2'])], compile_and_run, ['']) -test('T7919', [when(fast(), skip), omit_ways(prof_ways)], compile_and_run, +# Times out in ghci way +test('T7919', [when(fast(), skip), omit_ways(['ghci'] + prof_ways)], compile_and_run, [config.ghc_th_way_flags]) test('T8035', normal, compile_and_run, ['']) ===================================== testsuite/tests/th/all.T ===================================== @@ -467,7 +467,10 @@ test('T15437', expect_broken(15437), multimod_compile, test('T15985', normal, compile, ['']) test('T16133', normal, compile_fail, ['']) test('T15471', normal, multimod_compile, ['T15471.hs', '-v0']) -test('T16180', normal, compile_and_run, ['-package ghc']) +test('T16180', + [when(llvm_build(), expect_broken_for(16541, ['ext-interp'])), + expect_broken_for(16541, ['ghci'])], + compile_and_run, ['-package ghc']) test('T16183', normal, compile, ['-v0 -ddump-splices -dsuppress-uniques']) test('T16195', normal, multimod_compile, ['T16195.hs', '-v0']) test('T16293b', normal, compile, ['']) ===================================== testsuite/tests/typecheck/should_run/all.T ===================================== @@ -35,7 +35,7 @@ test('tcrun018', normal, compile_and_run, ['']) test('tcrun019', normal, compile_and_run, ['']) test('tcrun020', normal, compile_and_run, ['']) test('tcrun021', normal, compile_and_run, ['-package containers']) -test('tcrun022', omit_ways(['ghci']), compile_and_run, ['-O']) +test('tcrun022', omit_ways(['hpc', 'ghci']), compile_and_run, ['-O']) test('tcrun023', [when(doing_ghci(), extra_hc_opts('-fobject-code'))], compile_and_run, ['-O']) test('tcrun024', normal, compile_and_run, ['-O']) @@ -96,7 +96,7 @@ test('T6117', normal, compile_and_run, ['']) test('T5751', normal, compile_and_run, ['']) test('T5913', normal, compile_and_run, ['']) test('T7748', normal, compile_and_run, ['']) -test('T7861', [omit_ways('debug'), exit_code(1)], compile_and_run, ['']) +test('T7861', [omit_ways(['debug']), exit_code(1)], compile_and_run, ['']) test('TcTypeNatSimpleRun', normal, compile_and_run, ['']) test('TcTypeSymbolSimpleRun', normal, compile_and_run, ['']) test('T8119', normal, ghci_script, ['T8119.script']) ===================================== testsuite/tests/utils/should_run/all.T ===================================== @@ -1,6 +1,6 @@ test('T14854', [only_ways(threaded_ways), - omit_ways('ghci'), + omit_ways(['ghci']), reqlib('random'), ignore_stderr], compile_and_run, ===================================== testsuite/tests/warnings/should_compile/all.T ===================================== @@ -25,4 +25,4 @@ test('MissingMod', normal, multimod_compile, ['MissingMod', '-Wmissing-home-modu test('StarBinder', normal, compile, ['']) -test('Overflow', normal, compile, ['']) +test('Overflow', expect_broken_for(16543, ['hpc']), compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d7106e24e223cb38baf6dde90545a4208250591d...1202085b58db1608b5761b68fafe54d0fafe09be -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d7106e24e223cb38baf6dde90545a4208250591d...1202085b58db1608b5761b68fafe54d0fafe09be You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 08:29:35 2019 From: gitlab at gitlab.haskell.org (Sebastian Graf) Date: Mon, 08 Apr 2019 04:29:35 -0400 Subject: [Git][ghc/ghc][wip/dmd-arity] Compute demand signatures assuming idArity Message-ID: <5cab066ff2142_62b33fa2cd1d760822307f0@gitlab.haskell.org.mail> Sebastian Graf pushed to branch wip/dmd-arity at Glasgow Haskell Compiler / GHC Commits: e00de399 by Sebastian Graf at 2019-04-08T08:28:30Z Compute demand signatures assuming idArity This does four things: 1. Look at `idArity` instead of manifest lambdas to decide whether to use LetUp 2. Compute the strictness signature in LetDown assuming at least `idArity` incoming arguments 3. Remove the special case for trivial RHSs, which is subsumed by 2 4. Don't perform the W/W split when doing so would eta expand a binding. Otherwise we would eta expand PAPs, causing unnecessary churn in the Simplifier. NoFib Results -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- fannkuch-redux +0.3% 0.0% gg -0.0% -0.1% maillist +0.2% +0.2% minimax 0.0% +0.8% pretty 0.0% -0.1% reptile -0.0% -1.2% -------------------------------------------------------------------------------- Min -0.0% -1.2% Max +0.3% +0.8% Geometric Mean +0.0% -0.0% - - - - - 17 changed files: - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/IdInfo.hs - compiler/basicTypes/Var.hs - compiler/coreSyn/CoreArity.hs - compiler/coreSyn/CoreLint.hs - compiler/coreSyn/CoreUnfold.hs - compiler/simplCore/SimplMonad.hs - compiler/simplCore/SimplUtils.hs - compiler/stranal/DmdAnal.hs - compiler/stranal/WorkWrap.hs - compiler/stranal/WwLib.hs - + testsuite/tests/perf/compiler/WWRec.hs - testsuite/tests/perf/compiler/all.T - + testsuite/tests/stranal/sigs/NewtypeArity.hs - + testsuite/tests/stranal/sigs/NewtypeArity.stderr - testsuite/tests/stranal/sigs/all.T Changes: ===================================== compiler/basicTypes/Demand.hs ===================================== @@ -22,7 +22,7 @@ module Demand ( DmdType(..), dmdTypeDepth, lubDmdType, bothDmdType, nopDmdType, botDmdType, mkDmdType, - addDemand, removeDmdTyArgs, + addDemand, ensureArgs, BothDmdArg, mkBothDmdArg, toBothDmdArg, DmdEnv, emptyDmdEnv, @@ -34,7 +34,7 @@ module Demand ( vanillaCprProdRes, cprSumRes, appIsBottom, isBottomingSig, pprIfaceStrictSig, trimCPRInfo, returnsCPR_maybe, - StrictSig(..), mkStrictSig, mkClosedStrictSig, + StrictSig(..), mkStrictSigForArity, mkClosedStrictSig, nopSig, botSig, cprProdSig, isTopSig, hasDemandEnvSig, splitStrictSig, strictSigDmdEnv, @@ -47,10 +47,10 @@ module Demand ( deferAfterIO, postProcessUnsat, postProcessDmdType, - splitProdDmd_maybe, peelCallDmd, peelManyCalls, mkCallDmd, + splitProdDmd_maybe, peelCallDmd, peelManyCalls, mkCallDmd, mkCallDmds, mkWorkerDemand, dmdTransformSig, dmdTransformDataConSig, dmdTransformDictSelSig, argOneShots, argsOneShots, saturatedByOneShots, - trimToType, TypeShape(..), + TypeShape(..), peelTsFuns, trimToType, useCount, isUsedOnce, reuseEnv, killUsageDemand, killUsageSig, zapUsageDemand, zapUsageEnvSig, @@ -675,10 +675,15 @@ mkProdDmd dx = JD { sd = mkSProd $ map getStrDmd dx , ud = mkUProd $ map getUseDmd dx } +-- | Wraps the 'CleanDemand' with a one-shot call demand: @d@ -> @C1(d)@. mkCallDmd :: CleanDemand -> CleanDemand mkCallDmd (JD {sd = d, ud = u}) = JD { sd = mkSCall d, ud = mkUCall One u } +-- | @mkCallDmds n d@ returns @C1(C1...(C1 d))@ where there are @n@ @C1@'s. +mkCallDmds :: Arity -> CleanDemand -> CleanDemand +mkCallDmds arity cd = iterate mkCallDmd cd !! arity + -- See Note [Demand on the worker] in WorkWrap mkWorkerDemand :: Int -> Demand mkWorkerDemand n = JD { sd = Lazy, ud = Use One (go n) } @@ -804,6 +809,13 @@ instance Outputable TypeShape where ppr (TsFun ts) = text "TsFun" <> parens (ppr ts) ppr (TsProd tss) = parens (hsep $ punctuate comma $ map ppr tss) +-- | @peelTsFuns n ts@ tries to peel off @n@ 'TsFun' constructors from @ts@ and +-- returns 'Just' the wrapped 'TypeShape' on success, and 'Nothing' otherwise. +peelTsFuns :: Arity -> TypeShape -> Maybe TypeShape +peelTsFuns 0 ts = Just ts +peelTsFuns n (TsFun ts) = peelTsFuns (n-1) ts +peelTsFuns _ _ = Nothing + trimToType :: Demand -> TypeShape -> Demand -- See Note [Trimming a demand to a type] trimToType (JD { sd = ms, ud = mu }) ts @@ -1207,12 +1219,8 @@ mkDmdType fv ds res = DmdType fv ds res dmdTypeDepth :: DmdType -> Arity dmdTypeDepth (DmdType _ ds _) = length ds --- Remove any demand on arguments. This is used in dmdAnalRhs on the body -removeDmdTyArgs :: DmdType -> DmdType -removeDmdTyArgs = ensureArgs 0 - --- This makes sure we can use the demand type with n arguments, --- It extends the argument list with the correct resTypeArgDmd +-- | This makes sure we can use the demand type with n arguments. +-- It extends the argument list with the correct resTypeArgDmd. -- It also adjusts the DmdResult: Divergence survives additional arguments, -- CPR information does not (and definite converge also would not). ensureArgs :: Arity -> DmdType -> DmdType @@ -1567,8 +1575,56 @@ and on the second, then returning a constructor. If this same function is applied to one arg, all we can say is that it uses x with , and its arg with demand . + +Note [Understanding DmdType and StrictSig] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand types are sound approximations of an expression's semantics relative to +the incoming demand we put the expression under. Consider the following +expression: + + \x y -> x `seq` (y, 2*x) + +Here is a table with demand types resulting from different incoming demands we +put that expression under. Note the monotonicity; a stronger incoming demand +yields a more precise demand type: + + incoming demand | demand type + ---------------------------------------------------- + | {} + | {} + | {} + +Note that in the first example, the depth of the demand type was *higher* than +the arity of the incoming call demand due to the anonymous lambda. +The converse is also possible and happens when we unleash demand signatures. +In @f x y@, the incoming call demand on f has arity 2. But if all we have is a +demand signature with depth 1 for @f@ (which we can safely unleash, see below), +the demand type of @f@ under a call demand of arity 2 has a *lower* depth of 1. + +So: Demand types are elicited by putting an expression under an incoming (call) +demand, the arity of which can be lower or higher than the depth of the +resulting demand type. +In contrast, a demand signature summarises a function's semantics *without* +immediately specifying the incoming demand it was produced under. Despite StrSig +being a newtype wrapper around DmdType, it actually encodes two things: + + * The threshold (i.e., minimum arity) to unleash the signature + * A demand type that is sound to unleash when the minimum arity requirement is + met. + +Here comes the subtle part: The threshold is encoded in the wrapped demand +type's depth! So in mkStrictSigForArity we make sure to trim the list of +argument demands to the given threshold arity. Call sites will make sure that +this corresponds to the arity of the call demand that elicited the wrapped +demand type. See also Note [What are demand signatures?] in DmdAnal. + +Besides trimming argument demands, mkStrictSigForArity will also trim CPR +information if necessary. -} +-- | The depth of the wrapped 'DmdType' encodes the arity at which it is safe +-- to unleash. Better construct this through 'mkStrictSigForArity'. +-- See Note [Understanding DmdType and StrictSig] newtype StrictSig = StrictSig DmdType deriving( Eq ) @@ -1580,34 +1636,43 @@ pprIfaceStrictSig :: StrictSig -> SDoc pprIfaceStrictSig (StrictSig (DmdType _ dmds res)) = hcat (map ppr dmds) <> ppr res -mkStrictSig :: DmdType -> StrictSig -mkStrictSig dmd_ty = StrictSig dmd_ty +-- | Turns a 'DmdType' computed for the particular 'Arity' into a 'StrictSig' +-- unleashable at that arity. See Note [Understanding DmdType and StrictSig] +mkStrictSigForArity :: Arity -> DmdType -> StrictSig +mkStrictSigForArity arity dmd_ty = StrictSig (ensureArgs arity dmd_ty) mkClosedStrictSig :: [Demand] -> DmdResult -> StrictSig -mkClosedStrictSig ds res = mkStrictSig (DmdType emptyDmdEnv ds res) +mkClosedStrictSig ds res = mkStrictSigForArity (length ds) (DmdType emptyDmdEnv ds res) splitStrictSig :: StrictSig -> ([Demand], DmdResult) splitStrictSig (StrictSig (DmdType _ dmds res)) = (dmds, res) increaseStrictSigArity :: Int -> StrictSig -> StrictSig --- Add extra arguments to a strictness signature +-- ^ Add extra arguments to a strictness signature. +-- In contrast to 'etaExpandStrictSig', this /prepends/ additional argument +-- demands and leaves CPR info intact. increaseStrictSigArity arity_increase sig@(StrictSig dmd_ty@(DmdType env dmds res)) | isTopDmdType dmd_ty = sig - | arity_increase <= 0 = sig + | arity_increase == 0 = sig + | arity_increase < 0 = WARN( True, text "increaseStrictSigArity:" + <+> text "negative arity increase" + <+> ppr arity_increase ) + nopSig | otherwise = StrictSig (DmdType env dmds' res) where dmds' = replicate arity_increase topDmd ++ dmds etaExpandStrictSig :: Arity -> StrictSig -> StrictSig --- We are expanding (\x y. e) to (\x y z. e z) --- Add exta demands to the /end/ of the arg demands if necessary -etaExpandStrictSig arity sig@(StrictSig dmd_ty@(DmdType env dmds res)) - | isTopDmdType dmd_ty = sig - | arity_increase <= 0 = sig - | otherwise = StrictSig (DmdType env dmds' res) - where - arity_increase = arity - length dmds - dmds' = dmds ++ replicate arity_increase topDmd +-- ^ We are expanding (\x y. e) to (\x y z. e z). +-- In contrast to 'increaseStrictSigArity', this /appends/ extra arg demands if +-- necessary, potentially destroying the signature's CPR property. +etaExpandStrictSig arity (StrictSig dmd_ty) + | arity < dmdTypeDepth dmd_ty + -- an arity decrease must zap the whole signature, because it was possibly + -- computed for a higher incoming call demand. + = nopSig + | otherwise + = StrictSig $ ensureArgs arity dmd_ty isTopSig :: StrictSig -> Bool isTopSig (StrictSig ty) = isTopDmdType ty ===================================== compiler/basicTypes/Id.hs ===================================== @@ -668,6 +668,7 @@ isBottomingId v | isId v = isBottomingSig (idStrictness v) | otherwise = False +-- | Accesses the 'Id''s 'strictnessInfo'. idStrictness :: Id -> StrictSig idStrictness id = strictnessInfo (idInfo id) ===================================== compiler/basicTypes/IdInfo.hs ===================================== @@ -237,22 +237,34 @@ pprIdDetails other = brackets (pp other) -- too big. data IdInfo = IdInfo { - arityInfo :: !ArityInfo, -- ^ 'Id' arity - ruleInfo :: RuleInfo, -- ^ Specialisations of the 'Id's function which exist - -- See Note [Specialisations and RULES in IdInfo] - unfoldingInfo :: Unfolding, -- ^ The 'Id's unfolding - cafInfo :: CafInfo, -- ^ 'Id' CAF info - oneShotInfo :: OneShotInfo, -- ^ Info about a lambda-bound variable, if the 'Id' is one - inlinePragInfo :: InlinePragma, -- ^ Any inline pragma atached to the 'Id' - occInfo :: OccInfo, -- ^ How the 'Id' occurs in the program - - strictnessInfo :: StrictSig, -- ^ A strictness signature - - demandInfo :: Demand, -- ^ ID demand information - callArityInfo :: !ArityInfo, -- ^ How this is called. - -- n <=> all calls have at least n arguments - - levityInfo :: LevityInfo -- ^ when applied, will this Id ever have a levity-polymorphic type? + arityInfo :: !ArityInfo, + -- ^ 'Id' arity, as computed by 'CoreArity'. Specifies how many + -- arguments this 'Id' has to be applied to before it doesn any + -- meaningful work. + ruleInfo :: RuleInfo, + -- ^ Specialisations of the 'Id's function which exist. + -- See Note [Specialisations and RULES in IdInfo] + unfoldingInfo :: Unfolding, + -- ^ The 'Id's unfolding + cafInfo :: CafInfo, + -- ^ 'Id' CAF info + oneShotInfo :: OneShotInfo, + -- ^ Info about a lambda-bound variable, if the 'Id' is one + inlinePragInfo :: InlinePragma, + -- ^ Any inline pragma atached to the 'Id' + occInfo :: OccInfo, + -- ^ How the 'Id' occurs in the program + strictnessInfo :: StrictSig, + -- ^ A strictness signature. Digests how a function uses its arguments + -- if applied to at least 'arityInfo' arguments. + demandInfo :: Demand, + -- ^ ID demand information + callArityInfo :: !ArityInfo, + -- ^ How this is called. This is the number of arguments to which a + -- binding can be eta-expanded without losing any sharing. + -- n <=> all calls have at least n arguments + levityInfo :: LevityInfo + -- ^ when applied, will this Id ever have a levity-polymorphic type? } -- Setters ===================================== compiler/basicTypes/Var.hs ===================================== @@ -700,6 +700,8 @@ setIdNotExported id = ASSERT( isLocalId id ) ************************************************************************ -} +-- | Is this a type-level (i.e., computationally irrelevant, thus erasable) +-- variable? Satisfies @isTyVar = not . isId at . isTyVar :: Var -> Bool -- True of both TyVar and TcTyVar isTyVar (TyVar {}) = True isTyVar (TcTyVar {}) = True @@ -712,17 +714,21 @@ isTcTyVar _ = False isTyCoVar :: Var -> Bool isTyCoVar v = isTyVar v || isCoVar v +-- | Is this a value-level (i.e., computationally relevant) 'Id'entifier? +-- Satisfies @isId = not . isTyVar at . isId :: Var -> Bool isId (Id {}) = True isId _ = False +-- | Is this a coercion variable? +-- Satisfies @'isId' v ==> 'isCoVar' v == not ('isNonCoVarId' v)@. isCoVar :: Var -> Bool --- A coercion variable isCoVar (Id { id_details = details }) = isCoVarDetails details isCoVar _ = False +-- | Is this a term variable ('Id') that is /not/ a coercion variable? +-- Satisfies @'isId' v ==> 'isCoVar' v == not ('isNonCoVarId' v)@. isNonCoVarId :: Var -> Bool --- A term variable (Id) that is /not/ a coercion variable isNonCoVarId (Id { id_details = details }) = not (isCoVarDetails details) isNonCoVarId _ = False ===================================== compiler/coreSyn/CoreArity.hs ===================================== @@ -158,7 +158,7 @@ exprBotStrictness_maybe e {- Note [exprArity invariant] ~~~~~~~~~~~~~~~~~~~~~~~~~~ -exprArity has the following invariant: +exprArity has the following invariants: (1) If typeArity (exprType e) = n, then manifestArity (etaExpand e n) = n ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -570,15 +570,9 @@ lintSingleBinding top_lvl_flag rec_flag (binder,rhs) (addWarnL (text "INLINE binder is (non-rule) loop breaker:" <+> ppr binder)) -- Only non-rule loop breakers inhibit inlining - -- Check whether arity and demand type are consistent (only if demand analysis - -- already happened) - -- - -- Note (Apr 2014): this is actually ok. See Note [Demand analysis for trivial right-hand sides] - -- in DmdAnal. After eta-expansion in CorePrep the rhs is no longer trivial. - -- ; let dmdTy = idStrictness binder - -- ; checkL (case dmdTy of - -- StrictSig dmd_ty -> idArity binder >= dmdTypeDepth dmd_ty || exprIsTrivial rhs) - -- (mkArityMsg binder) + -- We used to check that the dmdTypeDepth of a demand signature never + -- exceeds idArity, but that is an unnecessary complication, see + -- Note [idArity varies independently of dmdTypeDepth] in DmdAnal -- Check that the binder's arity is within the bounds imposed by -- the type and the strictness signature. See Note [exprArity invariant] @@ -2562,20 +2556,6 @@ mkKindErrMsg tyvar arg_ty hang (text "Arg type:") 4 (ppr arg_ty <+> dcolon <+> ppr (typeKind arg_ty))] -{- Not needed now -mkArityMsg :: Id -> MsgDoc -mkArityMsg binder - = vcat [hsep [text "Demand type has", - ppr (dmdTypeDepth dmd_ty), - text "arguments, rhs has", - ppr (idArity binder), - text "arguments,", - ppr binder], - hsep [text "Binder's strictness signature:", ppr dmd_ty] - - ] - where (StrictSig dmd_ty) = idStrictness binder --} mkCastErr :: CoreExpr -> Coercion -> Type -> Type -> MsgDoc mkCastErr expr = mk_cast_err "expression" "type" (ppr expr) ===================================== compiler/coreSyn/CoreUnfold.hs ===================================== @@ -1149,15 +1149,15 @@ certainlyWillInline dflags fn_info -- INLINABLE functions come via this path -- See Note [certainlyWillInline: INLINABLE] do_cunf expr (UnfIfGoodArgs { ug_size = size, ug_args = args }) - | not (null args) -- See Note [certainlyWillInline: be careful of thunks] + | arityInfo fn_info > 0 -- See Note [certainlyWillInline: be careful of thunks] , not (isBottomingSig (strictnessInfo fn_info)) -- Do not unconditionally inline a bottoming functions even if -- it seems smallish. We've carefully lifted it out to top level, -- so we don't want to re-inline it. - , let arity = length args - , size - (10 * (arity + 1)) <= ufUseThreshold dflags + , let unf_arity = length args + , size - (10 * (unf_arity + 1)) <= ufUseThreshold dflags = Just (fn_unf { uf_src = InlineStable - , uf_guidance = UnfWhen { ug_arity = arity + , uf_guidance = UnfWhen { ug_arity = unf_arity , ug_unsat_ok = unSaturatedOk , ug_boring_ok = inlineBoringOk expr } }) -- Note the "unsaturatedOk". A function like f = \ab. a @@ -1175,6 +1175,17 @@ found that the WorkWrap phase thought that y = case x of F# v -> F# (v +# v) was certainlyWillInline, so the addition got duplicated. +Note that we check arityInfo instead of the arity of the unfolding to detect +this case. This is so that we don't accidentally fail to inline small partial +applications, like `f = g 42` (where `g` recurses into `f`) where g has arity 2 +(say). Here there is no risk of work duplication, and the RHS is tiny, so +certainlyWillInline should return True. But `unf_arity` is zero! However f's +arity, gotten from `arityInfo fn_info`, is 1. + +Failing to say that `f` will inline forces W/W to generate a potentially huge +worker for f that will immediately cancel with `g`'s wrapper anyway, causing +unnecessary churn in the Simplifier while arriving at the same result. + Note [certainlyWillInline: INLINABLE] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ certainlyWillInline /must/ return Nothing for a large INLINABLE thing, ===================================== compiler/simplCore/SimplMonad.hs ===================================== @@ -21,7 +21,7 @@ module SimplMonad ( import GhcPrelude -import Var ( Var, isTyVar, mkLocalVar ) +import Var ( Var, isId, mkLocalVar ) import Name ( mkSystemVarName ) import Id ( Id, mkSysLocalOrCoVar ) import IdInfo ( IdDetails(..), vanillaIdInfo, setArityInfo ) @@ -187,7 +187,8 @@ newJoinId bndrs body_ty = do { uniq <- getUniqueM ; let name = mkSystemVarName uniq (fsLit "$j") join_id_ty = mkLamTypes bndrs body_ty -- Note [Funky mkLamTypes] - arity = length (filter (not . isTyVar) bndrs) + -- Note [idArity for join points] in SimplUtils + arity = length (filter isId bndrs) join_arity = length bndrs details = JoinId join_arity id_info = vanillaIdInfo `setArityInfo` arity ===================================== compiler/simplCore/SimplUtils.hs ===================================== @@ -1508,7 +1508,7 @@ tryEtaExpandRhs :: SimplMode -> OutId -> OutExpr -> SimplM (Arity, Bool, OutExpr) -- See Note [Eta-expanding at let bindings] -- If tryEtaExpandRhs rhs = (n, is_bot, rhs') then --- (a) rhs' has manifest arity +-- (a) rhs' has manifest arity n -- (b) if is_bot is True then rhs' applied to n args is guaranteed bottom tryEtaExpandRhs mode bndr rhs | Just join_arity <- isJoinId_maybe bndr @@ -1517,6 +1517,7 @@ tryEtaExpandRhs mode bndr rhs -- Note [Do not eta-expand join points] -- But do return the correct arity and bottom-ness, because -- these are used to set the bndr's IdInfo (#15517) + -- Note [idArity for join points] | otherwise = do { (new_arity, is_bot, new_rhs) <- try_expand @@ -1610,6 +1611,13 @@ CorePrep comes around, the code is very likely to look more like this: $j2 = if n > 0 then $j1 else (...) eta +Note [idArity for join points] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Because of Note [Do not eta-expand join points] we have it that the idArity +of a join point is always (less than or) equal to the join arity. +Essentially, for join points we set `idArity $j = count isId join_lam_bndrs`. +It really can be less if there are type-level binders in join_lam_bndrs. + Note [Do not eta-expand PAPs] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We used to have old_arity = manifestArity rhs, which meant that we ===================================== compiler/stranal/DmdAnal.hs ===================================== @@ -206,7 +206,6 @@ dmdAnal' env dmd (App fun arg) -- , text "overall res dmd_ty =" <+> ppr (res_ty `bothDmdType` arg_ty) ]) (res_ty `bothDmdType` arg_ty, App fun' arg') --- this is an anonymous lambda, since @dmdAnalRhsLetDown@ uses @collectBinders@ dmdAnal' env dmd (Lam var body) | isTyVar var = let @@ -286,10 +285,7 @@ dmdAnal' env dmd (Case scrut case_bndr ty alts) -- This is used for a non-recursive local let without manifest lambdas. -- This is the LetUp rule in the paper “Higher-Order Cardinality Analysis”. dmdAnal' env dmd (Let (NonRec id rhs) body) - | useLetUp id rhs - , Nothing <- unpackTrivial rhs - -- dmdAnalRhsLetDown treats trivial right hand sides specially - -- so if we have a trival right hand side, fall through to that. + | useLetUp id = (final_ty, Let (NonRec id' rhs') body') where (body_ty, body') = dmdAnal env dmd body @@ -582,25 +578,6 @@ environment, which effectively assigns them 'nopSig' (see "getStrictness") -} --- Trivial RHS --- See Note [Demand analysis for trivial right-hand sides] -dmdAnalTrivialRhs :: - AnalEnv -> Id -> CoreExpr -> Var -> - (DmdEnv, Id, CoreExpr) -dmdAnalTrivialRhs env id rhs fn - = (fn_fv, set_idStrictness env id fn_str, rhs) - where - fn_str = getStrictness env fn - fn_fv | isLocalId fn = unitVarEnv fn topDmd - | otherwise = emptyDmdEnv - -- Note [Remember to demand the function itself] - -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -- fn_fv: don't forget to produce a demand for fn itself - -- Lacking this caused #9128 - -- The demand is very conservative (topDmd), but that doesn't - -- matter; trivial bindings are usually inlined, so it only - -- kicks in for top-level bindings and NOINLINE bindings - -- Let bindings can be processed in two ways: -- Down (RHS before body) or Up (body before RHS). -- dmdAnalRhsLetDown implements the Down variant: @@ -621,28 +598,23 @@ dmdAnalRhsLetDown :: TopLevelFlag -- Process the RHS of the binding, add the strictness signature -- to the Id, and augment the environment with the signature as well. dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs - | Just fn <- unpackTrivial rhs -- See Note [Demand analysis for trivial right-hand sides] - = dmdAnalTrivialRhs env id rhs fn - - | otherwise - = (lazy_fv, id', mkLams bndrs' body') + = (lazy_fv, id', rhs') where - (bndrs, body, body_dmd) - = case isJoinId_maybe id of - Just join_arity -- See Note [Demand analysis for join points] - | (bndrs, body) <- collectNBinders join_arity rhs - -> (bndrs, body, let_dmd) - - Nothing | (bndrs, body) <- collectBinders rhs - -> (bndrs, body, mkBodyDmd env body) - - env_body = foldl' extendSigsWithLam env bndrs - (body_ty, body') = dmdAnal env_body body_dmd body - body_ty' = removeDmdTyArgs body_ty -- zap possible deep CPR info - (DmdType rhs_fv rhs_dmds rhs_res, bndrs') - = annotateLamBndrs env (isDFunId id) body_ty' bndrs - sig_ty = mkStrictSig (mkDmdType sig_fv rhs_dmds rhs_res') - id' = set_idStrictness env id sig_ty + rhs_arity = idArity id + rhs_dmd + -- See Note [Demand analysis for join points] + -- See Note [idArity for join points] in SimplUtils + -- rhs_arity matches the join arity of the join point + | isJoinId id + = mkCallDmds rhs_arity let_dmd + | otherwise + -- NB: rhs_arity + -- See Note [Demand signatures are computed for a threshold demand based on idArity] + = mkRhsDmd env rhs_arity rhs + (DmdType rhs_fv rhs_dmds rhs_res, rhs') + = dmdAnal env rhs_dmd rhs + sig = mkStrictSigForArity rhs_arity (mkDmdType sig_fv rhs_dmds rhs_res') + id' = set_idStrictness env id sig -- See Note [NOINLINE and strictness] @@ -666,36 +638,63 @@ dmdAnalRhsLetDown top_lvl rec_flag env let_dmd id rhs || not (isStrictDmd (idDemandInfo id) || ae_virgin env) -- See Note [Optimistic CPR in the "virgin" case] -mkBodyDmd :: AnalEnv -> CoreExpr -> CleanDemand --- See Note [Product demands for function body] -mkBodyDmd env body - = case deepSplitProductType_maybe (ae_fam_envs env) (exprType body) of - Nothing -> cleanEvalDmd - Just (dc, _, _, _) -> cleanEvalProdDmd (dataConRepArity dc) - -unpackTrivial :: CoreExpr -> Maybe Id --- Returns (Just v) if the arg is really equal to v, modulo --- casts, type applications etc --- See Note [Demand analysis for trivial right-hand sides] -unpackTrivial (Var v) = Just v -unpackTrivial (Cast e _) = unpackTrivial e -unpackTrivial (Lam v e) | isTyVar v = unpackTrivial e -unpackTrivial (App e a) | isTypeArg a = unpackTrivial e -unpackTrivial _ = Nothing - --- | If given the RHS of a let-binding, this 'useLetUp' determines --- whether we should process the binding up (body before rhs) or --- down (rhs before body). +-- | @mkRhsDmd env rhs_arity rhs@ creates a 'CleanDemand' for +-- unleashing on the given function's @rhs@, by creating a call demand of +-- @rhs_arity@ with a body demand appropriate for possible product types. +-- See Note [Product demands for function body]. +-- For example, a call of the form @mkRhsDmd _ 2 (\x y -> (x, y))@ returns a +-- clean usage demand of @C1(C1(U(U,U)))@. +mkRhsDmd :: AnalEnv -> Arity -> CoreExpr -> CleanDemand +mkRhsDmd env rhs_arity rhs = + case peelTsFuns rhs_arity (findTypeShape (ae_fam_envs env) (exprType rhs)) of + Just (TsProd tss) -> mkCallDmds rhs_arity (cleanEvalProdDmd (length tss)) + _ -> mkCallDmds rhs_arity cleanEvalDmd + +-- | If given the let-bound 'Id', 'useLetUp' determines whether we should +-- process the binding up (body before rhs) or down (rhs before body). -- --- We use LetDown if there is a chance to get a useful strictness signature. --- This is the case when there are manifest value lambdas or the binding is a --- join point (hence always acts like a function, not a value). -useLetUp :: Var -> CoreExpr -> Bool -useLetUp f _ | isJoinId f = False -useLetUp f (Lam v e) | isTyVar v = useLetUp f e -useLetUp _ (Lam _ _) = False -useLetUp _ _ = True - +-- We use LetDown if there is a chance to get a useful strictness signature to +-- unleash at call sites. LetDown is generally more precise than LetUp if we can +-- correctly guess how it will be used in the body, that is, for which incoming +-- demand the strictness signature should be computed, which allows us to +-- unleash higher-order demands on arguments at call sites. This is mostly the +-- case when +-- +-- * The binding takes any arguments before performing meaningful work (cf. +-- 'idArity'), in which case we are interested to see how it uses them. +-- * The binding is a join point, hence acting like a function, not a value. +-- As a big plus, we know *precisely* how it will be used in the body; since +-- it's always tail-called, we can directly unleash the incoming demand of +-- the let binding on its RHS when computing a strictness signature. See +-- [Demand analysis for join points]. +-- +-- Thus, if the binding is not a join point and its arity is 0, we have a thunk +-- and use LetUp, implying that we have no usable demand signature available +-- when we analyse the let body. +-- +-- Since thunk evaluation is memoised, we want to unleash its 'DmdEnv' of free +-- vars at most once, regardless of how many times it was forced in the body. +-- This makes a real difference wrt. usage demands. The other reason is being +-- able to unleash a more precise product demand on its RHS once we know how the +-- thunk was used in the let body. +-- +-- Characteristic examples, always assuming a single evaluation: +-- +-- * @let x = 2*y in x + x@ => LetUp. Compared to LetDown, we find out that +-- the expression uses @y@ at most once. +-- * @let x = (a,b) in fst x@ => LetUp. Compared to LetDown, we find out that +-- @b@ is absent. +-- * @let f x = x*2 in f y@ => LetDown. Compared to LetUp, we find out that +-- the expression uses @y@ strictly, because we have @f@'s demand signature +-- available at the call site. +-- * @join exit = 2*y in if a then exit else if b then exit else 3*y@ => +-- LetDown. Compared to LetUp, we find out that the expression uses @y@ +-- strictly, because we can unleash @exit@'s signature at each call site. +-- * For a more convincing example with join points, see Note [Demand analysis +-- for join points]. +-- +useLetUp :: Var -> Bool +useLetUp f = idArity f == 0 && not (isJoinId f) {- Note [Demand analysis for join points] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -728,22 +727,141 @@ let_dmd here). Another win for join points! #13543. +Note [Demand signatures are computed for a threshold demand based on idArity] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We compute demand signatures assuming idArity incoming arguments to approximate +behavior for when we have a call site with at least that many arguments. idArity +is /at least/ the number of manifest lambdas, but might be higher for PAPs and +trivial RHS (see Note [Demand analysis for trivial right-hand sides]). + +Because idArity of a function varies independently of its cardinality properties +(cf. Note [idArity varies independently of dmdTypeDepth]), we implicitly encode +the arity for when a demand signature is sound to unleash in its 'dmdTypeDepth' +(cf. Note [Understanding DmdType and StrictSig] in Demand). It is unsound to +unleash a demand signature when the incoming number of arguments is less than +that. See Note [What are demand signatures?] for more details on soundness. + +Why idArity arguments? Because that's a conservative estimate of how many +arguments we must feed a function before it does anything interesting with them. +Also it elegantly subsumes the trivial RHS and PAP case. + +There might be functions for which we might want to analyse for more incoming +arguments than idArity. Example: + + f x = + if expensive + then \y -> ... y ... + else \y -> ... y ... + +We'd analyse `f` under a unary call demand C(S), corresponding to idArity +being 1. That's enough to look under the manifest lambda and find out how a +unary call would use `x`, but not enough to look into the lambdas in the if +branches. + +On the other hand, if we analysed for call demand C(C(S)), we'd get useful +strictness info for `y` (and more precise info on `x`) and possibly CPR +information, but + + * We would no longer be able to unleash the signature at unary call sites + * Performing the worker/wrapper split based on this information would be + implicitly eta-expanding `f`, playing fast and loose with divergence and + even being unsound in the presence of newtypes, so we refrain from doing so. + Also see Note [Don't eta expand in w/w] in WorkWrap. + +Since we only compute one signature, we do so for arity 1. Computing multiple +signatures for different arities (i.e., polyvariance) would be entirely +possible, if it weren't for the additional runtime and implementation +complexity. + +Note [idArity varies independently of dmdTypeDepth] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We used to check in CoreLint that dmdTypeDepth <= idArity for a let-bound +identifier. But that means we would have to zap demand signatures every time we +reset or decrease arity. That's an unnecessary dependency, because + + * The demand signature captures a semantic property that is independent of + what the binding's current arity is + * idArity is analysis information itself, thus volatile + * We already *have* dmdTypeDepth, wo why not just use it to encode the + threshold for when to unleash the signature + (cf. Note [Understanding DmdType and StrictSig] in Demand) + +Consider the following expression, for example: + + (let go x y = `x` seq ... in go) |> co + +`go` might have a strictness signature of ``. The simplifier will identify +`go` as a nullary join point through `joinPointBinding_maybe` and float the +coercion into the binding, leading to an arity decrease: + + join go = (\x y -> `x` seq ...) |> co in go + +With the CoreLint check, we would have to zap `go`'s perfectly viable strictness +signature. + +Note [What are demand signatures?] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand analysis interprets expressions in the abstract domain of demand +transformers. Given an incoming demand we put an expression under, its abstract +transformer gives us back a demand type denoting how other things (like +arguments and free vars) were used when the expression was evaluated. +Here's an example: + + f x y = + if x + expensive + then \z -> z + y * ... + else \z -> z * ... + +The abstract transformer (let's call it F_e) of the if expression (let's call it +e) would transform an incoming head demand into a demand type like +{x->,y->}. In pictures: + + Demand ---F_e---> DmdType + {x->,y->} + +Let's assume that the demand transformers we compute for an expression are +correct wrt. to some concrete semantics for Core. How do demand signatures fit +in? They are strange beasts, given that they come with strict rules when to +it's sound to unleash them. + +Fortunately, we can formalise the rules with Galois connections. Consider +f's strictness signature, {}. It's a single-point approximation of +the actual abstract transformer of f's RHS for arity 2. So, what happens is that +we abstract *once more* from the abstract domain we already are in, replacing +the incoming Demand by a simple lattice with two elements denoting incoming +arity: A_2 = {<2, >=2} (where '<2' is the top element and >=2 the bottom +element). Here's the diagram: + + A_2 -----f_f----> DmdType + ^ | + | α γ | + | v + Demand ---F_f---> DmdType + +With + α(C1(C1(_))) = >=2 -- example for usage demands, but similar for strictness + α(_) = <2 + γ(ty) = ty +and F_f being the abstract transformer of f's RHS and f_f being the abstracted +abstract transformer computable from our demand signature simply by + + f_f(>=2) = {} + f_f(<2) = postProcessUnsat {} + +where postProcessUnsat makes a proper top element out of the given demand type. + Note [Demand analysis for trivial right-hand sides] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider - foo = plusInt |> co + foo = plusInt |> co where plusInt is an arity-2 function with known strictness. Clearly we want plusInt's strictness to propagate to foo! But because it has no manifest lambdas, it won't do so automatically, and indeed 'co' might -have type (Int->Int->Int) ~ T, so we *can't* eta-expand. So we have a -special case for right-hand sides that are "trivial", namely variables, -casts, type applications, and the like. +have type (Int->Int->Int) ~ T. -Note that this can mean that 'foo' has an arity that is smaller than that -indicated by its demand info. e.g. if co :: (Int->Int->Int) ~ T, then -foo's arity will be zero (see Note [exprArity invariant] in CoreArity), -but its demand signature will be that of plusInt. A small example is the -test case of #8963. +Fortunately, CoreArity gives 'foo' arity 2, which is enough for LetDown to +forward plusInt's demand signature, and all is well (see Note [Newtype arity] in +CoreArity)! A small example is the test case NewtypeArity. Note [Product demands for function body] @@ -841,13 +959,6 @@ annotateBndr env dmd_ty var where (dmd_ty', dmd) = findBndrDmd env False dmd_ty var -annotateLamBndrs :: AnalEnv -> DFunFlag -> DmdType -> [Var] -> (DmdType, [Var]) -annotateLamBndrs env args_of_dfun ty bndrs = mapAccumR annotate ty bndrs - where - annotate dmd_ty bndr - | isId bndr = annotateLamIdBndr env args_of_dfun dmd_ty bndr - | otherwise = (dmd_ty, bndr) - annotateLamIdBndr :: AnalEnv -> DFunFlag -- is this lambda at the top of the RHS of a dfun? -> DmdType -- Demand type of body @@ -1160,12 +1271,6 @@ extendSigEnv top_lvl sigs var sig = extendVarEnv sigs var (sig, top_lvl) lookupSigEnv :: AnalEnv -> Id -> Maybe (StrictSig, TopLevelFlag) lookupSigEnv env id = lookupVarEnv (ae_sigs env) id -getStrictness :: AnalEnv -> Id -> StrictSig -getStrictness env fn - | isGlobalId fn = idStrictness fn - | Just (sig, _) <- lookupSigEnv env fn = sig - | otherwise = nopSig - nonVirgin :: AnalEnv -> AnalEnv nonVirgin env = env { ae_virgin = False } ===================================== compiler/stranal/WorkWrap.hs ===================================== @@ -9,6 +9,7 @@ module WorkWrap ( wwTopBinds ) where import GhcPrelude +import CoreArity ( manifestArity ) import CoreSyn import CoreUnfold ( certainlyWillInline, mkWwInlineRule, mkWorkerUnfolding ) import CoreUtils ( exprType, exprIsHNF ) @@ -457,7 +458,7 @@ tryWW dflags fam_envs is_rec fn_id rhs -- See Note [Don't w/w INLINE things] -- See Note [Don't w/w inline small non-loop-breaker things] - | is_fun + | is_fun && is_eta_exp = splitFun dflags fam_envs new_fn_id fn_info wrap_dmds res_info rhs | is_thunk -- See Note [Thunk splitting] @@ -474,9 +475,11 @@ tryWW dflags fam_envs is_rec fn_id rhs -- See Note [Zapping DmdEnv after Demand Analyzer] and -- See Note [Zapping Used Once info in WorkWrap] - is_fun = notNull wrap_dmds || isJoinId fn_id - is_thunk = not is_fun && not (exprIsHNF rhs) && not (isJoinId fn_id) - && not (isUnliftedType (idType fn_id)) + is_fun = notNull wrap_dmds || isJoinId fn_id + -- See Note [Don't eta expand in w/w] + is_eta_exp = length wrap_dmds == manifestArity rhs + is_thunk = not is_fun && not (exprIsHNF rhs) && not (isJoinId fn_id) + && not (isUnliftedType (idType fn_id)) {- Note [Zapping DmdEnv after Demand Analyzer] @@ -516,6 +519,36 @@ want to _keep_ the info for the code generator). We do not do it in the demand analyser for the same reasons outlined in Note [Zapping DmdEnv after Demand Analyzer] above. + +Note [Don't eta expand in w/w] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A binding where the manifestArity of the RHS is less than idArity of the binder +means CoreArity didn't eta expand that binding. When this happens, it does so +for a reason (see Note [exprArity invariant] in CoreArity) and we probably have +a PAP, cast or trivial expression as RHS. + +Performing the worker/wrapper split will implicitly eta-expand the binding to +idArity, overriding CoreArity's decision. Other than playing fast and loose with +divergence, it's also broken for newtypes: + + f = (\xy.blah) |> co + where + co :: (Int -> Int -> Char) ~ T + +Then idArity is 2 (despite the type T), and it can have a StrictSig based on a +threshold of 2. But we can't w/w it without a type error. + +The situation is less grave for PAPs, but the implicit eta expansion caused a +compiler allocation regression in T15164, where huge recursive instance method +groups, mostly consisting of PAPs, got w/w'd. This caused great churn in the +simplifier, when simply waiting for the PAPs to inline arrived at the same +output program. + +Note there is the worry here that such PAPs and trivial RHSs might not *always* +be inlined. That would lead to reboxing, because the analysis tacitly assumes +that we W/W'd for idArity and will propagate analysis information under that +assumption. So far, this doesn't seem to matter in practice. +See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_192064. -} ===================================== compiler/stranal/WwLib.hs ===================================== @@ -134,7 +134,7 @@ mkWwBodies :: DynFlags -- wrap_fn_str E = case x of { (a,b) -> -- case a of { (a1,a2) -> -- E a1 a2 b y }} --- work_fn_str E = \a2 a2 b y -> +-- work_fn_str E = \a1 a2 b y -> -- let a = (a1,a2) in -- let x = (a,b) in -- E ===================================== testsuite/tests/perf/compiler/WWRec.hs ===================================== @@ -0,0 +1,73 @@ +{-# LANGUAGE MultiParamTypeClasses #-} +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE FlexibleInstances #-} +{-# LANGUAGE UndecidableInstances #-} + +module WWRec where + +class Rule f a where + get :: Decorator f => f a +class Monad f => Decorator f where + foo :: Rule f a => f a + +data A1 = MkA1 A2 +data A2 = MkA2 A3 +data A3 = MkA3 A4 +data A4 = MkA4 A5 +data A5 = MkA5 A6 +data A6 = MkA6 A7 +data A7 = MkA7 A8 +data A8 = MkA8 A9 +data A9 = MkA9 A10 +data A10 = MkA10 A11 +data A11 = MkA11 A12 +data A12 = MkA12 A13 +data A13 = MkA13 A14 +data A14 = MkA14 A15 +data A15 = MkA15 A16 +data A16 = MkA16 A17 +data A17 = MkA17 A18 +data A18 = MkA18 A19 +data A19 = MkA19 A20 +data A20 = MkA20 A21 +data A21 = MkA21 A22 +data A22 = MkA22 A23 +data A23 = MkA23 A24 +data A24 = MkA24 A25 +data A25 = MkA25 A26 +data A26 = MkA26 A27 +data A27 = MkA27 A28 +data A28 = MkA28 A29 +data A29 = MkA29 A30 +data A30 = MkA30 A1 + +instance Rule f A2 => Rule f A1 where get = MkA1 <$> foo +instance Rule f A3 => Rule f A2 where get = MkA2 <$> foo +instance Rule f A4 => Rule f A3 where get = MkA3 <$> foo +instance Rule f A5 => Rule f A4 where get = MkA4 <$> foo +instance Rule f A6 => Rule f A5 where get = MkA5 <$> foo +instance Rule f A7 => Rule f A6 where get = MkA6 <$> foo +instance Rule f A8 => Rule f A7 where get = MkA7 <$> foo +instance Rule f A9 => Rule f A8 where get = MkA8 <$> foo +instance Rule f A10 => Rule f A9 where get = MkA9 <$> foo +instance Rule f A11 => Rule f A10 where get = MkA10 <$> foo +instance Rule f A12 => Rule f A11 where get = MkA11 <$> foo +instance Rule f A13 => Rule f A12 where get = MkA12 <$> foo +instance Rule f A14 => Rule f A13 where get = MkA13 <$> foo +instance Rule f A15 => Rule f A14 where get = MkA14 <$> foo +instance Rule f A16 => Rule f A15 where get = MkA15 <$> foo +instance Rule f A17 => Rule f A16 where get = MkA16 <$> foo +instance Rule f A18 => Rule f A17 where get = MkA17 <$> foo +instance Rule f A19 => Rule f A18 where get = MkA18 <$> foo +instance Rule f A20 => Rule f A19 where get = MkA19 <$> foo +instance Rule f A21 => Rule f A20 where get = MkA20 <$> foo +instance Rule f A22 => Rule f A21 where get = MkA21 <$> foo +instance Rule f A23 => Rule f A22 where get = MkA22 <$> foo +instance Rule f A24 => Rule f A23 where get = MkA23 <$> foo +instance Rule f A25 => Rule f A24 where get = MkA24 <$> foo +instance Rule f A26 => Rule f A25 where get = MkA25 <$> foo +instance Rule f A27 => Rule f A26 where get = MkA26 <$> foo +instance Rule f A28 => Rule f A27 where get = MkA27 <$> foo +instance Rule f A29 => Rule f A28 where get = MkA28 <$> foo +instance Rule f A30 => Rule f A29 where get = MkA29 <$> foo +instance Rule f A1 => Rule f A30 where get = MkA30 <$> foo ===================================== testsuite/tests/perf/compiler/all.T ===================================== @@ -393,6 +393,13 @@ test ('T15164', compile, ['-v0 -O']) +# See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_186960 +test ('WWRec', + [ collect_compiler_stats('bytes allocated',10) + ], + compile, + ['-v0 -O']) + test('T16190', [ collect_stats(), when(opsys('mingw32'), expect_broken(16389)) ===================================== testsuite/tests/stranal/sigs/NewtypeArity.hs ===================================== @@ -0,0 +1,10 @@ +-- | 't' and 't2' should have a strictness signature for arity 2 here. +module Test where + +newtype T = MkT (Int -> Int -> Int) + +t :: T +t = MkT (\a b -> a + b) + +t2 :: T +t2 = MkT (+) ===================================== testsuite/tests/stranal/sigs/NewtypeArity.stderr ===================================== @@ -0,0 +1,18 @@ + +==================== Strictness signatures ==================== +Test.$tc'MkT: m +Test.$tcT: m +Test.$trModule: m +Test.t: m +Test.t2: m + + + +==================== Strictness signatures ==================== +Test.$tc'MkT: m +Test.$tcT: m +Test.$trModule: m +Test.t: m +Test.t2: m + + ===================================== testsuite/tests/stranal/sigs/all.T ===================================== @@ -17,3 +17,4 @@ test('BottomFromInnerLambda', normal, compile, ['']) test('DmdAnalGADTs', normal, compile, ['']) test('T12370', normal, compile, ['']) test('CaseBinderCPR', normal, compile, ['']) +test('NewtypeArity', normal, compile, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e00de39929e12cbbc0bb2875d8f454359c967410 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e00de39929e12cbbc0bb2875d8f454359c967410 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 08:53:05 2019 From: gitlab at gitlab.haskell.org (Vladislav Zavialov) Date: Mon, 08 Apr 2019 04:53:05 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/top-level-kind-signatures Message-ID: <5cab0bf1cf9c5_62b33fa2cd318fa82231317@gitlab.haskell.org.mail> Vladislav Zavialov pushed new branch wip/top-level-kind-signatures at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/top-level-kind-signatures You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 14:36:47 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Mon, 08 Apr 2019 10:36:47 -0400 Subject: [Git][ghc/ghc][wip/slowtest] testsuite: Mark threadstatus-T9333 as fragile in ghci way Message-ID: <5cab5c7f60a0e_62b33fa27606c928225905d@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: a1927cb0 by Ben Gamari at 2019-04-08T14:35:51Z testsuite: Mark threadstatus-T9333 as fragile in ghci way As noted in #16555. - - - - - 1 changed file: - testsuite/tests/concurrent/should_run/all.T Changes: ===================================== testsuite/tests/concurrent/should_run/all.T ===================================== @@ -88,7 +88,7 @@ test('T7970', normal, compile_and_run, ['']) test('AtomicPrimops', normal, compile_and_run, ['']) # test uses 2 threads and yield, scheduling can vary with threaded2 -test('threadstatus-9333', [omit_ways(['threaded2'])], compile_and_run, ['']) +test('threadstatus-9333', [fragile_for(16555, ['ghci']), omit_ways(['threaded2'])], compile_and_run, ['']) test('T9379', normal, compile_and_run, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a1927cb0269d1231bd8086d17f812cc38d8330e5 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a1927cb0269d1231bd8086d17f812cc38d8330e5 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 17:47:52 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Mon, 08 Apr 2019 13:47:52 -0400 Subject: [Git][ghc/ghc][ghc-8.6] Move 8.6.5-notes.rst to docs/users_guide Message-ID: <5cab894869d3d_62b33fa24e567cac2270451@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.6 at Glasgow Haskell Compiler / GHC Commits: 4c7d3228 by Ben Gamari at 2019-04-08T17:47:04Z Move 8.6.5-notes.rst to docs/users_guide - - - - - 1 changed file: - 8.6.5-notes.rst → docs/users_guide/8.6.5-notes.rst Changes: ===================================== 8.6.5-notes.rst → docs/users_guide/8.6.5-notes.rst ===================================== View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4c7d32280b6fbea95af12432deaf0c9c57be6ad7 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4c7d32280b6fbea95af12432deaf0c9c57be6ad7 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 18:19:34 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 14:19:34 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 12 commits: Fix #16282. Message-ID: <5cab90b67929d_62b33fa2b313a49422851c8@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 3025e427 by Michal Terepeta at 2019-04-08T18:19:13Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - d487cba1 by Phuong Trinh at 2019-04-08T18:19:15Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 79ef852c by Yuriy Syrovetskiy at 2019-04-08T18:19:16Z Add `-optcxx` option (#16477) - - - - - 7df5aedf by Ben Gamari at 2019-04-08T18:19:16Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - f4814f35 by Yuriy Syrovetskiy at 2019-04-08T18:19:17Z Fix whitespace style - - - - - ae8251b1 by Matthew Pickering at 2019-04-08T18:19:17Z Use ./hadrian/ghci.sh in .ghcid - - - - - 6665a30d by Sebastian Graf at 2019-04-08T18:19:18Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - c2f2734e by Ömer Sinan Ağacan at 2019-04-08T18:19:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 0eebc0a7 by Ömer Sinan Ağacan at 2019-04-08T18:19:23Z testsuite: Show exit code of GHCi tests on failure - - - - - 27f3939f by John Ericson at 2019-04-08T18:19:24Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - 27d0db37 by Ryan Scott at 2019-04-08T18:19:26Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - libraries/ghc-prim/cbits/bitrev.c - libraries/hpc - settings.in - testsuite/driver/testlib.py - + testsuite/tests/codeGen/should_run/NewSmallArray.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.stdout - testsuite/tests/codeGen/should_run/all.T - + testsuite/tests/driver/T16500/A.hs - + testsuite/tests/driver/T16500/B.hs - + testsuite/tests/driver/T16500/Makefile - + testsuite/tests/driver/T16500/T16500.stdout - + testsuite/tests/driver/T16500/all.T - testsuite/tests/ffi/should_compile/all.T - + testsuite/tests/ffi/should_compile/cc017.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f4fc54c12b2d6a25497fab30dbef2a0d94183f55...27d0db377d53b46a4861dc977010049f6eb6088e -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f4fc54c12b2d6a25497fab30dbef2a0d94183f55...27d0db377d53b46a4861dc977010049f6eb6088e You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 18:35:41 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 14:35:41 -0400 Subject: [Git][ghc/ghc][master] Generate straightline code for inline array allocation Message-ID: <5cab947d7fac_62b33fa2cd1d76082307583@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 4 changed files: - compiler/codeGen/StgCmmPrim.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.stdout - testsuite/tests/codeGen/should_run/all.T Changes: ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2105,17 +2105,11 @@ doNewArrayOp res_r rep info payload n init = do -- Initialise all elements of the array p <- assignTemp $ cmmOffsetB dflags (CmmReg arr) (hdrSize dflags rep) - for <- newBlockId - emitLabel for - let loopBody = - [ mkStore (CmmReg (CmmLocal p)) init - , mkAssign (CmmLocal p) (cmmOffsetW dflags (CmmReg (CmmLocal p)) 1) - , mkBranch for ] - emit =<< mkCmmIfThen - (cmmULtWord dflags (CmmReg (CmmLocal p)) - (cmmOffsetW dflags (CmmReg arr) - (hdrSizeW dflags rep + n))) - (catAGraphs loopBody) + let initialization = + [ mkStore (cmmOffsetW dflags (CmmReg (CmmLocal p)) off) init + | off <- [0.. n - 1] + ] + emit (catAGraphs initialization) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) ===================================== testsuite/tests/codeGen/should_run/NewSmallArray.hs ===================================== @@ -0,0 +1,96 @@ +{-# LANGUAGE BangPatterns #-} +{-# LANGUAGE MagicHash #-} +{-# LANGUAGE UnboxedTuples #-} + +-- Tests for creating and initializing a @SmallArray#@ including the +-- optimiziation where GHC inlines the code instead of calling the +-- @newSmallArray#@ primop if the length is small enough and known at compile +-- time. +module Main where + +import GHC.Exts +import GHC.ST + +import Control.Monad (forM_) + + +main :: IO () +main = do + let !a00 = newSmallArrayWith42 0 + !a01 = newSmallArrayWith42 1 + !a02 = newSmallArrayWith42 2 + !a03 = newSmallArrayWith42 3 + !a04 = newSmallArrayWith42 4 + !a05 = newSmallArrayWith42 5 + !a06 = newSmallArrayWith42 6 + !a07 = newSmallArrayWith42 7 + !a08 = newSmallArrayWith42 8 + !a09 = newSmallArrayWith42 9 + !a10 = newSmallArrayWith42 10 + !a11 = newSmallArrayWith42 11 + !a12 = newSmallArrayWith42 12 + !a13 = newSmallArrayWith42 13 + !a14 = newSmallArrayWith42 14 + !a15 = newSmallArrayWith42 15 + !a16 = newSmallArrayWith42 16 + !a17 = newSmallArrayWith42 17 + !a18 = newSmallArrayWith42 18 + !a19 = newSmallArrayWith42 19 + !a20 = newSmallArrayWith42 20 + !a21 = newSmallArrayWith42 21 + !a22 = newSmallArrayWith42 22 + !a23 = newSmallArrayWith42 23 + !a24 = newSmallArrayWith42 24 + !a25 = newSmallArrayWith42 25 + !a26 = newSmallArrayWith42 26 + !a27 = newSmallArrayWith42 27 + !a28 = newSmallArrayWith42 28 + !a29 = newSmallArrayWith42 29 + !a30 = newSmallArrayWith42 30 + !a31 = newSmallArrayWith42 31 + !a32 = newSmallArrayWith42 32 + !a33 = newSmallArrayWith42 33 + !a34 = newSmallArrayWith42 34 + !a35 = newSmallArrayWith42 35 + !a36 = newSmallArrayWith42 36 + !a37 = newSmallArrayWith42 37 + !a38 = newSmallArrayWith42 38 + !a39 = newSmallArrayWith42 39 + !all = [ a00, a01, a02, a03, a04, a05, a06, a07, a08, a09 + , a10, a11, a12, a13, a14, a15, a16, a17, a18, a19 + , a20, a21, a22, a23, a24, a25, a26, a27, a28, a29 + , a30, a31, a32, a33, a34, a35, a36, a37, a38, a39 + ] + forM_ all (print . toListArray) + + +data Array a = Array { unArray :: SmallArray# a } + +newSmallArrayWith42 :: Int -> Array Int +newSmallArrayWith42 n = (runST (newArray n 42)) +-- inline to make sure the length is known at compile time +{-# INLINE newSmallArrayWith42 #-} + +newArray :: Int -> a -> ST s (Array a) +newArray (I# n#) a = ST $ \s1# -> case newSmallArray# n# a s1# of + (# s2#, marr# #) -> case unsafeFreezeSmallArray# marr# s2# of + (# s3#, arr# #) -> (# s3#, Array arr# #) +-- inline to make sure the length is known at compile time +{-# INLINE newArray #-} + +toListArray :: Array a -> [a] +toListArray arr = go 0 + where + go i | i >= lengthArray arr = [] + | otherwise = indexArray arr i : go (i+1) + +indexArray :: Array a -> Int -> a +indexArray arr i@(I# i#) + | i < 0 || i >= len = + error $ "bounds error, offset " ++ show i ++ ", length " ++ show len + | otherwise = case indexSmallArray# (unArray arr) i# of + (# a #) -> a + where len = lengthArray arr + +lengthArray :: Array a -> Int +lengthArray arr = I# (sizeofSmallArray# (unArray arr)) ===================================== testsuite/tests/codeGen/should_run/NewSmallArray.stdout ===================================== @@ -0,0 +1,40 @@ +[] +[42] +[42,42] +[42,42,42] +[42,42,42,42] +[42,42,42,42,42] +[42,42,42,42,42,42] +[42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] +[42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42,42] ===================================== testsuite/tests/codeGen/should_run/all.T ===================================== @@ -137,6 +137,7 @@ test('StaticByteArraySize', normal, compile_and_run, ['-O2']) test('CopySmallArray', normal, compile_and_run, ['']) test('CopySmallArrayStressTest', reqlib('random'), compile_and_run, ['']) test('SizeOfSmallArray', normal, compile_and_run, ['']) +test('NewSmallArray', normal, compile_and_run, ['']) test('T9001', normal, compile_and_run, ['']) test('T9013', omit_ways(['ghci']), # ghci doesn't support unboxed tuples compile_and_run, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/63b7d5fb9d695dafc243cbf6f9f70b06030c0dea -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/63b7d5fb9d695dafc243cbf6f9f70b06030c0dea You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 18:41:48 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 14:41:48 -0400 Subject: [Git][ghc/ghc][master] Fix #16500: look for interface files in -hidir flag in OneShot mode Message-ID: <5cab95ec3d52c_62b33fa2cd9f8d642311356@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 7 changed files: - compiler/main/Finder.hs - docs/users_guide/separate_compilation.rst - + testsuite/tests/driver/T16500/A.hs - + testsuite/tests/driver/T16500/B.hs - + testsuite/tests/driver/T16500/Makefile - + testsuite/tests/driver/T16500/T16500.stdout - + testsuite/tests/driver/T16500/all.T Changes: ===================================== compiler/main/Finder.hs ===================================== @@ -313,8 +313,10 @@ findInstalledHomeModule hsc_env mod_name = , ("lhsig", mkHomeModLocationSearched dflags mod_name "lhsig") ] - hi_exts = [ (hisuf, mkHiOnlyModLocation dflags hisuf) - , (addBootSuffix hisuf, mkHiOnlyModLocation dflags hisuf) + -- we use mkHomeModHiOnlyLocation instead of mkHiOnlyModLocation so that + -- when hiDir field is set in dflags, we know to look there (see #16500) + hi_exts = [ (hisuf, mkHomeModHiOnlyLocation dflags mod_name) + , (addBootSuffix hisuf, mkHomeModHiOnlyLocation dflags mod_name) ] -- In compilation manager modes, we look for source files in the home @@ -489,6 +491,15 @@ mkHomeModLocation2 dflags mod src_basename ext = do ml_obj_file = obj_fn, ml_hie_file = hie_fn }) +mkHomeModHiOnlyLocation :: DynFlags + -> ModuleName + -> FilePath + -> BaseName + -> IO ModLocation +mkHomeModHiOnlyLocation dflags mod path basename = do + loc <- mkHomeModLocation2 dflags mod (path basename) "" + return loc { ml_hs_file = Nothing } + mkHiOnlyModLocation :: DynFlags -> Suffix -> FilePath -> String -> IO ModLocation mkHiOnlyModLocation dflags hisuf path basename ===================================== docs/users_guide/separate_compilation.rst ===================================== @@ -260,6 +260,9 @@ Redirecting the compilation output(s) example, they would still be put in ``parse/Foo.hi``, ``parse/Bar.hi``, and ``gurgle/Bumble.hi``. + Please also note that when doing incremental compilation, this directory is + where GHC looks into to find object files from previous builds. + .. ghc-flag:: -ohi ⟨file⟩ :shortdesc: set the filename in which to put the interface :type: dynamic @@ -288,6 +291,10 @@ Redirecting the compilation output(s) Redirects all generated interface files into ⟨dir⟩, instead of the default. + Please also note that when doing incremental compilation (by ``ghc --make`` + or ``ghc -c``), this directory is where GHC looks into to find interface + files. + .. ghc-flag:: -hiedir ⟨dir⟩ :shortdesc: set directory for extended interface files :type: dynamic @@ -296,6 +303,10 @@ Redirecting the compilation output(s) Redirects all generated extended interface files into ⟨dir⟩, instead of the default. + Please also note that when doing incremental compilation (by ``ghc --make`` + or ``ghc -c``), this directory is where GHC looks into to find extended + interface files. + .. ghc-flag:: -stubdir ⟨dir⟩ :shortdesc: redirect FFI stub files :type: dynamic ===================================== testsuite/tests/driver/T16500/A.hs ===================================== @@ -0,0 +1,4 @@ +module A (message) where + +message :: String +message = "Hello!!" ===================================== testsuite/tests/driver/T16500/B.hs ===================================== @@ -0,0 +1,7 @@ +module B where + +import A (message) + +main :: IO () +main = do + putStrLn message ===================================== testsuite/tests/driver/T16500/Makefile ===================================== @@ -0,0 +1,12 @@ +TOP=../../.. +include $(TOP)/mk/boilerplate.mk +include $(TOP)/mk/test.mk + +T16500: + $(RM) -rf interfaces objects + $(RM) A.hi + mkdir -p interfaces + mkdir -p objects + "$(TEST_HC)" $(TEST_HC_OPTS) -v0 -c A.hs -iinterfaces -hidir interfaces -odir objects + touch A.hi + "$(TEST_HC)" $(TEST_HC_OPTS) -v0 -c B.hs -iinterfaces -hidir interfaces -odir objects ===================================== testsuite/tests/driver/T16500/T16500.stdout ===================================== ===================================== testsuite/tests/driver/T16500/all.T ===================================== @@ -0,0 +1 @@ +test('T16500', [extra_files(['A.hs','B.hs',]),], run_command, ['$MAKE -s --no-print-directory T16500']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/2b3f4718502465e2b4dfa4a7868ed7a3ad5e4ff1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/2b3f4718502465e2b4dfa4a7868ed7a3ad5e4ff1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 18:47:55 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 14:47:55 -0400 Subject: [Git][ghc/ghc][master] Add `-optcxx` option (#16477) Message-ID: <5cab975b37740_62b33fa2e6b550f4231504e@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 12 changed files: - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - docs/users_guide/phases.rst - testsuite/tests/ffi/should_compile/all.T - + testsuite/tests/ffi/should_compile/cc017.hs - testsuite/tests/ffi/should_run/all.T - testsuite/tests/th/T13366.hs Changes: ===================================== compiler/main/DriverPhases.hs ===================================== @@ -369,4 +369,3 @@ isHaskellSigFilename f = isHaskellSigSuffix (drop 1 $ takeExtension f) isObjectFilename, isDynLibFilename :: Platform -> FilePath -> Bool isObjectFilename platform f = isObjectSuffix platform (drop 1 $ takeExtension f) isDynLibFilename platform f = isDynLibSuffix platform (drop 1 $ takeExtension f) - ===================================== compiler/main/DriverPipeline.hs ===================================== @@ -1218,17 +1218,8 @@ runPhase (RealPhase cc_phase) input_fn dflags ghcVersionH <- liftIO $ getGhcVersionPathName dflags - let gcc_lang_opt | cc_phase `eqPhase` Ccxx = "c++" - | cc_phase `eqPhase` Cobjc = "objective-c" - | cc_phase `eqPhase` Cobjcxx = "objective-c++" - | otherwise = "c" - liftIO $ SysTools.runCc dflags ( - -- force the C compiler to interpret this file as C when - -- compiling .hc files, by adding the -x c option. - -- Also useful for plain .c files, just in case GHC saw a - -- -x c option. - [ SysTools.Option "-x", SysTools.Option gcc_lang_opt - , SysTools.FileOption "" input_fn + liftIO $ SysTools.runCc (phaseForeignLanguage cc_phase) dflags ( + [ SysTools.FileOption "" input_fn , SysTools.Option "-o" , SysTools.FileOption "" output_fn ] @@ -1917,7 +1908,7 @@ doCpp dflags raw input_fn output_fn = do let verbFlags = getVerbFlags dflags let cpp_prog args | raw = SysTools.runCpp dflags args - | otherwise = SysTools.runCc dflags (SysTools.Option "-E" : args) + | otherwise = SysTools.runCc Nothing dflags (SysTools.Option "-E" : args) let target_defs = [ "-D" ++ HOST_OS ++ "_BUILD_OS", ===================================== compiler/main/DynFlags.hs ===================================== @@ -92,7 +92,8 @@ module DynFlags ( extraGccViaCFlags, systemPackageConfig, pgm_L, pgm_P, pgm_F, pgm_c, pgm_a, pgm_l, pgm_dll, pgm_T, pgm_windres, pgm_libtool, pgm_ar, pgm_ranlib, pgm_lo, pgm_lc, - pgm_lcc, pgm_i, opt_L, opt_P, opt_F, opt_c, opt_a, opt_l, opt_i, + pgm_lcc, pgm_i, + opt_L, opt_P, opt_F, opt_c, opt_cxx, opt_a, opt_l, opt_i, opt_P_signature, opt_windres, opt_lo, opt_lc, opt_lcc, @@ -1340,6 +1341,7 @@ data Settings = Settings { -- See Note [Repeated -optP hashing] sOpt_F :: [String], sOpt_c :: [String], + sOpt_cxx :: [String], sOpt_a :: [String], sOpt_l :: [String], sOpt_windres :: [String], @@ -1423,6 +1425,8 @@ opt_F dflags = sOpt_F (settings dflags) opt_c :: DynFlags -> [String] opt_c dflags = concatMap (wayOptc (targetPlatform dflags)) (ways dflags) ++ sOpt_c (settings dflags) +opt_cxx :: DynFlags -> [String] +opt_cxx dflags = sOpt_cxx (settings dflags) opt_a :: DynFlags -> [String] opt_a dflags = sOpt_a (settings dflags) opt_l :: DynFlags -> [String] @@ -2520,7 +2524,7 @@ setObjectDir, setHiDir, setHieDir, setStubDir, setDumpDir, setOutputDir, setDynObjectSuf, setDynHiSuf, setDylibInstallName, setObjectSuf, setHiSuf, setHieSuf, setHcSuf, parseDynLibLoaderMode, - setPgmP, addOptl, addOptc, addOptP, + setPgmP, addOptl, addOptc, addOptcxx, addOptP, addCmdlineFramework, addHaddockOpts, addGhciScript, setInteractivePrint :: String -> DynFlags -> DynFlags @@ -2636,6 +2640,7 @@ setDumpPrefixForce f d = d { dumpPrefixForce = f} setPgmP f = let (pgm:args) = words f in alterSettings (\s -> s { sPgm_P = (pgm, map Option args)}) addOptl f = alterSettings (\s -> s { sOpt_l = f : sOpt_l s}) addOptc f = alterSettings (\s -> s { sOpt_c = f : sOpt_c s}) +addOptcxx f = alterSettings (\s -> s { sOpt_cxx = f : sOpt_cxx s}) addOptP f = alterSettings (\s -> s { sOpt_P = f : sOpt_P s , sOpt_P_fingerprint = fingerprintStrings (f : sOpt_P s) }) @@ -3038,6 +3043,8 @@ dynamic_flags_deps = [ (hasArg (\f -> alterSettings (\s -> s { sOpt_F = f : sOpt_F s}))) , make_ord_flag defFlag "optc" (hasArg addOptc) + , make_ord_flag defFlag "optcxx" + (hasArg addOptcxx) , make_ord_flag defFlag "opta" (hasArg (\f -> alterSettings (\s -> s { sOpt_a = f : sOpt_a s}))) , make_ord_flag defFlag "optl" ===================================== compiler/main/HscTypes.hs ===================================== @@ -30,6 +30,7 @@ module HscTypes ( ModGuts(..), CgGuts(..), ForeignStubs(..), appendStubC, ImportedMods, ImportedBy(..), importedByUser, ImportedModsVal(..), SptEntry(..), ForeignSrcLang(..), + phaseForeignLanguage, ModSummary(..), ms_imps, ms_installed_mod, ms_mod_name, showModMsg, isBootSummary, msHsFilePath, msHiFilePath, msObjFilePath, @@ -182,6 +183,7 @@ import CmdLineParser import DynFlags import DriverPhases ( Phase, HscSource(..), hscSourceString , isHsBootOrSig, isHsigFile ) +import qualified DriverPhases as Phase import BasicTypes import IfaceSyn import Maybes @@ -3136,3 +3138,15 @@ Also see Note [Typechecking Complete Matches] in TcBinds for a more detailed explanation for how GHC ensures that all the conlikes in a COMPLETE set are consistent. -} + +-- | Foreign language of the phase if the phase deals with a foreign code +phaseForeignLanguage :: Phase -> Maybe ForeignSrcLang +phaseForeignLanguage phase = case phase of + Phase.Cc -> Just LangC + Phase.Ccxx -> Just LangCxx + Phase.Cobjc -> Just LangObjc + Phase.Cobjcxx -> Just LangObjcxx + Phase.HCc -> Just LangC + Phase.As _ -> Just LangAsm + Phase.MergeForeign -> Just RawObject + _ -> Nothing ===================================== compiler/main/SysTools.hs ===================================== @@ -301,6 +301,7 @@ initSysTools top_dir sOpt_P_fingerprint = fingerprint0, sOpt_F = [], sOpt_c = [], + sOpt_cxx = [], sOpt_a = [], sOpt_l = [], sOpt_windres = [], ===================================== compiler/main/SysTools/ExtraObj.hs ===================================== @@ -40,7 +40,7 @@ mkExtraObj dflags extn xs oFile <- newTempName dflags TFL_GhcSession "o" writeFile cFile xs ccInfo <- liftIO $ getCompilerInfo dflags - runCc dflags + runCc Nothing dflags ([Option "-c", FileOption "" cFile, Option "-o", ===================================== compiler/main/SysTools/Tasks.hs ===================================== @@ -10,6 +10,7 @@ module SysTools.Tasks where import Exception import ErrUtils +import HscTypes import DynFlags import Outputable import Platform @@ -58,11 +59,12 @@ runPp dflags args = do opts = map Option (getOpts dflags opt_F) runSomething dflags "Haskell pre-processor" prog (args ++ opts) -runCc :: DynFlags -> [Option] -> IO () -runCc dflags args = do +-- | Run compiler of C-like languages and raw objects (such as gcc or clang). +runCc :: Maybe ForeignSrcLang -> DynFlags -> [Option] -> IO () +runCc mLanguage dflags args = do let (p,args0) = pgm_c dflags - args1 = map Option (getOpts dflags opt_c) - args2 = args0 ++ args ++ args1 + args1 = map Option userOpts + args2 = args0 ++ languageOptions ++ args ++ args1 -- We take care to pass -optc flags in args1 last to ensure that the -- user can override flags passed by GHC. See #14452. mb_env <- getGccEnv args2 @@ -118,6 +120,21 @@ runCc dflags args = do | "warning: call-clobbered register used" `isContainedIn` w = False | otherwise = True + -- force the C compiler to interpret this file as C when + -- compiling .hc files, by adding the -x c option. + -- Also useful for plain .c files, just in case GHC saw a + -- -x c option. + (languageOptions, userOpts) = case mLanguage of + Nothing -> ([], userOpts_c) + Just language -> ([Option "-x", Option languageName], opts) where + (languageName, opts) = case language of + LangCxx -> ("c++", userOpts_cxx) + LangObjc -> ("objective-c", userOpts_c) + LangObjcxx -> ("objective-c++", userOpts_cxx) + _ -> ("c", userOpts_c) + userOpts_c = getOpts dflags opt_c + userOpts_cxx = getOpts dflags opt_cxx + isContainedIn :: String -> String -> Bool xs `isContainedIn` ys = any (xs `isPrefixOf`) (tails ys) ===================================== docs/users_guide/phases.rst ===================================== @@ -154,6 +154,13 @@ the following flags: Pass ⟨option⟩ to the C compiler. +.. ghc-flag:: -optcxx ⟨option⟩ + :shortdesc: pass ⟨option⟩ to the C++ compiler + :type: dynamic + :category: phase-options + + Pass ⟨option⟩ to the C++ compiler. + .. ghc-flag:: -optlo ⟨option⟩ :shortdesc: pass ⟨option⟩ to the LLVM optimiser :type: dynamic ===================================== testsuite/tests/ffi/should_compile/all.T ===================================== @@ -32,3 +32,12 @@ test('cc016', normal, compile, ['']) test('T10460', normal, compile, ['']) test('T11983', [omit_ways(['ghci'])], compile, ['T11983.c']) test('T14125', normal, compile, ['']) +test( + 'cc017', + normal, + compile, + [ + '-optc=-DC -optcxx=-DCXX -optcxx=-std=c++11' + + (' -optcxx=-stdlib=libc++' if opsys('darwin') else '') + ], +) ===================================== testsuite/tests/ffi/should_compile/cc017.hs ===================================== @@ -0,0 +1,19 @@ +{-# LANGUAGE TemplateHaskell #-} + +import Language.Haskell.TH.Syntax + +-- Check -optc and -optcxx options passing. +-- This file must be compiled with -optc=-DC -optcxx=-DCXX + +do addForeignSource LangC + "int CXX = 0; // -DCXX must not be passed to C \n\ + \_Static_assert(C, \"name C must come from -DC\"); " + + addForeignSource LangCxx + "int C = 0; // -DC must not be passed to C++ \n\ + \static_assert(CXX, \"name CXX must come from -DCXX\"); " + + pure [] + +main :: IO () +main = pure () ===================================== testsuite/tests/ffi/should_run/all.T ===================================== @@ -28,7 +28,7 @@ test('ffi004', skip, compile_and_run, ['']) # On x86, the test suffers from floating-point differences due to the # use of 80-bit internal precision when using the native code generator. # -test('ffi005', [ omit_ways(prof_ways), +test('ffi005', [ omit_ways(prof_ways), when(arch('i386'), skip), when(platform('i386-apple-darwin'), expect_broken(4105)), exit_code(3) ], ===================================== testsuite/tests/th/T13366.hs ===================================== @@ -1,6 +1,6 @@ {-# LANGUAGE ForeignFunctionInterface #-} {-# LANGUAGE TemplateHaskell #-} -{-# OPTIONS_GHC -optc-DA_MACRO=1 #-} +{-# OPTIONS_GHC -optc-DA_MACRO=1 -optcxx-DA_MACRO=1 #-} import Language.Haskell.TH.Syntax import System.IO (hFlush, stdout) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/97502be8bda9199ac058b9677b4b6ba028022936 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/97502be8bda9199ac058b9677b4b6ba028022936 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 18:47:58 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 14:47:58 -0400 Subject: [Git][ghc/ghc][wip/T16389] 13 commits: gitlab-ci: Build hyperlinked sources for releases Message-ID: <5cab975eb0884_62b33fa24ed857902315250@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/T16389 at Glasgow Haskell Compiler / GHC Commits: 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - 30 changed files: - .gitlab-ci.yml - .gitlab/issue_templates/bug.md - .gitlab/issue_templates/feature_request.md - HACKING.md - README.md - boot - compiler/codeGen/StgCmmPrim.hs - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - compiler/typecheck/TcTyClsDecls.hs - compiler/types/Type.hs - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - hadrian/appveyor.yml - hadrian/doc/windows.md - libraries/base/base.cabal - + libraries/base/tests/T16111.hs - + libraries/base/tests/T16111.stderr - libraries/base/tests/all.T - libraries/ghc-boot-th/ghc-boot-th.cabal.in The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/57b5a3f860215fa02df5bf0eee15c79a3faf890c...97d3d546e6f03977a9cbe7d5499bb66510411468 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/57b5a3f860215fa02df5bf0eee15c79a3faf890c...97d3d546e6f03977a9cbe7d5499bb66510411468 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 18:53:59 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 14:53:59 -0400 Subject: [Git][ghc/ghc][master] testsuite: Unmark T16190 as broken Message-ID: <5cab98c799d1_62b33fa2d139656c231746f@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - 1 changed file: - testsuite/tests/perf/compiler/all.T Changes: ===================================== testsuite/tests/perf/compiler/all.T ===================================== @@ -394,8 +394,6 @@ test ('T15164', ['-v0 -O']) test('T16190', - [ collect_stats(), - when(opsys('mingw32'), expect_broken(16389)) - ], + collect_stats(), multimod_compile, ['T16190.hs', '-v0']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/97d3d546e6f03977a9cbe7d5499bb66510411468 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/97d3d546e6f03977a9cbe7d5499bb66510411468 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:00:08 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:00:08 -0400 Subject: [Git][ghc/ghc][master] Fix whitespace style Message-ID: <5cab9a3816e6a_62b33fa2ecea1a6823210c3@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 3 changed files: - .gitlab-ci.yml - libraries/ghc-prim/cbits/bitrev.c - testsuite/tests/primops/should_run/T16164.hs Changes: ===================================== .gitlab-ci.yml ===================================== @@ -48,7 +48,7 @@ ghc-linters: stage: lint image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" script: - - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME + - git fetch "$CI_MERGE_REQUEST_PROJECT_URL" $CI_MERGE_REQUEST_TARGET_BRANCH_NAME - base="$(git merge-base FETCH_HEAD $CI_COMMIT_SHA)" - "echo Merge base $base" # - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) @@ -764,4 +764,3 @@ pages: artifacts: paths: - public - ===================================== libraries/ghc-prim/cbits/bitrev.c ===================================== @@ -78,4 +78,4 @@ hs_bitrev64(StgWord64 x) // swap 32-bit long pairs x = ( x >> 32 ) | ( x << 32 ); return x; -} \ No newline at end of file +} ===================================== testsuite/tests/primops/should_run/T16164.hs ===================================== @@ -49,4 +49,4 @@ main = do printer bitReverse8 word8s printer bitReverse16 word16s printer bitReverse32 word32s - printer bitReverse64 word64s \ No newline at end of file + printer bitReverse64 word64s View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a42d206a76e03b4cd831aa7bc72c6d4a0f124bd1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/a42d206a76e03b4cd831aa7bc72c6d4a0f124bd1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:00:11 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:00:11 -0400 Subject: [Git][ghc/ghc][wip/hadrian-ghcid] 13 commits: Replace git.haskell.org with gitlab.haskell.org (#16196) Message-ID: <5cab9a3b7ba5d_62b33fa2e6b550f423212c5@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/hadrian-ghcid at Glasgow Haskell Compiler / GHC Commits: 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - HACKING.md - README.md - boot - compiler/codeGen/StgCmmPrim.hs - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - compiler/typecheck/TcTyClsDecls.hs - compiler/types/Type.hs - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - hadrian/appveyor.yml - hadrian/doc/windows.md - libraries/base/base.cabal - + libraries/base/tests/T16111.hs - + libraries/base/tests/T16111.stderr - libraries/base/tests/all.T - libraries/ghc-boot-th/ghc-boot-th.cabal.in - libraries/ghc-boot/ghc-boot.cabal.in The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f3a9c418998b4ad147ce49adb70c0b669b86c228...4dda2270d26017eadddd99ed567aacf41c2913b9 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f3a9c418998b4ad147ce49adb70c0b669b86c228...4dda2270d26017eadddd99ed567aacf41c2913b9 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:03:52 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Mon, 08 Apr 2019 15:03:52 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/bump-windows-cabal Message-ID: <5cab9b189059_62b33fa2ecea1a682321991@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/bump-windows-cabal at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/bump-windows-cabal You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:06:13 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:06:13 -0400 Subject: [Git][ghc/ghc][master] Use ./hadrian/ghci.sh in .ghcid Message-ID: <5cab9ba5c782d_62b33fa2e6b550f423247a4@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - 1 changed file: - .ghcid Changes: ===================================== .ghcid ===================================== @@ -1,7 +1,5 @@ ---command utils/ghc-in-ghci/run.sh +--command ./hadrian/ghci.sh --reload compiler --reload ghc --reload includes ---restart utils/ghc-in-ghci/run.sh ---restart utils/ghc-in-ghci/load-main.ghci ---restart utils/ghc-in-ghci/settings.ghci +--restart hadrian/ View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4dda2270d26017eadddd99ed567aacf41c2913b9 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/4dda2270d26017eadddd99ed567aacf41c2913b9 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:06:17 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:06:17 -0400 Subject: [Git][ghc/ghc][wip/T15753] 16 commits: gitlab-ci: Build hyperlinked sources for releases Message-ID: <5cab9ba932bbb_62b33fa2e346e75c232497f@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/T15753 at Glasgow Haskell Compiler / GHC Commits: 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/issue_templates/bug.md - .gitlab/issue_templates/feature_request.md - HACKING.md - README.md - boot - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - compiler/typecheck/TcTyClsDecls.hs - compiler/types/Type.hs - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - hadrian/appveyor.yml - hadrian/doc/windows.md - libraries/base/base.cabal - + libraries/base/tests/T16111.hs - + libraries/base/tests/T16111.stderr The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9108ead1d23cecd6c404eabc3d61a47079a099cb...d236d9d0f4f3be0641933b959dde14a065acd37f -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9108ead1d23cecd6c404eabc3d61a47079a099cb...d236d9d0f4f3be0641933b959dde14a065acd37f You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:12:20 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:12:20 -0400 Subject: [Git][ghc/ghc][master] Make `singleConstructor` cope with pattern synonyms Message-ID: <5cab9d14b26b_62b39e0453023289c8@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 5 changed files: - compiler/deSugar/Check.hs - + testsuite/tests/pmcheck/should_compile/T15753a.hs - + testsuite/tests/pmcheck/should_compile/T15753b.hs - + testsuite/tests/pmcheck/should_compile/T15884.hs - testsuite/tests/pmcheck/should_compile/all.T Changes: ===================================== compiler/deSugar/Check.hs ===================================== @@ -4,9 +4,13 @@ Author: George Karachalias Pattern Matching Coverage Checking. -} -{-# LANGUAGE CPP, GADTs, DataKinds, KindSignatures #-} -{-# LANGUAGE TupleSections #-} -{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE CPP #-} +{-# LANGUAGE GADTs #-} +{-# LANGUAGE DataKinds #-} +{-# LANGUAGE KindSignatures #-} +{-# LANGUAGE TupleSections #-} +{-# LANGUAGE ViewPatterns #-} +{-# LANGUAGE MultiWayIf #-} module Check ( -- Checking and printing @@ -55,7 +59,7 @@ import qualified GHC.LanguageExtensions as LangExt import Data.List (find) import Data.Maybe (catMaybes, isJust, fromMaybe) -import Control.Monad (forM, when, forM_, zipWithM) +import Control.Monad (forM, when, forM_, zipWithM, filterM) import Coercion import TcEvidence import TcSimplify (tcNormalise) @@ -289,6 +293,14 @@ data PmResult = , pmresultUncovered :: UncoveredCandidates , pmresultInaccessible :: [Located [LPat GhcTc]] } +instance Outputable PmResult where + ppr pmr = hang (text "PmResult") 2 $ vcat + [ text "pmresultProvenance" <+> ppr (pmresultProvenance pmr) + , text "pmresultRedundant" <+> ppr (pmresultRedundant pmr) + , text "pmresultUncovered" <+> ppr (pmresultUncovered pmr) + , text "pmresultInaccessible" <+> ppr (pmresultInaccessible pmr) + ] + -- | Either a list of patterns that are not covered, or their type, in case we -- have no patterns at hand. Not having patterns at hand can arise when -- handling EmptyCase expressions, in two cases: @@ -303,6 +315,10 @@ data PmResult = data UncoveredCandidates = UncoveredPatterns Uncovered | TypeOfUncovered Type +instance Outputable UncoveredCandidates where + ppr (UncoveredPatterns uc) = text "UnPat" <+> ppr uc + ppr (TypeOfUncovered ty) = text "UnTy" <+> ppr ty + -- | The empty pattern check result emptyPmResult :: PmResult emptyPmResult = PmResult FromBuiltin [] (UncoveredPatterns []) [] @@ -987,7 +1003,7 @@ translatePat fam_insts pat = case pat of | otherwise -> do ps <- translatePat fam_insts p (xp,xe) <- mkPmId2Forms ty - let g = mkGuard ps (mkHsWrap wrapper (unLoc xe)) + g <- mkGuard ps (mkHsWrap wrapper (unLoc xe)) return [xp,g] -- (n + k) ===> x (True <- x >= k) (n <- x-k) @@ -997,10 +1013,11 @@ translatePat fam_insts pat = case pat of ViewPat arg_ty lexpr lpat -> do ps <- translatePat fam_insts (unLoc lpat) -- See Note [Guards and Approximation] - case all cantFailPattern ps of + res <- allM cantFailPattern ps + case res of True -> do (xp,xe) <- mkPmId2Forms arg_ty - let g = mkGuard ps (HsApp noExt lexpr xe) + g <- mkGuard ps (HsApp noExt lexpr xe) return [xp,g] False -> mkCanFailPmPat arg_ty @@ -1255,41 +1272,38 @@ translateMatch _ _ = panic "translateMatch" translateGuards :: FamInstEnvs -> [GuardStmt GhcTc] -> DsM PatVec translateGuards fam_insts guards = do all_guards <- concat <$> mapM (translateGuard fam_insts) guards - return (replace_unhandled all_guards) - -- It should have been (return all_guards) but it is too expressive. + let + shouldKeep :: Pattern -> DsM Bool + shouldKeep p + | PmVar {} <- p = pure True + | PmCon {} <- p = (&&) + <$> singleMatchConstructor (pm_con_con p) (pm_con_arg_tys p) + <*> allM shouldKeep (pm_con_args p) + shouldKeep (PmGrd pv e) + | isNotPmExprOther e = pure True -- expensive but we want it + | otherwise = allM shouldKeep pv + shouldKeep _other_pat = pure False -- let the rest.. + + all_handled <- allM shouldKeep all_guards + -- It should have been @pure all_guards@ but it is too expressive. -- Since the term oracle does not handle all constraints we generate, -- we (hackily) replace all constraints the oracle cannot handle with a - -- single one (we need to know if there is a possibility of falure). + -- single one (we need to know if there is a possibility of failure). -- See Note [Guards and Approximation] for all guard-related approximations -- we implement. - where - replace_unhandled :: PatVec -> PatVec - replace_unhandled gv - | any_unhandled gv = fake_pat : [ p | p <- gv, shouldKeep p ] - | otherwise = gv - - any_unhandled :: PatVec -> Bool - any_unhandled gv = any (not . shouldKeep) gv - - shouldKeep :: Pattern -> Bool - shouldKeep p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all shouldKeep (pm_con_args p) - shouldKeep (PmGrd pv e) - | all shouldKeep pv = True - | isNotPmExprOther e = True -- expensive but we want it - shouldKeep _other_pat = False -- let the rest.. + if all_handled + then pure all_guards + else do + kept <- filterM shouldKeep all_guards + pure (fake_pat : kept) -- | Check whether a pattern can fail to match -cantFailPattern :: Pattern -> Bool -cantFailPattern p - | PmVar {} <- p = True - | PmCon {} <- p = singleConstructor (pm_con_con p) - && all cantFailPattern (pm_con_args p) -cantFailPattern (PmGrd pv _e) - = all cantFailPattern pv -cantFailPattern _ = False +cantFailPattern :: Pattern -> DsM Bool +cantFailPattern PmVar {} = pure True +cantFailPattern PmCon { pm_con_con = c, pm_con_arg_tys = tys, pm_con_args = ps} + = (&&) <$> singleMatchConstructor c tys <*> allM cantFailPattern ps +cantFailPattern (PmGrd pv _e) = allM cantFailPattern pv +cantFailPattern _ = pure False -- | Translate a guard statement to Pattern translateGuard :: FamInstEnvs -> GuardStmt GhcTc -> DsM PatVec @@ -1312,7 +1326,8 @@ translateLet _binds = return [] translateBind :: FamInstEnvs -> LPat GhcTc -> LHsExpr GhcTc -> DsM PatVec translateBind fam_insts (dL->L _ p) e = do ps <- translatePat fam_insts p - return [mkGuard ps (unLoc e)] + g <- mkGuard ps (unLoc e) + return [g] -- | Translate a boolean guard translateBoolGuard :: LHsExpr GhcTc -> DsM PatVec @@ -1321,7 +1336,7 @@ translateBoolGuard e -- The formal thing to do would be to generate (True <- True) -- but it is trivial to solve so instead we give back an empty -- PatVec for efficiency - | otherwise = return [mkGuard [truePattern] (unLoc e)] + | otherwise = (:[]) <$> mkGuard [truePattern] (unLoc e) {- Note [Guards and Approximation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1658,13 +1673,14 @@ mkOneConFull x con = do -- * More smart constructors and fresh variable generation -- | Create a guard pattern -mkGuard :: PatVec -> HsExpr GhcTc -> Pattern -mkGuard pv e - | all cantFailPattern pv = PmGrd pv expr - | PmExprOther {} <- expr = fake_pat - | otherwise = PmGrd pv expr - where - expr = hsExprToPmExpr e +mkGuard :: PatVec -> HsExpr GhcTc -> DsM Pattern +mkGuard pv e = do + res <- allM cantFailPattern pv + let expr = hsExprToPmExpr e + tracePmD "mkGuard" (vcat [ppr pv, ppr e, ppr res, ppr expr]) + if | res -> pure (PmGrd pv expr) + | PmExprOther {} <- expr -> pure fake_pat + | otherwise -> pure (PmGrd pv expr) -- | Create a term equality of the form: `(False ~ (x ~ lit))` mkNegEq :: Id -> PmLit -> ComplexEq @@ -1738,14 +1754,37 @@ coercePmPat (PmCon { pm_con_con = con, pm_con_arg_tys = arg_tys , pm_con_args = coercePatVec args }] coercePmPat (PmGrd {}) = [] -- drop the guards --- | Check whether a data constructor is the only way to construct --- a data type. -singleConstructor :: ConLike -> Bool -singleConstructor (RealDataCon dc) = - case tyConDataCons (dataConTyCon dc) of - [_] -> True - _ -> False -singleConstructor _ = False +-- | Check whether a 'ConLike' has the /single match/ property, i.e. whether +-- it is the only possible match in the given context. See also +-- 'allCompleteMatches' and Note [Single match constructors]. +singleMatchConstructor :: ConLike -> [Type] -> DsM Bool +singleMatchConstructor cl tys = + any (isSingleton . snd) <$> allCompleteMatches cl tys + +{- +Note [Single match constructors] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When translating pattern guards for consumption by the checker, we desugar +every pattern guard that might fail ('cantFailPattern') to 'fake_pat' +(True <- _). Which patterns can't fail? Exactly those that only match on +'singleMatchConstructor's. + +Here are a few examples: + * @f a | (a, b) <- foo a = 42@: Product constructors are generally + single match. This extends to single constructors of GADTs like 'Refl'. + * If @f | Id <- id () = 42@, where @pattern Id = ()@ and 'Id' is part of a + singleton `COMPLETE` set, then 'Id' has the single match property. + +In effect, we can just enumerate 'allCompleteMatches' and check if the conlike +occurs as a singleton set. +There's the chance that 'Id' is part of multiple `COMPLETE` sets. That's +irrelevant; If the user specified a singleton set, it is single-match. + +Note that this doesn't really take into account incoming type constraints; +It might be obvious from type context that a particular GADT constructor has +the single-match property. We currently don't (can't) check this in the +translation step. See #15753 for why this yields surprising results. +-} -- | For a given conlike, finds all the sets of patterns which could -- be relevant to that conlike by consulting the result type. ===================================== testsuite/tests/pmcheck/should_compile/T15753a.hs ===================================== @@ -0,0 +1,28 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE TypeOperators #-} +{-# OPTIONS_GHC -Wincomplete-patterns #-} +module Bug where + +import Data.Type.Equality + +data G a where + GInt :: G Int + GBool :: G Bool + +ex1, ex2, ex3 + :: a :~: Int + -> G a + -> () + +ex1 Refl g + | GInt <- id g + = () + +ex2 Refl g + | GInt <- g + = () + +ex3 Refl g + = case id g of + GInt -> () + ===================================== testsuite/tests/pmcheck/should_compile/T15753b.hs ===================================== @@ -0,0 +1,10 @@ +{-# LANGUAGE PatternSynonyms #-} +module Bug where + +{-# COMPLETE Id #-} +pattern Id :: () +pattern Id = () + +bug :: () +bug | Id <- id () = () + ===================================== testsuite/tests/pmcheck/should_compile/T15884.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE ViewPatterns #-} + +module Bug where + +f :: Maybe a -> Bool +f (id->Nothing) = False +f (id->(Just _)) = True ===================================== testsuite/tests/pmcheck/should_compile/all.T ===================================== @@ -68,6 +68,12 @@ test('T15584', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T15713', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753a', expect_broken(15753), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15753b', normal, compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) +test('T15884', expect_broken(15884), compile, + ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) test('T16289', normal, compile, ['-fwarn-incomplete-patterns -fwarn-overlapping-patterns']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d236d9d0f4f3be0641933b959dde14a065acd37f -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d236d9d0f4f3be0641933b959dde14a065acd37f You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:18:31 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:18:31 -0400 Subject: [Git][ghc/ghc][master] Skip test ArithInt16 and ArithWord16 in GHCi way Message-ID: <5cab9e87280b_62b33fa2cd668654233251d@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 1 changed file: - testsuite/tests/primops/should_run/all.T Changes: ===================================== testsuite/tests/primops/should_run/all.T ===================================== @@ -20,7 +20,9 @@ test('CmpInt8', normal, compile_and_run, ['']) test('CmpWord8', normal, compile_and_run, ['']) test('ShowPrim', normal, compile_and_run, ['']) -test('ArithInt16', normal, compile_and_run, ['']) -test('ArithWord16', normal, compile_and_run, ['']) +# These two tests use unboxed tuples, which GHCi doesn't support +test('ArithInt16', omit_ways(['ghci']), compile_and_run, ['']) +test('ArithWord16', omit_ways(['ghci']), compile_and_run, ['']) + test('CmpInt16', normal, compile_and_run, ['']) -test('CmpWord16', normal, compile_and_run, ['']) \ No newline at end of file +test('CmpWord16', normal, compile_and_run, ['']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1085090e83bbf2a7dbb1b9a2e9023f3500538930 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1085090e83bbf2a7dbb1b9a2e9023f3500538930 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:24:41 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:24:41 -0400 Subject: [Git][ghc/ghc][master] testsuite: Show exit code of GHCi tests on failure Message-ID: <5cab9ff91df40_62b33fa2c8561af823364e8@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - 1 changed file: - testsuite/driver/testlib.py Changes: ===================================== testsuite/driver/testlib.py ===================================== @@ -1489,7 +1489,7 @@ def interpreter_run(name, way, extra_hc_opts, top_mod): print('Wrong exit code for ' + name + '(' + way + ') (expected', getTestOpts().exit_code, ', actual', exit_code, ')') dump_stdout(name) dump_stderr(name) - return failBecause('bad exit code') + return failBecause('bad exit code (%d)' % exit_code) # ToDo: if the sub-shell was killed by ^C, then exit View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/7287bb9ea51251441bc8caca2894549ab34fbcb3 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/7287bb9ea51251441bc8caca2894549ab34fbcb3 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:30:49 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:30:49 -0400 Subject: [Git][ghc/ghc][master] settings.in: Reformat Message-ID: <5caba1692ffa5_62b33fa2e358375023421e2@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - 1 changed file: - settings.in Changes: ===================================== settings.in ===================================== @@ -1,35 +1,34 @@ -[("GCC extra via C opts", "@GccExtraViaCOpts@"), - ("C compiler command", "@SettingsCCompilerCommand@"), - ("C compiler flags", "@SettingsCCompilerFlags@"), - ("C compiler link flags", "@SettingsCCompilerLinkFlags@"), - ("C compiler supports -no-pie", "@SettingsCCompilerSupportsNoPie@"), - ("Haskell CPP command","@SettingsHaskellCPPCommand@"), - ("Haskell CPP flags","@SettingsHaskellCPPFlags@"), - ("ld command", "@SettingsLdCommand@"), - ("ld flags", "@SettingsLdFlags@"), - ("ld supports compact unwind", "@LdHasNoCompactUnwind@"), - ("ld supports build-id", "@LdHasBuildId@"), - ("ld supports filelist", "@LdHasFilelist@"), - ("ld is GNU ld", "@LdIsGNULd@"), - ("ar command", "@SettingsArCommand@"), - ("ar flags", "@ArArgs@"), - ("ar supports at file", "@ArSupportsAtFile@"), - ("ranlib command", "@SettingsRanlibCommand@"), - ("touch command", "@SettingsTouchCommand@"), - ("dllwrap command", "@SettingsDllWrapCommand@"), - ("windres command", "@SettingsWindresCommand@"), - ("libtool command", "@SettingsLibtoolCommand@"), - ("cross compiling", "@CrossCompiling@"), - ("target os", "@HaskellTargetOs@"), - ("target arch", "@HaskellTargetArch@"), - ("target word size", "@TargetWordSize@"), - ("target has GNU nonexec stack", "@HaskellHaveGnuNonexecStack@"), - ("target has .ident directive", "@HaskellHaveIdentDirective@"), - ("target has subsections via symbols", "@HaskellHaveSubsectionsViaSymbols@"), - ("target has RTS linker", "@HaskellHaveRTSLinker@"), - ("Unregisterised", "@Unregisterised@"), - ("LLVM llc command", "@SettingsLlcCommand@"), - ("LLVM opt command", "@SettingsOptCommand@"), - ("LLVM clang command", "@SettingsClangCommand@") - ] - +[("GCC extra via C opts", "@GccExtraViaCOpts@") +,("C compiler command", "@SettingsCCompilerCommand@") +,("C compiler flags", "@SettingsCCompilerFlags@") +,("C compiler link flags", "@SettingsCCompilerLinkFlags@") +,("C compiler supports -no-pie", "@SettingsCCompilerSupportsNoPie@") +,("Haskell CPP command", "@SettingsHaskellCPPCommand@") +,("Haskell CPP flags", "@SettingsHaskellCPPFlags@") +,("ld command", "@SettingsLdCommand@") +,("ld flags", "@SettingsLdFlags@") +,("ld supports compact unwind", "@LdHasNoCompactUnwind@") +,("ld supports build-id", "@LdHasBuildId@") +,("ld supports filelist", "@LdHasFilelist@") +,("ld is GNU ld", "@LdIsGNULd@") +,("ar command", "@SettingsArCommand@") +,("ar flags", "@ArArgs@") +,("ar supports at file", "@ArSupportsAtFile@") +,("ranlib command", "@SettingsRanlibCommand@") +,("touch command", "@SettingsTouchCommand@") +,("dllwrap command", "@SettingsDllWrapCommand@") +,("windres command", "@SettingsWindresCommand@") +,("libtool command", "@SettingsLibtoolCommand@") +,("cross compiling", "@CrossCompiling@") +,("target os", "@HaskellTargetOs@") +,("target arch", "@HaskellTargetArch@") +,("target word size", "@TargetWordSize@") +,("target has GNU nonexec stack", "@HaskellHaveGnuNonexecStack@") +,("target has .ident directive", "@HaskellHaveIdentDirective@") +,("target has subsections via symbols", "@HaskellHaveSubsectionsViaSymbols@") +,("target has RTS linker", "@HaskellHaveRTSLinker@") +,("Unregisterised", "@Unregisterised@") +,("LLVM llc command", "@SettingsLlcCommand@") +,("LLVM opt command", "@SettingsOptCommand@") +,("LLVM clang command", "@SettingsClangCommand@") +] View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/f5604d3709d015f4085843a92298af3e53c08d36 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/f5604d3709d015f4085843a92298af3e53c08d36 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 8 19:36:58 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 08 Apr 2019 15:36:58 -0400 Subject: [Git][ghc/ghc][master] Bump hpc submodule Message-ID: <5caba2da97155_62b33fa2c8561af8234444@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 1 changed file: - libraries/hpc Changes: ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 7673420558e2a54affe530911d555cc78577ad87 +Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/cf9e1837adc647c90cfa176669d14e0d413c043d -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/cf9e1837adc647c90cfa176669d14e0d413c043d You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 03:22:34 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Mon, 08 Apr 2019 23:22:34 -0400 Subject: [Git][ghc/ghc][wip/bump-windows-cabal] 12 commits: Generate straightline code for inline array allocation Message-ID: <5cac0ffaeb910_62b33fa287e6282823863a8@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/bump-windows-cabal at Glasgow Haskell Compiler / GHC Commits: 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 96eae4e7 by Ben Gamari at 2019-04-09T03:22:33Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - libraries/ghc-prim/cbits/bitrev.c - libraries/hpc - settings.in - testsuite/driver/testlib.py - + testsuite/tests/codeGen/should_run/NewSmallArray.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.stdout - testsuite/tests/codeGen/should_run/all.T - + testsuite/tests/driver/T16500/A.hs - + testsuite/tests/driver/T16500/B.hs - + testsuite/tests/driver/T16500/Makefile - + testsuite/tests/driver/T16500/T16500.stdout - + testsuite/tests/driver/T16500/all.T - testsuite/tests/ffi/should_compile/all.T - + testsuite/tests/ffi/should_compile/cc017.hs - testsuite/tests/ffi/should_run/all.T The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d7984d4a18fb43d4c364fbac61bcef2dc44c8dc2...96eae4e70dc870abd0c56a49d0ff4e63aee186c1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/d7984d4a18fb43d4c364fbac61bcef2dc44c8dc2...96eae4e70dc870abd0c56a49d0ff4e63aee186c1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 04:25:37 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 00:25:37 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/ghc-8.6-Cabal-bump Message-ID: <5cac1ec1221fa_62b33fa2cd668654239799f@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/ghc-8.6-Cabal-bump at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/ghc-8.6-Cabal-bump You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 11:53:57 2019 From: gitlab at gitlab.haskell.org (Krzysztof Gogolewski) Date: Tue, 09 Apr 2019 07:53:57 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/T16551a Message-ID: <5cac87d53229c_62b3e1a5cf8244767f@gitlab.haskell.org.mail> Krzysztof Gogolewski pushed new branch wip/T16551a at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/T16551a You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 13:09:59 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 09:09:59 -0400 Subject: [Git][ghc/ghc][wip/lint-check-version-number] 13 commits: Fix #16282. Message-ID: <5cac99a7b6b30_62b33fa2c8378ac02464448@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/lint-check-version-number at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - e089db10 by Ben Gamari at 2019-04-09T13:09:47Z gitlab-ci: Ensure that version number has three components - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - configure.ac - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - libraries/ghc-prim/cbits/bitrev.c - libraries/hpc - settings.in - testsuite/driver/testlib.py - + testsuite/tests/codeGen/should_run/NewSmallArray.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.stdout - testsuite/tests/codeGen/should_run/all.T - + testsuite/tests/driver/T16500/A.hs - + testsuite/tests/driver/T16500/B.hs - + testsuite/tests/driver/T16500/Makefile - + testsuite/tests/driver/T16500/T16500.stdout - + testsuite/tests/driver/T16500/all.T The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1dd72c10df80c6ea7b68f69d4c4d2ff030331fdb...e089db10d2aefb3c5be448370a8c87289ad89edc -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1dd72c10df80c6ea7b68f69d4c4d2ff030331fdb...e089db10d2aefb3c5be448370a8c87289ad89edc You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 13:19:21 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 09:19:21 -0400 Subject: [Git][ghc/ghc][wip/T16546] base: Better document implementation implications of Data.Timeout Message-ID: <5cac9bd9d4a08_62b3defc3d0247264@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/T16546 at Glasgow Haskell Compiler / GHC Commits: 9b94af4c by Ben Gamari at 2019-04-09T13:19:02Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. - - - - - 1 changed file: - libraries/base/System/Timeout.hs Changes: ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9b94af4caf0d5c4db0232f4fb31ece76522e2d33 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9b94af4caf0d5c4db0232f4fb31ece76522e2d33 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 13:48:45 2019 From: gitlab at gitlab.haskell.org (Krzysztof Gogolewski) Date: Tue, 09 Apr 2019 09:48:45 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/corelint-safeguard Message-ID: <5caca2bdb4b1f_62b33fa2be002670248716e@gitlab.haskell.org.mail> Krzysztof Gogolewski pushed new branch wip/corelint-safeguard at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/corelint-safeguard You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 14:13:46 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 10:13:46 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 15 commits: Generate straightline code for inline array allocation Message-ID: <5caca89a8e9ed_62b3e1a5cf8249592a@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 12e3d8bc by Ben Gamari at 2019-04-09T14:13:37Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - 7f33979d by Artem Pyanykh at 2019-04-09T14:13:40Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - 6ba24ba0 by Artem Pyanykh at 2019-04-09T14:13:40Z codegen: use newtype for Alignment in BasicTypes - - - - - 06b04e44 by Artem Pyanykh at 2019-04-09T14:13:40Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - compiler/basicTypes/BasicTypes.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Ppr.hs - compiler/utils/Util.hs - docs/users_guide/8.10.1-notes.rst - docs/users_guide/packages.rst - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - libraries/ghc-prim/cbits/bitrev.c - libraries/hpc - settings.in - testsuite/driver/testlib.py - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.stdout - testsuite/tests/codeGen/should_run/all.T The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/27d0db377d53b46a4861dc977010049f6eb6088e...06b04e44e9ac588fbd54d1c6f904275de6d248c1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/27d0db377d53b46a4861dc977010049f6eb6088e...06b04e44e9ac588fbd54d1c6f904275de6d248c1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 14:23:54 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 10:23:54 -0400 Subject: [Git][ghc/ghc][wip/document-package-env] 49 commits: Add support for bitreverse primop Message-ID: <5cacaafa85f75_62b3e2f6364250618f@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/document-package-env at Glasgow Haskell Compiler / GHC Commits: 33173a51 by Alexandre at 2019-04-01T07:32:28Z Add support for bitreverse primop This commit includes the necessary changes in code and documentation to support a primop that reverses a word's bits. It also includes a test. - - - - - a3971b4e by Alexandre at 2019-04-01T07:32:28Z Bump ghc-prim's version where needed - - - - - 061276ea by Michael Sloan at 2019-04-01T07:32:30Z Remove unnecessary uses of UnboxedTuples pragma (see #13101 / #15454) Also removes a couple unnecessary MagicHash pragmas - - - - - e468c613 by David Eichmann at 2019-04-01T07:32:34Z Support Shake's --lint-fsatrace feature. Using this feature requires fsatrace (e.g. https://github.com/jacereda/fsatrace). Simply use the `--lint-fsatrace` option when running hadrian. Shake version >= 0.17.7 is required to support linting out of tree build dirs. - - - - - 1e9e4197 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add merge request template for backports for 8.8 - - - - - 55650d14 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add some simply issue templates - - - - - 27b99ed8 by Takenobu Tani at 2019-04-01T07:32:36Z Clean up URLs to point to GitLab This moves URL references to old Trac to their corresponding GitLab counterparts. This patch does not update the submodule library, such as libraries/Cabal. See also !539, !606, !618 [ci skip] - - - - - 18d1555d by Adam Sandberg Eriksson at 2019-04-01T07:32:38Z configure: document the use of the LD variable - - - - - 10352efa by Ben Gamari at 2019-04-01T22:22:34Z gitlab: Add feature request MR template - - - - - 1e52054b by Ben Gamari at 2019-04-01T23:16:21Z gitlab: Move feature request template to issue_templates Whoops. - - - - - e5c21ca9 by Ben Gamari at 2019-04-01T23:16:25Z gitlab: Mention ~"user facing" label - - - - - 39282422 by Ryan Scott at 2019-04-02T00:01:38Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - 7cf5ba3d by Michal Terepeta at 2019-04-02T00:07:49Z Improve performance of newSmallArray# This: - Hoists part of the condition outside of the initialization loop in `stg_newSmallArrayzh`. - Annotates one of the unlikely branches as unlikely, also in `stg_newSmallArrayzh`. - Adds a couple of annotations to `allocateMightFail` indicating which branches are likely to be taken. Together this gives about 5% improvement. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - dd9c82ef by David Eichmann at 2019-04-02T00:13:55Z Hadrian: correct deps for ghc builder. Previously, when needing ghc as a builder, the ghcDeps (Files the GHC binary depends on) for the current stage were needed. This is incorrect as the previous stage's ghc is used for building. This commit fixes the issue, needing the previous stage's ghcDeps. - - - - - 345306d3 by Alexandre Baldé at 2019-04-02T16:34:30Z Fix formatting issue in ghc-prim's changelog [skip ci] - - - - - f54b5124 by David Eichmann at 2019-04-02T16:40:39Z Hadrian: traceAllow deep dependencies when compilling haskell object files. - - - - - d132b30a by David Eichmann at 2019-04-02T16:40:39Z Hadrian: lint ignore autom4te and ghc-pkg cache files. - - - - - bf734195 by Simon Marlow at 2019-04-02T16:46:46Z Add myself to libraries/ghci - - - - - 5a75ccd0 by klebinger.andreas at gmx.at at 2019-04-03T04:34:57Z Fix faulty substitutions in StgCse (#11532). `substBndr` should rename bindings which shadow existing ids. However while it was renaming the bindings it was not adding proper substitutions for renamed bindings. Instead of adding a substitution of the form `old -> new` for renamed bindings it mistakenly added `old -> old` if no replacement had taken place while adding none if `old` had been renamed. As a byproduct this should improve performance, as we no longer add useless substitutions for unshadowed bindings. - - - - - 2ec749b5 by Nathan Collins at 2019-04-03T04:41:05Z users-guide: Fix typo - - - - - ea192a09 by Andrew Martin at 2019-04-03T04:41:05Z base: Add documentation that liftA2 used to not be a typeclass method - - - - - 733f1b52 by Frank Steffahn at 2019-04-03T04:41:05Z users-guide: Typo in Users Guide, Glasgow Exts - - - - - 3364def0 by Ben Gamari at 2019-04-03T04:41:05Z integer-gmp: Write friendlier documentation for Integer - - - - - dd3a3d08 by Ben Gamari at 2019-04-03T04:41:05Z integer-simple: Add documentation for Integer type - - - - - 722fdddf by Chris Martin at 2019-04-03T04:41:05Z Correct two misspellings of "separately" - - - - - bf6dbe3d by Chris Martin at 2019-04-03T04:41:05Z Inline the definition of 'ap' in the Monad laws The law as it is currently written is meaningless, because nowhere have we defined the implementation of 'ap'. The reader of the Control.Monad documentation is provided with only a type signature, > ap :: Monad m => m (a -> b) -> m a -> m b an informal description, > In many situations, the liftM operations can be replaced by uses of > ap, which promotes function application. and a relationship between 'ap' and the 'liftM' functions > return f `ap` x1 `ap` ... `ap` xn > is equivalent to > liftMn f x1 x2 ... xn Without knowing how 'ap' is defined, a law involving 'ap' cannot provide any guidance for how to write a lawful Monad instance, nor can we conclude anything from the law. I suspect that a reader equipped with the understanding that 'ap' was defined prior to the invention of the Applicative class could deduce that 'ap' must be defined in terms of (>>=), but nowhere as far as I can tell have we written this down explicitly for readers without the benefit of historical context. If the law is meant to express a relationship among (<*>), (>>=), and 'return', it seems that it is better off making this statement directly, sidestepping 'ap' altogether. - - - - - 7b090b53 by Ben Gamari at 2019-04-03T07:57:40Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. - - - - - 4626cf21 by Sebastian Graf at 2019-04-03T08:03:47Z Fix Uncovered set of literal patterns Issues #16289 and #15713 are proof that the pattern match checker did an unsound job of estimating the value set abstraction corresponding to the uncovered set. The reason is that the fix from #11303 introducing `NLit` was incomplete: The `LitCon` case desugared to `Var` rather than `LitVar`, which would have done the necessary case splitting analogous to the `ConVar` case. This patch rectifies that by introducing the fresh unification variable in `LitCon` in value abstraction position rather than pattern postition, recording a constraint equating it to the constructor expression rather than the literal. Fixes #16289 and #15713. - - - - - 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - + .gitlab/issue_templates/bug.md - + .gitlab/issue_templates/feature_request.md - + .gitlab/merge_request_templates/backport-for-8.8.md - .gitlab/merge_request_templates/merge-request.md - ANNOUNCE - CODEOWNERS - HACKING.md - README.md - aclocal.m4 - boot - compiler/cmm/CmmMachOp.hs - compiler/cmm/PprC.hs - compiler/codeGen/StgCmmMonad.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/deSugar/TmOracle.hs - compiler/ghci/ByteCodeLink.hs - compiler/ghci/RtClosureInspect.hs - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/llvmGen/LlvmCodeGen/CodeGen.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/InteractiveEval.hs - compiler/main/SysTools.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8862a9cba94dca5c366b08bfd56532b824a58b77...36d380475d9056fdf93305985be3def00aaf6cf7 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8862a9cba94dca5c366b08bfd56532b824a58b77...36d380475d9056fdf93305985be3def00aaf6cf7 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 14:30:08 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 10:30:08 -0400 Subject: [Git][ghc/ghc][master] users-guide: Document how to disable package environments Message-ID: <5cacac701fa82_62b3799ebc82509538@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - 1 changed file: - docs/users_guide/packages.rst Changes: ===================================== docs/users_guide/packages.rst ===================================== @@ -579,6 +579,12 @@ must be relative to the location of the package environment file. Use the package environment in ⟨file⟩, or in ``$HOME/.ghc/arch-os-version/environments/⟨name⟩`` + If set to ``-`` no package environment is read. + +.. envvar:: GHC_ENVIRONMENT + + Specifies the path to the package environment file to be used by GHC. + Overridden by the :ghc-flag:`-package-env ⟨file⟩|⟨name⟩` flag if set. In order, ``ghc`` will look for the package environment in the following locations: @@ -588,11 +594,11 @@ locations: - File ``$HOME/.ghc/arch-os-version/environments/name`` if you pass the option ``-package-env ⟨name⟩``. -- File ⟨file⟩ if the environment variable ``GHC_ENVIRONMENT`` is set to +- File ⟨file⟩ if the environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨file⟩. - File ``$HOME/.ghc/arch-os-version/environments/name`` if the - environment variable ``GHC_ENVIRONMENT`` is set to ⟨name⟩. + environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨name⟩. Additionally, unless ``-hide-all-packages`` is specified ``ghc`` will also look for the package environment in the following locations: View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/36d380475d9056fdf93305985be3def00aaf6cf7 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/36d380475d9056fdf93305985be3def00aaf6cf7 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 14:36:24 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 10:36:24 -0400 Subject: [Git][ghc/ghc][master] 3 commits: codegen: fix memset unroll for small bytearrays, add 64-bit sets Message-ID: <5cacade8b9972_62b33fa2872a81f42513310@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - 11 changed files: - compiler/basicTypes/BasicTypes.hs - compiler/codeGen/StgCmmPrim.hs - compiler/main/DynFlags.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Ppr.hs - compiler/utils/Util.hs - docs/users_guide/8.10.1-notes.rst - testsuite/driver/testlib.py - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs Changes: ===================================== compiler/basicTypes/BasicTypes.hs ===================================== @@ -26,7 +26,7 @@ module BasicTypes( Arity, RepArity, JoinArity, - Alignment, + Alignment, mkAlignment, alignmentOf, alignmentBytes, PromotionFlag(..), isPromoted, FunctionOrData(..), @@ -116,6 +116,7 @@ import Outputable import SrcLoc ( Located,unLoc ) import Data.Data hiding (Fixity, Prefix, Infix) import Data.Function (on) +import Data.Bits {- ************************************************************************ @@ -196,8 +197,39 @@ fIRST_TAG = 1 ************************************************************************ -} -type Alignment = Int -- align to next N-byte boundary (N must be a power of 2). - +-- | A power-of-two alignment +newtype Alignment = Alignment { alignmentBytes :: Int } deriving (Eq, Ord) + +-- Builds an alignment, throws on non power of 2 input. This is not +-- ideal, but convenient for internal use and better then silently +-- passing incorrect data. +mkAlignment :: Int -> Alignment +mkAlignment n + | n == 1 = Alignment 1 + | n == 2 = Alignment 2 + | n == 4 = Alignment 4 + | n == 8 = Alignment 8 + | n == 16 = Alignment 16 + | n == 32 = Alignment 32 + | n == 64 = Alignment 64 + | n == 128 = Alignment 128 + | n == 256 = Alignment 256 + | n == 512 = Alignment 512 + | otherwise = panic "mkAlignment: received either a non power of 2 argument or > 512" + +-- Calculates an alignment of a number. x is aligned at N bytes means +-- the remainder from x / N is zero. Currently, interested in N <= 8, +-- but can be expanded to N <= 16 or N <= 32 if used within SSE or AVX +-- context. +alignmentOf :: Int -> Alignment +alignmentOf x = case x .&. 7 of + 0 -> Alignment 8 + 4 -> Alignment 4 + 2 -> Alignment 2 + _ -> Alignment 1 + +instance Outputable Alignment where + ppr (Alignment m) = ppr m {- ************************************************************************ * * ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2073,10 +2073,17 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- character. doSetByteArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () -doSetByteArrayOp ba off len c - = do dflags <- getDynFlags - p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off - emitMemsetCall p c len 1 +doSetByteArrayOp ba off len c = do + dflags <- getDynFlags + + let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap + offsetAlignment = case off of + CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) + _ -> mkAlignment 1 + align = min byteArrayAlignment offsetAlignment + + p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off + emitMemsetCall p c len align -- ---------------------------------------------------------------------------- -- Allocating arrays @@ -2347,7 +2354,7 @@ emitSetCards dst_start dst_cards_start n = do emitMemsetCall (cmmAddWord dflags dst_cards_start start_card) (mkIntExpr dflags 1) (cmmAddWord dflags (cmmSubWord dflags end_card start_card) (mkIntExpr dflags 1)) - 1 -- no alignment (1 byte) + (mkAlignment 1) -- no alignment (1 byte) -- Convert an element index to a card index cardCmm :: DynFlags -> CmmExpr -> CmmExpr @@ -2473,11 +2480,11 @@ emitMemmoveCall dst src n align = do -- | Emit a call to @memset at . The second argument must fit inside an -- unsigned char. -emitMemsetCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemsetCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemsetCall dst c n align = do emitPrimCall [ {- no results -} ] - (MO_Memset align) + (MO_Memset (alignmentBytes align)) [ dst, c, n ] emitMemcmpCall :: LocalReg -> CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () ===================================== compiler/main/DynFlags.hs ===================================== @@ -147,6 +147,7 @@ module DynFlags ( #include "GHCConstantsHaskellExports.hs" bLOCK_SIZE_W, wORD_SIZE_IN_BITS, + wordAlignment, tAG_MASK, mAX_PTR_TAG, tARGET_MIN_INT, tARGET_MAX_INT, tARGET_MAX_WORD, @@ -205,7 +206,7 @@ import Maybes import MonadUtils import qualified Pretty import SrcLoc -import BasicTypes ( IntWithInf, treatZeroAsInf ) +import BasicTypes ( Alignment, alignmentOf, IntWithInf, treatZeroAsInf ) import FastString import Fingerprint import Outputable @@ -5661,6 +5662,9 @@ bLOCK_SIZE_W dflags = bLOCK_SIZE dflags `quot` wORD_SIZE dflags wORD_SIZE_IN_BITS :: DynFlags -> Int wORD_SIZE_IN_BITS dflags = wORD_SIZE dflags * 8 +wordAlignment :: DynFlags -> Alignment +wordAlignment dflags = alignmentOf (wORD_SIZE dflags) + tAG_MASK :: DynFlags -> Int tAG_MASK dflags = (1 `shiftL` tAG_BITS dflags) - 1 ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -128,7 +128,7 @@ cmmTopCodeGen (CmmProc info lab live graph) = do Nothing -> return tops cmmTopCodeGen (CmmData sec dat) = do - return [CmmData sec (1, dat)] -- no translation, we just use CmmStatic + return [CmmData sec (mkAlignment 1, dat)] -- no translation, we just use CmmStatic basicBlockCodeGen @@ -569,7 +569,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = return (Any format code) | otherwise = do - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit loadFloatAmode True w addr code float_const_x87 = case w of @@ -583,7 +583,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = in return (Any FF80 code) _otherwise -> do - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit loadFloatAmode False w addr code -- catch simple cases of zero- or sign-extended load @@ -1247,7 +1247,7 @@ getNonClobberedOperand (CmmLit lit) = do if use_sse2 && isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit return (OpAddr addr, code) else do @@ -1303,7 +1303,7 @@ getOperand (CmmLit lit) = do if (use_sse2 && isSuitableFloatingPointLit lit) then do let CmmFloat _ w = lit - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit return (OpAddr addr, code) else do @@ -1351,7 +1351,7 @@ addAlignmentCheck align reg = , JXX_GBL NE $ ImmCLbl mkBadAlignmentLabel ] -memConstant :: Int -> CmmLit -> NatM Amode +memConstant :: Alignment -> CmmLit -> NatM Amode memConstant align lit = do lbl <- getNewLabelNat let rosection = Section ReadOnlyData lbl @@ -1848,17 +1848,25 @@ genCCall dflags _ (PrimTarget (MO_Memset align)) _ CmmLit (CmmInt c _), CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemsetInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemsetInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format - return $ code_dst dst_r `appOL` go dst_r (fromInteger n) + if format == II64 && n >= 8 then do + code_imm8byte <- getAnyReg (CmmLit (CmmInt c8 W64)) + imm8byte_r <- getNewRegNat II64 + return $ code_dst dst_r `appOL` + code_imm8byte imm8byte_r `appOL` + go8 dst_r imm8byte_r (fromInteger n) + else + return $ code_dst dst_r `appOL` + go4 dst_r (fromInteger n) where - (format, val) = case align .&. 3 of - 2 -> (II16, c2) - 0 -> (II32, c4) - _ -> (II8, c) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment c2 = c `shiftL` 8 .|. c c4 = c2 `shiftL` 16 .|. c2 + c8 = c4 `shiftL` 32 .|. c4 -- The number of instructions we will generate (approx). We need 1 -- instructions per move. @@ -1868,25 +1876,45 @@ genCCall dflags _ (PrimTarget (MO_Memset align)) _ sizeBytes :: Integer sizeBytes = fromIntegral (formatInBytes format) - go :: Reg -> Integer -> OrdList Instr - go dst i - -- TODO: Add movabs instruction and support 64-bit sets. - | i >= sizeBytes = -- This might be smaller than the below sizes - unitOL (MOV format (OpImm (ImmInteger val)) (OpAddr dst_addr)) `appOL` - go dst (i - sizeBytes) - | i >= 4 = -- Will never happen on 32-bit - unitOL (MOV II32 (OpImm (ImmInteger c4)) (OpAddr dst_addr)) `appOL` - go dst (i - 4) - | i >= 2 = - unitOL (MOV II16 (OpImm (ImmInteger c2)) (OpAddr dst_addr)) `appOL` - go dst (i - 2) - | i >= 1 = - unitOL (MOV II8 (OpImm (ImmInteger c)) (OpAddr dst_addr)) `appOL` - go dst (i - 1) - | otherwise = nilOL + -- Depending on size returns the widest MOV instruction and its + -- width. + gen4 :: AddrMode -> Integer -> (InstrBlock, Integer) + gen4 addr size + | size >= 4 = + (unitOL (MOV II32 (OpImm (ImmInteger c4)) (OpAddr addr)), 4) + | size >= 2 = + (unitOL (MOV II16 (OpImm (ImmInteger c2)) (OpAddr addr)), 2) + | size >= 1 = + (unitOL (MOV II8 (OpImm (ImmInteger c)) (OpAddr addr)), 1) + | otherwise = (nilOL, 0) + + -- Generates a 64-bit wide MOV instruction from REG to MEM. + gen8 :: AddrMode -> Reg -> InstrBlock + gen8 addr reg8byte = + unitOL (MOV format (OpReg reg8byte) (OpAddr addr)) + + -- Unrolls memset when the widest MOV is <= 4 bytes. + go4 :: Reg -> Integer -> InstrBlock + go4 dst left = + if left <= 0 then nilOL + else curMov `appOL` go4 dst (left - curWidth) where - dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone - (ImmInteger (n - i)) + possibleWidth = minimum [left, sizeBytes] + dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone (ImmInteger (n - left)) + (curMov, curWidth) = gen4 dst_addr possibleWidth + + -- Unrolls memset when the widest MOV is 8 bytes (thus another Reg + -- argument). Falls back to go4 when all 8 byte moves are + -- exhausted. + go8 :: Reg -> Reg -> Integer -> InstrBlock + go8 dst reg8byte left = + if possibleWidth >= 8 then + let curMov = gen8 dst_addr reg8byte + in curMov `appOL` go8 dst reg8byte (left - 8) + else go4 dst left + where + possibleWidth = minimum [left, sizeBytes] + dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone (ImmInteger (n - left)) genCCall _ _ (PrimTarget MO_WriteBarrier) _ _ _ = return nilOL -- write barrier compiles to no code on x86/x86-64; @@ -2322,7 +2350,7 @@ genCCall _ is32Bit target dest_regs args bid = do let const | FF32 <- fmt = CmmInt 0x7fffffff W32 | otherwise = CmmInt 0x7fffffffffffffff W64 - Amode amode amode_code <- memConstant (widthInBytes w) const + Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const tmp <- getNewRegNat fmt let code dst = x_code dst `appOL` amode_code `appOL` toOL [ @@ -3051,7 +3079,7 @@ createJumpTable dflags ids section lbl where blockLabel = blockLbl blockid in map jumpTableEntryRel ids | otherwise = map (jumpTableEntry dflags) ids - in CmmData section (1, Statics lbl jumpTable) + in CmmData section (mkAlignment 1, Statics lbl jumpTable) extractUnwindPoints :: [Instr] -> [UnwindPoint] extractUnwindPoints instrs = @@ -3418,7 +3446,7 @@ sse2NegCode w x = do x at FF80 -> wrongFmt x where wrongFmt x = panic $ "sse2NegCode: " ++ show x - Amode amode amode_code <- memConstant (widthInBytes w) const + Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const tmp <- getNewRegNat fmt let code dst = x_code dst `appOL` amode_code `appOL` toOL [ ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -36,7 +36,7 @@ import PprBase import Hoopl.Collections import Hoopl.Label -import BasicTypes (Alignment) +import BasicTypes (Alignment, mkAlignment, alignmentBytes) import DynFlags import Cmm hiding (topInfoTable) import BlockId @@ -72,7 +72,7 @@ import Data.Bits pprProcAlignment :: SDoc pprProcAlignment = sdocWithDynFlags $ \dflags -> - (maybe empty pprAlign . cmmProcAlignment $ dflags) + (maybe empty (pprAlign . mkAlignment) (cmmProcAlignment dflags)) pprNatCmmDecl :: NatCmmDecl (Alignment, CmmStatics) Instr -> SDoc pprNatCmmDecl (CmmData section dats) = @@ -236,14 +236,15 @@ pprLabel lbl = pprGloblDecl lbl $$ pprTypeDecl lbl $$ (ppr lbl <> char ':') -pprAlign :: Int -> SDoc -pprAlign bytes +pprAlign :: Alignment -> SDoc +pprAlign alignment = sdocWithPlatform $ \platform -> - text ".align " <> int (alignment platform) + text ".align " <> int (alignmentOn platform) where - alignment platform = if platformOS platform == OSDarwin - then log2 bytes - else bytes + bytes = alignmentBytes alignment + alignmentOn platform = if platformOS platform == OSDarwin + then log2 bytes + else bytes log2 :: Int -> Int -- cache the common ones log2 1 = 0 ===================================== compiler/utils/Util.hs ===================================== @@ -1149,7 +1149,6 @@ exactLog2 x pow2 x | x == 1 = 0 | otherwise = 1 + pow2 (x `shiftR` 1) - {- -- ----------------------------------------------------------------------------- -- Floats ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,6 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. +- Calls to `memset` are now unrolled more aggressively and the + produced code is more efficient on `x86_64` with added support for + 64-bit `MOV`s. In particular, `setByteArray#` calls that were not + optimized before, now will be. See :ghc-ticket:`16052`. + Runtime system ~~~~~~~~~~~~~~ ===================================== testsuite/driver/testlib.py ===================================== @@ -1131,9 +1131,9 @@ def do_compile(name, way, should_fail, top_mod, extra_mods, extra_hc_opts, **kwa # no problems found, this test passed return passed() -def compile_cmp_asm( name, way, extra_hc_opts ): +def compile_cmp_asm( name, way, ext, extra_hc_opts ): print('Compile only, extra args = ', extra_hc_opts) - result = simple_build(name + '.cmm', way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) + result = simple_build(name + '.' + ext, way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) if badResult(result): return result @@ -1153,6 +1153,24 @@ def compile_cmp_asm( name, way, extra_hc_opts ): # no problems found, this test passed return passed() +def compile_grep_asm( name, way, ext, is_substring, extra_hc_opts ): + print('Compile only, extra args = ', extra_hc_opts) + result = simple_build(name + '.' + ext, way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) + + if badResult(result): + return result + + expected_pat_file = find_expected_file(name, 'asm') + actual_asm_file = add_suffix(name, 's') + + if not grep_output(join_normalisers(normalise_errmsg), + expected_pat_file, actual_asm_file, + is_substring): + return failBecause('asm mismatch') + + # no problems found, this test passed + return passed() + # ----------------------------------------------------------------------------- # Compile-and-run tests @@ -1735,6 +1753,43 @@ def compare_outputs(way, kind, normaliser, expected_file, actual_file, diff_file else: return False +# Checks that each line from pattern_file is present in actual_file as +# a substring or regex pattern depending on is_substring. +def grep_output(normaliser, pattern_file, actual_file, is_substring=True): + expected_path = in_srcdir(pattern_file) + actual_path = in_testdir(actual_file) + + expected_patterns = read_no_crs(expected_path).strip().split('\n') + actual_raw = read_no_crs(actual_path) + actual_str = normaliser(actual_raw) + + success = True + failed_patterns = [] + + def regex_match(pat, actual): + return re.search(pat, actual) is not None + + def substring_match(pat, actual): + return pat in actual + + def is_match(pat, actual): + if is_substring: + return substring_match(pat, actual) + else: + return regex_match(pat, actual) + + for pat in expected_patterns: + if not is_match(pat, actual_str): + success = False + failed_patterns.append(pat) + + if not success: + print('Actual output does not contain the following patterns:') + for pat in failed_patterns: + print(pat) + + return success + # Note [Output comparison] # # We do two types of output comparison: ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -3,7 +3,8 @@ is_amd64_codegen = [ when(unregisterised(), skip), ] -test('memcpy', is_amd64_codegen, compile_cmp_asm, ['']) -test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['']) -test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['']) -test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['']) +test('memcpy', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm ===================================== @@ -0,0 +1,6 @@ +movq $72340172838076673,%rcx +movq %rcx,0(%rbx) +movq %rcx,8(%rbx) +movl $16843009,16(%rbx) +movw $257,20(%rbx) +movb $1,22(%rbx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs ===================================== @@ -0,0 +1,17 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module FillArray + ( fill + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +fill :: IO ByteArray +fill = IO $ \s0 -> case newByteArray# 24# s0 of + (# s1, m #) -> case setByteArray# m 0# 23# 1# s1 of + s2 -> case unsafeFreezeByteArray# m s2 of + (# s3, r #) -> (# s3, ByteArray r #) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/36d380475d9056fdf93305985be3def00aaf6cf7...14a7870770fb2dd2e096bc13e8c927d49c868911 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/36d380475d9056fdf93305985be3def00aaf6cf7...14a7870770fb2dd2e096bc13e8c927d49c868911 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 16:34:32 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 12:34:32 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/fix-marge Message-ID: <5cacc998c5a2f_62b33fa2ccc0a81025438ef@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/fix-marge at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/fix-marge You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 16:35:04 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 12:35:04 -0400 Subject: [Git][ghc/ghc][wip/fix-marge] 13 commits: Fix #16282. Message-ID: <5cacc9b81aa0d_62b33fa2e6cd4844254467f@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/fix-marge at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 6f20b58d by Ben Gamari at 2019-04-09T16:34:58Z gitlab: Don't run lint-submods job on Marge jobs This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - libraries/ghc-prim/cbits/bitrev.c - libraries/hpc - settings.in - testsuite/driver/testlib.py - + testsuite/tests/codeGen/should_run/NewSmallArray.hs - + testsuite/tests/codeGen/should_run/NewSmallArray.stdout - testsuite/tests/codeGen/should_run/all.T - + testsuite/tests/driver/T16500/A.hs - + testsuite/tests/driver/T16500/B.hs - + testsuite/tests/driver/T16500/Makefile - + testsuite/tests/driver/T16500/T16500.stdout - + testsuite/tests/driver/T16500/all.T - testsuite/tests/ffi/should_compile/all.T - + testsuite/tests/ffi/should_compile/cc017.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/5bcaf1e3fe9c2df0ab00fe7e2c65c798a10f617b...6f20b58d777341387f2d829e7c167990f72844e5 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/5bcaf1e3fe9c2df0ab00fe7e2c65c798a10f617b...6f20b58d777341387f2d829e7c167990f72844e5 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 16:38:38 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 12:38:38 -0400 Subject: [Git][ghc/ghc][wip/fix-marge] gitlab: Don't run lint-submods job on Marge branches Message-ID: <5cacca8e7db67_62b33fa2ccc0a81025461dd@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/fix-marge at Glasgow Haskell Compiler / GHC Commits: 91cf8fc1 by Ben Gamari at 2019-04-09T16:38:23Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -80,7 +80,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/91cf8fc1a701831cbd2aa6018b5220f02ecad118 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/91cf8fc1a701831cbd2aa6018b5220f02ecad118 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 16:40:08 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 12:40:08 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 6 commits: users-guide: Document how to disable package environments Message-ID: <5caccae8399c2_62b34bdd54025479af@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - 97b412e5 by Sylvain Henry at 2019-04-09T16:39:57Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 8e81d0ca by Ben Gamari at 2019-04-09T16:39:57Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - 15 changed files: - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/codeGen/StgCmmPrim.hs - compiler/main/DynFlags.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Ppr.hs - compiler/utils/Util.hs - docs/users_guide/8.10.1-notes.rst - docs/users_guide/packages.rst - hadrian/src/Rules/BinaryDist.hs - testsuite/driver/testlib.py - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs Changes: ===================================== .gitlab-ci.yml ===================================== @@ -4,6 +4,10 @@ variables: # Commit of ghc/ci-images repository from which to pull Docker images DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + # Sequential version number capturing the versions of all tools fetched by + # .gitlab/win32-init.sh. + WINDOWS_TOOLCHAIN_VERSION: 1 + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive @@ -525,7 +529,7 @@ validate-x86_64-windows-hadrian: variables: MSYSTEM: MINGW64 cache: - key: x86_64-windows-hadrian + key: "x86_64-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows-hadrian: extends: .build-windows-hadrian @@ -535,7 +539,7 @@ nightly-i386-windows-hadrian: variables: - $NIGHTLY cache: - key: i386-windows-hadrian + key: "i386-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" .build-windows-make: extends: .build-windows @@ -571,7 +575,7 @@ validate-x86_64-windows: MSYSTEM: MINGW64 CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" cache: - key: x86_64-windows + key: "x86_64-windows-$WINDOWS_TOOLCHAIN_VERSION" # Normal Windows validate builds are profiled; that won't do for releases. release-x86_64-windows: @@ -592,7 +596,7 @@ release-i386-windows: BUILD_FLAVOUR: "perf" CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows: extends: .build-windows-make @@ -603,7 +607,7 @@ nightly-i386-windows: MSYSTEM: MINGW32 CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" ############################################################ # Cleanup ===================================== .gitlab/win32-init.sh ===================================== @@ -27,7 +27,8 @@ if [ ! -e $toolchain/bin/ghc ]; then fi if [ ! -e $toolchain/bin/cabal ]; then - curl https://www.haskell.org/cabal/release/cabal-install-2.2.0.0/cabal-install-2.2.0.0-i386-unknown-mingw32.zip > /tmp/cabal.zip + url="https://downloads.haskell.org/~cabal/cabal-install-latest/cabal-install-2.4.1.0-x86_64-unknown-mingw32.zip" + curl $url > /tmp/cabal.zip unzip /tmp/cabal.zip mv cabal.exe $toolchain/bin fi ===================================== compiler/basicTypes/BasicTypes.hs ===================================== @@ -26,7 +26,7 @@ module BasicTypes( Arity, RepArity, JoinArity, - Alignment, + Alignment, mkAlignment, alignmentOf, alignmentBytes, PromotionFlag(..), isPromoted, FunctionOrData(..), @@ -116,6 +116,7 @@ import Outputable import SrcLoc ( Located,unLoc ) import Data.Data hiding (Fixity, Prefix, Infix) import Data.Function (on) +import Data.Bits {- ************************************************************************ @@ -196,8 +197,39 @@ fIRST_TAG = 1 ************************************************************************ -} -type Alignment = Int -- align to next N-byte boundary (N must be a power of 2). - +-- | A power-of-two alignment +newtype Alignment = Alignment { alignmentBytes :: Int } deriving (Eq, Ord) + +-- Builds an alignment, throws on non power of 2 input. This is not +-- ideal, but convenient for internal use and better then silently +-- passing incorrect data. +mkAlignment :: Int -> Alignment +mkAlignment n + | n == 1 = Alignment 1 + | n == 2 = Alignment 2 + | n == 4 = Alignment 4 + | n == 8 = Alignment 8 + | n == 16 = Alignment 16 + | n == 32 = Alignment 32 + | n == 64 = Alignment 64 + | n == 128 = Alignment 128 + | n == 256 = Alignment 256 + | n == 512 = Alignment 512 + | otherwise = panic "mkAlignment: received either a non power of 2 argument or > 512" + +-- Calculates an alignment of a number. x is aligned at N bytes means +-- the remainder from x / N is zero. Currently, interested in N <= 8, +-- but can be expanded to N <= 16 or N <= 32 if used within SSE or AVX +-- context. +alignmentOf :: Int -> Alignment +alignmentOf x = case x .&. 7 of + 0 -> Alignment 8 + 4 -> Alignment 4 + 2 -> Alignment 2 + _ -> Alignment 1 + +instance Outputable Alignment where + ppr (Alignment m) = ppr m {- ************************************************************************ * * ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2073,10 +2073,17 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- character. doSetByteArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () -doSetByteArrayOp ba off len c - = do dflags <- getDynFlags - p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off - emitMemsetCall p c len 1 +doSetByteArrayOp ba off len c = do + dflags <- getDynFlags + + let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap + offsetAlignment = case off of + CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) + _ -> mkAlignment 1 + align = min byteArrayAlignment offsetAlignment + + p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off + emitMemsetCall p c len align -- ---------------------------------------------------------------------------- -- Allocating arrays @@ -2347,7 +2354,7 @@ emitSetCards dst_start dst_cards_start n = do emitMemsetCall (cmmAddWord dflags dst_cards_start start_card) (mkIntExpr dflags 1) (cmmAddWord dflags (cmmSubWord dflags end_card start_card) (mkIntExpr dflags 1)) - 1 -- no alignment (1 byte) + (mkAlignment 1) -- no alignment (1 byte) -- Convert an element index to a card index cardCmm :: DynFlags -> CmmExpr -> CmmExpr @@ -2473,11 +2480,11 @@ emitMemmoveCall dst src n align = do -- | Emit a call to @memset at . The second argument must fit inside an -- unsigned char. -emitMemsetCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemsetCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemsetCall dst c n align = do emitPrimCall [ {- no results -} ] - (MO_Memset align) + (MO_Memset (alignmentBytes align)) [ dst, c, n ] emitMemcmpCall :: LocalReg -> CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () ===================================== compiler/main/DynFlags.hs ===================================== @@ -147,6 +147,7 @@ module DynFlags ( #include "GHCConstantsHaskellExports.hs" bLOCK_SIZE_W, wORD_SIZE_IN_BITS, + wordAlignment, tAG_MASK, mAX_PTR_TAG, tARGET_MIN_INT, tARGET_MAX_INT, tARGET_MAX_WORD, @@ -205,7 +206,7 @@ import Maybes import MonadUtils import qualified Pretty import SrcLoc -import BasicTypes ( IntWithInf, treatZeroAsInf ) +import BasicTypes ( Alignment, alignmentOf, IntWithInf, treatZeroAsInf ) import FastString import Fingerprint import Outputable @@ -5661,6 +5662,9 @@ bLOCK_SIZE_W dflags = bLOCK_SIZE dflags `quot` wORD_SIZE dflags wORD_SIZE_IN_BITS :: DynFlags -> Int wORD_SIZE_IN_BITS dflags = wORD_SIZE dflags * 8 +wordAlignment :: DynFlags -> Alignment +wordAlignment dflags = alignmentOf (wORD_SIZE dflags) + tAG_MASK :: DynFlags -> Int tAG_MASK dflags = (1 `shiftL` tAG_BITS dflags) - 1 ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -128,7 +128,7 @@ cmmTopCodeGen (CmmProc info lab live graph) = do Nothing -> return tops cmmTopCodeGen (CmmData sec dat) = do - return [CmmData sec (1, dat)] -- no translation, we just use CmmStatic + return [CmmData sec (mkAlignment 1, dat)] -- no translation, we just use CmmStatic basicBlockCodeGen @@ -569,7 +569,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = return (Any format code) | otherwise = do - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit loadFloatAmode True w addr code float_const_x87 = case w of @@ -583,7 +583,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = in return (Any FF80 code) _otherwise -> do - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit loadFloatAmode False w addr code -- catch simple cases of zero- or sign-extended load @@ -1247,7 +1247,7 @@ getNonClobberedOperand (CmmLit lit) = do if use_sse2 && isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit return (OpAddr addr, code) else do @@ -1303,7 +1303,7 @@ getOperand (CmmLit lit) = do if (use_sse2 && isSuitableFloatingPointLit lit) then do let CmmFloat _ w = lit - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit return (OpAddr addr, code) else do @@ -1351,7 +1351,7 @@ addAlignmentCheck align reg = , JXX_GBL NE $ ImmCLbl mkBadAlignmentLabel ] -memConstant :: Int -> CmmLit -> NatM Amode +memConstant :: Alignment -> CmmLit -> NatM Amode memConstant align lit = do lbl <- getNewLabelNat let rosection = Section ReadOnlyData lbl @@ -1848,17 +1848,25 @@ genCCall dflags _ (PrimTarget (MO_Memset align)) _ CmmLit (CmmInt c _), CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemsetInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemsetInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format - return $ code_dst dst_r `appOL` go dst_r (fromInteger n) + if format == II64 && n >= 8 then do + code_imm8byte <- getAnyReg (CmmLit (CmmInt c8 W64)) + imm8byte_r <- getNewRegNat II64 + return $ code_dst dst_r `appOL` + code_imm8byte imm8byte_r `appOL` + go8 dst_r imm8byte_r (fromInteger n) + else + return $ code_dst dst_r `appOL` + go4 dst_r (fromInteger n) where - (format, val) = case align .&. 3 of - 2 -> (II16, c2) - 0 -> (II32, c4) - _ -> (II8, c) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment c2 = c `shiftL` 8 .|. c c4 = c2 `shiftL` 16 .|. c2 + c8 = c4 `shiftL` 32 .|. c4 -- The number of instructions we will generate (approx). We need 1 -- instructions per move. @@ -1868,25 +1876,45 @@ genCCall dflags _ (PrimTarget (MO_Memset align)) _ sizeBytes :: Integer sizeBytes = fromIntegral (formatInBytes format) - go :: Reg -> Integer -> OrdList Instr - go dst i - -- TODO: Add movabs instruction and support 64-bit sets. - | i >= sizeBytes = -- This might be smaller than the below sizes - unitOL (MOV format (OpImm (ImmInteger val)) (OpAddr dst_addr)) `appOL` - go dst (i - sizeBytes) - | i >= 4 = -- Will never happen on 32-bit - unitOL (MOV II32 (OpImm (ImmInteger c4)) (OpAddr dst_addr)) `appOL` - go dst (i - 4) - | i >= 2 = - unitOL (MOV II16 (OpImm (ImmInteger c2)) (OpAddr dst_addr)) `appOL` - go dst (i - 2) - | i >= 1 = - unitOL (MOV II8 (OpImm (ImmInteger c)) (OpAddr dst_addr)) `appOL` - go dst (i - 1) - | otherwise = nilOL + -- Depending on size returns the widest MOV instruction and its + -- width. + gen4 :: AddrMode -> Integer -> (InstrBlock, Integer) + gen4 addr size + | size >= 4 = + (unitOL (MOV II32 (OpImm (ImmInteger c4)) (OpAddr addr)), 4) + | size >= 2 = + (unitOL (MOV II16 (OpImm (ImmInteger c2)) (OpAddr addr)), 2) + | size >= 1 = + (unitOL (MOV II8 (OpImm (ImmInteger c)) (OpAddr addr)), 1) + | otherwise = (nilOL, 0) + + -- Generates a 64-bit wide MOV instruction from REG to MEM. + gen8 :: AddrMode -> Reg -> InstrBlock + gen8 addr reg8byte = + unitOL (MOV format (OpReg reg8byte) (OpAddr addr)) + + -- Unrolls memset when the widest MOV is <= 4 bytes. + go4 :: Reg -> Integer -> InstrBlock + go4 dst left = + if left <= 0 then nilOL + else curMov `appOL` go4 dst (left - curWidth) where - dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone - (ImmInteger (n - i)) + possibleWidth = minimum [left, sizeBytes] + dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone (ImmInteger (n - left)) + (curMov, curWidth) = gen4 dst_addr possibleWidth + + -- Unrolls memset when the widest MOV is 8 bytes (thus another Reg + -- argument). Falls back to go4 when all 8 byte moves are + -- exhausted. + go8 :: Reg -> Reg -> Integer -> InstrBlock + go8 dst reg8byte left = + if possibleWidth >= 8 then + let curMov = gen8 dst_addr reg8byte + in curMov `appOL` go8 dst reg8byte (left - 8) + else go4 dst left + where + possibleWidth = minimum [left, sizeBytes] + dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone (ImmInteger (n - left)) genCCall _ _ (PrimTarget MO_WriteBarrier) _ _ _ = return nilOL -- write barrier compiles to no code on x86/x86-64; @@ -2322,7 +2350,7 @@ genCCall _ is32Bit target dest_regs args bid = do let const | FF32 <- fmt = CmmInt 0x7fffffff W32 | otherwise = CmmInt 0x7fffffffffffffff W64 - Amode amode amode_code <- memConstant (widthInBytes w) const + Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const tmp <- getNewRegNat fmt let code dst = x_code dst `appOL` amode_code `appOL` toOL [ @@ -3051,7 +3079,7 @@ createJumpTable dflags ids section lbl where blockLabel = blockLbl blockid in map jumpTableEntryRel ids | otherwise = map (jumpTableEntry dflags) ids - in CmmData section (1, Statics lbl jumpTable) + in CmmData section (mkAlignment 1, Statics lbl jumpTable) extractUnwindPoints :: [Instr] -> [UnwindPoint] extractUnwindPoints instrs = @@ -3418,7 +3446,7 @@ sse2NegCode w x = do x at FF80 -> wrongFmt x where wrongFmt x = panic $ "sse2NegCode: " ++ show x - Amode amode amode_code <- memConstant (widthInBytes w) const + Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const tmp <- getNewRegNat fmt let code dst = x_code dst `appOL` amode_code `appOL` toOL [ ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -36,7 +36,7 @@ import PprBase import Hoopl.Collections import Hoopl.Label -import BasicTypes (Alignment) +import BasicTypes (Alignment, mkAlignment, alignmentBytes) import DynFlags import Cmm hiding (topInfoTable) import BlockId @@ -72,7 +72,7 @@ import Data.Bits pprProcAlignment :: SDoc pprProcAlignment = sdocWithDynFlags $ \dflags -> - (maybe empty pprAlign . cmmProcAlignment $ dflags) + (maybe empty (pprAlign . mkAlignment) (cmmProcAlignment dflags)) pprNatCmmDecl :: NatCmmDecl (Alignment, CmmStatics) Instr -> SDoc pprNatCmmDecl (CmmData section dats) = @@ -236,14 +236,15 @@ pprLabel lbl = pprGloblDecl lbl $$ pprTypeDecl lbl $$ (ppr lbl <> char ':') -pprAlign :: Int -> SDoc -pprAlign bytes +pprAlign :: Alignment -> SDoc +pprAlign alignment = sdocWithPlatform $ \platform -> - text ".align " <> int (alignment platform) + text ".align " <> int (alignmentOn platform) where - alignment platform = if platformOS platform == OSDarwin - then log2 bytes - else bytes + bytes = alignmentBytes alignment + alignmentOn platform = if platformOS platform == OSDarwin + then log2 bytes + else bytes log2 :: Int -> Int -- cache the common ones log2 1 = 0 ===================================== compiler/utils/Util.hs ===================================== @@ -1149,7 +1149,6 @@ exactLog2 x pow2 x | x == 1 = 0 | otherwise = 1 + pow2 (x `shiftR` 1) - {- -- ----------------------------------------------------------------------------- -- Floats ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,6 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. +- Calls to `memset` are now unrolled more aggressively and the + produced code is more efficient on `x86_64` with added support for + 64-bit `MOV`s. In particular, `setByteArray#` calls that were not + optimized before, now will be. See :ghc-ticket:`16052`. + Runtime system ~~~~~~~~~~~~~~ ===================================== docs/users_guide/packages.rst ===================================== @@ -579,6 +579,12 @@ must be relative to the location of the package environment file. Use the package environment in ⟨file⟩, or in ``$HOME/.ghc/arch-os-version/environments/⟨name⟩`` + If set to ``-`` no package environment is read. + +.. envvar:: GHC_ENVIRONMENT + + Specifies the path to the package environment file to be used by GHC. + Overridden by the :ghc-flag:`-package-env ⟨file⟩|⟨name⟩` flag if set. In order, ``ghc`` will look for the package environment in the following locations: @@ -588,11 +594,11 @@ locations: - File ``$HOME/.ghc/arch-os-version/environments/name`` if you pass the option ``-package-env ⟨name⟩``. -- File ⟨file⟩ if the environment variable ``GHC_ENVIRONMENT`` is set to +- File ⟨file⟩ if the environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨file⟩. - File ``$HOME/.ghc/arch-os-version/environments/name`` if the - environment variable ``GHC_ENVIRONMENT`` is set to ⟨name⟩. + environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨name⟩. Additionally, unless ``-hide-all-packages`` is specified ``ghc`` will also look for the package environment in the following locations: ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -86,6 +86,12 @@ you can simply do: ./configure --prefix= [... other configure options ...] make install +In order to support @bin@ and @lib@ directories that don't sit next to each +other, the install script: + * installs programs into @LIBDIR/ghc-VERSION/bin@ + * installs libraries into @LIBDIR/ghc-VERSION/lib@ + * installs the wrappers scripts into @BINDIR@ directory + -} bindistRules :: Rules () @@ -268,6 +274,7 @@ bindistMakefile = unlines , "install: install_mingw update_package_db" , "" , "ActualBinsDir=${ghclibdir}/bin" + , "ActualLibsDir=${ghclibdir}/lib" , "WrapperBinsDir=${bindir}" , "" , "# We need to install binaries relative to libraries." @@ -288,10 +295,10 @@ bindistMakefile = unlines , "" , "LIBRARIES = $(wildcard ./lib/*)" , "install_lib:" - , "\t at echo \"Copying libraries to $(libdir)\"" - , "\t$(INSTALL_DIR) \"$(libdir)\"" + , "\t at echo \"Copying libraries to $(ActualLibsDir)\"" + , "\t$(INSTALL_DIR) \"$(ActualLibsDir)\"" , "\tfor i in $(LIBRARIES); do \\" - , "\t\tcp -R $$i \"$(libdir)/\"; \\" + , "\t\tcp -R $$i \"$(ActualLibsDir)/\"; \\" , "\tdone" , "" , "INCLUDES = $(wildcard ./include/*)" @@ -317,9 +324,9 @@ bindistMakefile = unlines , "\t$(foreach p, $(BINARY_NAMES),\\" , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ - "$(libdir),$(docdir),$(includedir)))" + "$(ActualLibsDir),$(docdir),$(includedir)))" , "" - , "PKG_CONFS = $(wildcard $(libdir)/package.conf.d/*)" + , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" , "\t at echo \"Updating the package DB\"" , "\t$(foreach p, $(PKG_CONFS),\\" ===================================== testsuite/driver/testlib.py ===================================== @@ -1131,9 +1131,9 @@ def do_compile(name, way, should_fail, top_mod, extra_mods, extra_hc_opts, **kwa # no problems found, this test passed return passed() -def compile_cmp_asm( name, way, extra_hc_opts ): +def compile_cmp_asm( name, way, ext, extra_hc_opts ): print('Compile only, extra args = ', extra_hc_opts) - result = simple_build(name + '.cmm', way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) + result = simple_build(name + '.' + ext, way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) if badResult(result): return result @@ -1153,6 +1153,24 @@ def compile_cmp_asm( name, way, extra_hc_opts ): # no problems found, this test passed return passed() +def compile_grep_asm( name, way, ext, is_substring, extra_hc_opts ): + print('Compile only, extra args = ', extra_hc_opts) + result = simple_build(name + '.' + ext, way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) + + if badResult(result): + return result + + expected_pat_file = find_expected_file(name, 'asm') + actual_asm_file = add_suffix(name, 's') + + if not grep_output(join_normalisers(normalise_errmsg), + expected_pat_file, actual_asm_file, + is_substring): + return failBecause('asm mismatch') + + # no problems found, this test passed + return passed() + # ----------------------------------------------------------------------------- # Compile-and-run tests @@ -1735,6 +1753,43 @@ def compare_outputs(way, kind, normaliser, expected_file, actual_file, diff_file else: return False +# Checks that each line from pattern_file is present in actual_file as +# a substring or regex pattern depending on is_substring. +def grep_output(normaliser, pattern_file, actual_file, is_substring=True): + expected_path = in_srcdir(pattern_file) + actual_path = in_testdir(actual_file) + + expected_patterns = read_no_crs(expected_path).strip().split('\n') + actual_raw = read_no_crs(actual_path) + actual_str = normaliser(actual_raw) + + success = True + failed_patterns = [] + + def regex_match(pat, actual): + return re.search(pat, actual) is not None + + def substring_match(pat, actual): + return pat in actual + + def is_match(pat, actual): + if is_substring: + return substring_match(pat, actual) + else: + return regex_match(pat, actual) + + for pat in expected_patterns: + if not is_match(pat, actual_str): + success = False + failed_patterns.append(pat) + + if not success: + print('Actual output does not contain the following patterns:') + for pat in failed_patterns: + print(pat) + + return success + # Note [Output comparison] # # We do two types of output comparison: ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -3,7 +3,8 @@ is_amd64_codegen = [ when(unregisterised(), skip), ] -test('memcpy', is_amd64_codegen, compile_cmp_asm, ['']) -test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['']) -test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['']) -test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['']) +test('memcpy', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm ===================================== @@ -0,0 +1,6 @@ +movq $72340172838076673,%rcx +movq %rcx,0(%rbx) +movq %rcx,8(%rbx) +movl $16843009,16(%rbx) +movw $257,20(%rbx) +movb $1,22(%rbx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs ===================================== @@ -0,0 +1,17 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module FillArray + ( fill + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +fill :: IO ByteArray +fill = IO $ \s0 -> case newByteArray# 24# s0 of + (# s1, m #) -> case setByteArray# m 0# 23# 1# s1 of + s2 -> case unsafeFreezeByteArray# m s2 of + (# s3, r #) -> (# s3, ByteArray r #) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/06b04e44e9ac588fbd54d1c6f904275de6d248c1...8e81d0ca39594f1c9ab1fb00ac80ef6a2cd85cb9 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/06b04e44e9ac588fbd54d1c6f904275de6d248c1...8e81d0ca39594f1c9ab1fb00ac80ef6a2cd85cb9 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 16:56:22 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 12:56:22 -0400 Subject: [Git][ghc/ghc][master] Hadrian: fix library install paths in bindist Makefile (#16498) Message-ID: <5cacceb673299_62b3e6102c025664bf@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 1 changed file: - hadrian/src/Rules/BinaryDist.hs Changes: ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -86,6 +86,12 @@ you can simply do: ./configure --prefix= [... other configure options ...] make install +In order to support @bin@ and @lib@ directories that don't sit next to each +other, the install script: + * installs programs into @LIBDIR/ghc-VERSION/bin@ + * installs libraries into @LIBDIR/ghc-VERSION/lib@ + * installs the wrappers scripts into @BINDIR@ directory + -} bindistRules :: Rules () @@ -268,6 +274,7 @@ bindistMakefile = unlines , "install: install_mingw update_package_db" , "" , "ActualBinsDir=${ghclibdir}/bin" + , "ActualLibsDir=${ghclibdir}/lib" , "WrapperBinsDir=${bindir}" , "" , "# We need to install binaries relative to libraries." @@ -288,10 +295,10 @@ bindistMakefile = unlines , "" , "LIBRARIES = $(wildcard ./lib/*)" , "install_lib:" - , "\t at echo \"Copying libraries to $(libdir)\"" - , "\t$(INSTALL_DIR) \"$(libdir)\"" + , "\t at echo \"Copying libraries to $(ActualLibsDir)\"" + , "\t$(INSTALL_DIR) \"$(ActualLibsDir)\"" , "\tfor i in $(LIBRARIES); do \\" - , "\t\tcp -R $$i \"$(libdir)/\"; \\" + , "\t\tcp -R $$i \"$(ActualLibsDir)/\"; \\" , "\tdone" , "" , "INCLUDES = $(wildcard ./include/*)" @@ -317,9 +324,9 @@ bindistMakefile = unlines , "\t$(foreach p, $(BINARY_NAMES),\\" , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ - "$(libdir),$(docdir),$(includedir)))" + "$(ActualLibsDir),$(docdir),$(includedir)))" , "" - , "PKG_CONFS = $(wildcard $(libdir)/package.conf.d/*)" + , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" , "\t at echo \"Updating the package DB\"" , "\t$(foreach p, $(PKG_CONFS),\\" View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/fe40ddd9f960d89fbb430962f642ee9b053a0492 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/fe40ddd9f960d89fbb430962f642ee9b053a0492 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 16:56:44 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 12:56:44 -0400 Subject: [Git][ghc/ghc][wip/bump-windows-cabal] 6 commits: users-guide: Document how to disable package environments Message-ID: <5caccecc27be3_62b3e2f636425666a@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/bump-windows-cabal at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - 15 changed files: - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/codeGen/StgCmmPrim.hs - compiler/main/DynFlags.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Ppr.hs - compiler/utils/Util.hs - docs/users_guide/8.10.1-notes.rst - docs/users_guide/packages.rst - hadrian/src/Rules/BinaryDist.hs - testsuite/driver/testlib.py - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs Changes: ===================================== .gitlab-ci.yml ===================================== @@ -4,6 +4,10 @@ variables: # Commit of ghc/ci-images repository from which to pull Docker images DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + # Sequential version number capturing the versions of all tools fetched by + # .gitlab/win32-init.sh. + WINDOWS_TOOLCHAIN_VERSION: 1 + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive @@ -525,7 +529,7 @@ validate-x86_64-windows-hadrian: variables: MSYSTEM: MINGW64 cache: - key: x86_64-windows-hadrian + key: "x86_64-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows-hadrian: extends: .build-windows-hadrian @@ -535,7 +539,7 @@ nightly-i386-windows-hadrian: variables: - $NIGHTLY cache: - key: i386-windows-hadrian + key: "i386-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" .build-windows-make: extends: .build-windows @@ -571,7 +575,7 @@ validate-x86_64-windows: MSYSTEM: MINGW64 CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" cache: - key: x86_64-windows + key: "x86_64-windows-$WINDOWS_TOOLCHAIN_VERSION" # Normal Windows validate builds are profiled; that won't do for releases. release-x86_64-windows: @@ -592,7 +596,7 @@ release-i386-windows: BUILD_FLAVOUR: "perf" CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows: extends: .build-windows-make @@ -603,7 +607,7 @@ nightly-i386-windows: MSYSTEM: MINGW32 CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" ############################################################ # Cleanup ===================================== .gitlab/win32-init.sh ===================================== @@ -27,7 +27,8 @@ if [ ! -e $toolchain/bin/ghc ]; then fi if [ ! -e $toolchain/bin/cabal ]; then - curl https://www.haskell.org/cabal/release/cabal-install-2.2.0.0/cabal-install-2.2.0.0-i386-unknown-mingw32.zip > /tmp/cabal.zip + url="https://downloads.haskell.org/~cabal/cabal-install-latest/cabal-install-2.4.1.0-x86_64-unknown-mingw32.zip" + curl $url > /tmp/cabal.zip unzip /tmp/cabal.zip mv cabal.exe $toolchain/bin fi ===================================== compiler/basicTypes/BasicTypes.hs ===================================== @@ -26,7 +26,7 @@ module BasicTypes( Arity, RepArity, JoinArity, - Alignment, + Alignment, mkAlignment, alignmentOf, alignmentBytes, PromotionFlag(..), isPromoted, FunctionOrData(..), @@ -116,6 +116,7 @@ import Outputable import SrcLoc ( Located,unLoc ) import Data.Data hiding (Fixity, Prefix, Infix) import Data.Function (on) +import Data.Bits {- ************************************************************************ @@ -196,8 +197,39 @@ fIRST_TAG = 1 ************************************************************************ -} -type Alignment = Int -- align to next N-byte boundary (N must be a power of 2). - +-- | A power-of-two alignment +newtype Alignment = Alignment { alignmentBytes :: Int } deriving (Eq, Ord) + +-- Builds an alignment, throws on non power of 2 input. This is not +-- ideal, but convenient for internal use and better then silently +-- passing incorrect data. +mkAlignment :: Int -> Alignment +mkAlignment n + | n == 1 = Alignment 1 + | n == 2 = Alignment 2 + | n == 4 = Alignment 4 + | n == 8 = Alignment 8 + | n == 16 = Alignment 16 + | n == 32 = Alignment 32 + | n == 64 = Alignment 64 + | n == 128 = Alignment 128 + | n == 256 = Alignment 256 + | n == 512 = Alignment 512 + | otherwise = panic "mkAlignment: received either a non power of 2 argument or > 512" + +-- Calculates an alignment of a number. x is aligned at N bytes means +-- the remainder from x / N is zero. Currently, interested in N <= 8, +-- but can be expanded to N <= 16 or N <= 32 if used within SSE or AVX +-- context. +alignmentOf :: Int -> Alignment +alignmentOf x = case x .&. 7 of + 0 -> Alignment 8 + 4 -> Alignment 4 + 2 -> Alignment 2 + _ -> Alignment 1 + +instance Outputable Alignment where + ppr (Alignment m) = ppr m {- ************************************************************************ * * ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2073,10 +2073,17 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- character. doSetByteArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () -doSetByteArrayOp ba off len c - = do dflags <- getDynFlags - p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off - emitMemsetCall p c len 1 +doSetByteArrayOp ba off len c = do + dflags <- getDynFlags + + let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap + offsetAlignment = case off of + CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) + _ -> mkAlignment 1 + align = min byteArrayAlignment offsetAlignment + + p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off + emitMemsetCall p c len align -- ---------------------------------------------------------------------------- -- Allocating arrays @@ -2347,7 +2354,7 @@ emitSetCards dst_start dst_cards_start n = do emitMemsetCall (cmmAddWord dflags dst_cards_start start_card) (mkIntExpr dflags 1) (cmmAddWord dflags (cmmSubWord dflags end_card start_card) (mkIntExpr dflags 1)) - 1 -- no alignment (1 byte) + (mkAlignment 1) -- no alignment (1 byte) -- Convert an element index to a card index cardCmm :: DynFlags -> CmmExpr -> CmmExpr @@ -2473,11 +2480,11 @@ emitMemmoveCall dst src n align = do -- | Emit a call to @memset at . The second argument must fit inside an -- unsigned char. -emitMemsetCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemsetCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemsetCall dst c n align = do emitPrimCall [ {- no results -} ] - (MO_Memset align) + (MO_Memset (alignmentBytes align)) [ dst, c, n ] emitMemcmpCall :: LocalReg -> CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () ===================================== compiler/main/DynFlags.hs ===================================== @@ -147,6 +147,7 @@ module DynFlags ( #include "GHCConstantsHaskellExports.hs" bLOCK_SIZE_W, wORD_SIZE_IN_BITS, + wordAlignment, tAG_MASK, mAX_PTR_TAG, tARGET_MIN_INT, tARGET_MAX_INT, tARGET_MAX_WORD, @@ -205,7 +206,7 @@ import Maybes import MonadUtils import qualified Pretty import SrcLoc -import BasicTypes ( IntWithInf, treatZeroAsInf ) +import BasicTypes ( Alignment, alignmentOf, IntWithInf, treatZeroAsInf ) import FastString import Fingerprint import Outputable @@ -5661,6 +5662,9 @@ bLOCK_SIZE_W dflags = bLOCK_SIZE dflags `quot` wORD_SIZE dflags wORD_SIZE_IN_BITS :: DynFlags -> Int wORD_SIZE_IN_BITS dflags = wORD_SIZE dflags * 8 +wordAlignment :: DynFlags -> Alignment +wordAlignment dflags = alignmentOf (wORD_SIZE dflags) + tAG_MASK :: DynFlags -> Int tAG_MASK dflags = (1 `shiftL` tAG_BITS dflags) - 1 ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -128,7 +128,7 @@ cmmTopCodeGen (CmmProc info lab live graph) = do Nothing -> return tops cmmTopCodeGen (CmmData sec dat) = do - return [CmmData sec (1, dat)] -- no translation, we just use CmmStatic + return [CmmData sec (mkAlignment 1, dat)] -- no translation, we just use CmmStatic basicBlockCodeGen @@ -569,7 +569,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = return (Any format code) | otherwise = do - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit loadFloatAmode True w addr code float_const_x87 = case w of @@ -583,7 +583,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = in return (Any FF80 code) _otherwise -> do - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit loadFloatAmode False w addr code -- catch simple cases of zero- or sign-extended load @@ -1247,7 +1247,7 @@ getNonClobberedOperand (CmmLit lit) = do if use_sse2 && isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit return (OpAddr addr, code) else do @@ -1303,7 +1303,7 @@ getOperand (CmmLit lit) = do if (use_sse2 && isSuitableFloatingPointLit lit) then do let CmmFloat _ w = lit - Amode addr code <- memConstant (widthInBytes w) lit + Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit return (OpAddr addr, code) else do @@ -1351,7 +1351,7 @@ addAlignmentCheck align reg = , JXX_GBL NE $ ImmCLbl mkBadAlignmentLabel ] -memConstant :: Int -> CmmLit -> NatM Amode +memConstant :: Alignment -> CmmLit -> NatM Amode memConstant align lit = do lbl <- getNewLabelNat let rosection = Section ReadOnlyData lbl @@ -1848,17 +1848,25 @@ genCCall dflags _ (PrimTarget (MO_Memset align)) _ CmmLit (CmmInt c _), CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemsetInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemsetInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format - return $ code_dst dst_r `appOL` go dst_r (fromInteger n) + if format == II64 && n >= 8 then do + code_imm8byte <- getAnyReg (CmmLit (CmmInt c8 W64)) + imm8byte_r <- getNewRegNat II64 + return $ code_dst dst_r `appOL` + code_imm8byte imm8byte_r `appOL` + go8 dst_r imm8byte_r (fromInteger n) + else + return $ code_dst dst_r `appOL` + go4 dst_r (fromInteger n) where - (format, val) = case align .&. 3 of - 2 -> (II16, c2) - 0 -> (II32, c4) - _ -> (II8, c) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment c2 = c `shiftL` 8 .|. c c4 = c2 `shiftL` 16 .|. c2 + c8 = c4 `shiftL` 32 .|. c4 -- The number of instructions we will generate (approx). We need 1 -- instructions per move. @@ -1868,25 +1876,45 @@ genCCall dflags _ (PrimTarget (MO_Memset align)) _ sizeBytes :: Integer sizeBytes = fromIntegral (formatInBytes format) - go :: Reg -> Integer -> OrdList Instr - go dst i - -- TODO: Add movabs instruction and support 64-bit sets. - | i >= sizeBytes = -- This might be smaller than the below sizes - unitOL (MOV format (OpImm (ImmInteger val)) (OpAddr dst_addr)) `appOL` - go dst (i - sizeBytes) - | i >= 4 = -- Will never happen on 32-bit - unitOL (MOV II32 (OpImm (ImmInteger c4)) (OpAddr dst_addr)) `appOL` - go dst (i - 4) - | i >= 2 = - unitOL (MOV II16 (OpImm (ImmInteger c2)) (OpAddr dst_addr)) `appOL` - go dst (i - 2) - | i >= 1 = - unitOL (MOV II8 (OpImm (ImmInteger c)) (OpAddr dst_addr)) `appOL` - go dst (i - 1) - | otherwise = nilOL + -- Depending on size returns the widest MOV instruction and its + -- width. + gen4 :: AddrMode -> Integer -> (InstrBlock, Integer) + gen4 addr size + | size >= 4 = + (unitOL (MOV II32 (OpImm (ImmInteger c4)) (OpAddr addr)), 4) + | size >= 2 = + (unitOL (MOV II16 (OpImm (ImmInteger c2)) (OpAddr addr)), 2) + | size >= 1 = + (unitOL (MOV II8 (OpImm (ImmInteger c)) (OpAddr addr)), 1) + | otherwise = (nilOL, 0) + + -- Generates a 64-bit wide MOV instruction from REG to MEM. + gen8 :: AddrMode -> Reg -> InstrBlock + gen8 addr reg8byte = + unitOL (MOV format (OpReg reg8byte) (OpAddr addr)) + + -- Unrolls memset when the widest MOV is <= 4 bytes. + go4 :: Reg -> Integer -> InstrBlock + go4 dst left = + if left <= 0 then nilOL + else curMov `appOL` go4 dst (left - curWidth) where - dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone - (ImmInteger (n - i)) + possibleWidth = minimum [left, sizeBytes] + dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone (ImmInteger (n - left)) + (curMov, curWidth) = gen4 dst_addr possibleWidth + + -- Unrolls memset when the widest MOV is 8 bytes (thus another Reg + -- argument). Falls back to go4 when all 8 byte moves are + -- exhausted. + go8 :: Reg -> Reg -> Integer -> InstrBlock + go8 dst reg8byte left = + if possibleWidth >= 8 then + let curMov = gen8 dst_addr reg8byte + in curMov `appOL` go8 dst reg8byte (left - 8) + else go4 dst left + where + possibleWidth = minimum [left, sizeBytes] + dst_addr = AddrBaseIndex (EABaseReg dst) EAIndexNone (ImmInteger (n - left)) genCCall _ _ (PrimTarget MO_WriteBarrier) _ _ _ = return nilOL -- write barrier compiles to no code on x86/x86-64; @@ -2322,7 +2350,7 @@ genCCall _ is32Bit target dest_regs args bid = do let const | FF32 <- fmt = CmmInt 0x7fffffff W32 | otherwise = CmmInt 0x7fffffffffffffff W64 - Amode amode amode_code <- memConstant (widthInBytes w) const + Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const tmp <- getNewRegNat fmt let code dst = x_code dst `appOL` amode_code `appOL` toOL [ @@ -3051,7 +3079,7 @@ createJumpTable dflags ids section lbl where blockLabel = blockLbl blockid in map jumpTableEntryRel ids | otherwise = map (jumpTableEntry dflags) ids - in CmmData section (1, Statics lbl jumpTable) + in CmmData section (mkAlignment 1, Statics lbl jumpTable) extractUnwindPoints :: [Instr] -> [UnwindPoint] extractUnwindPoints instrs = @@ -3418,7 +3446,7 @@ sse2NegCode w x = do x at FF80 -> wrongFmt x where wrongFmt x = panic $ "sse2NegCode: " ++ show x - Amode amode amode_code <- memConstant (widthInBytes w) const + Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const tmp <- getNewRegNat fmt let code dst = x_code dst `appOL` amode_code `appOL` toOL [ ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -36,7 +36,7 @@ import PprBase import Hoopl.Collections import Hoopl.Label -import BasicTypes (Alignment) +import BasicTypes (Alignment, mkAlignment, alignmentBytes) import DynFlags import Cmm hiding (topInfoTable) import BlockId @@ -72,7 +72,7 @@ import Data.Bits pprProcAlignment :: SDoc pprProcAlignment = sdocWithDynFlags $ \dflags -> - (maybe empty pprAlign . cmmProcAlignment $ dflags) + (maybe empty (pprAlign . mkAlignment) (cmmProcAlignment dflags)) pprNatCmmDecl :: NatCmmDecl (Alignment, CmmStatics) Instr -> SDoc pprNatCmmDecl (CmmData section dats) = @@ -236,14 +236,15 @@ pprLabel lbl = pprGloblDecl lbl $$ pprTypeDecl lbl $$ (ppr lbl <> char ':') -pprAlign :: Int -> SDoc -pprAlign bytes +pprAlign :: Alignment -> SDoc +pprAlign alignment = sdocWithPlatform $ \platform -> - text ".align " <> int (alignment platform) + text ".align " <> int (alignmentOn platform) where - alignment platform = if platformOS platform == OSDarwin - then log2 bytes - else bytes + bytes = alignmentBytes alignment + alignmentOn platform = if platformOS platform == OSDarwin + then log2 bytes + else bytes log2 :: Int -> Int -- cache the common ones log2 1 = 0 ===================================== compiler/utils/Util.hs ===================================== @@ -1149,7 +1149,6 @@ exactLog2 x pow2 x | x == 1 = 0 | otherwise = 1 + pow2 (x `shiftR` 1) - {- -- ----------------------------------------------------------------------------- -- Floats ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,6 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. +- Calls to `memset` are now unrolled more aggressively and the + produced code is more efficient on `x86_64` with added support for + 64-bit `MOV`s. In particular, `setByteArray#` calls that were not + optimized before, now will be. See :ghc-ticket:`16052`. + Runtime system ~~~~~~~~~~~~~~ ===================================== docs/users_guide/packages.rst ===================================== @@ -579,6 +579,12 @@ must be relative to the location of the package environment file. Use the package environment in ⟨file⟩, or in ``$HOME/.ghc/arch-os-version/environments/⟨name⟩`` + If set to ``-`` no package environment is read. + +.. envvar:: GHC_ENVIRONMENT + + Specifies the path to the package environment file to be used by GHC. + Overridden by the :ghc-flag:`-package-env ⟨file⟩|⟨name⟩` flag if set. In order, ``ghc`` will look for the package environment in the following locations: @@ -588,11 +594,11 @@ locations: - File ``$HOME/.ghc/arch-os-version/environments/name`` if you pass the option ``-package-env ⟨name⟩``. -- File ⟨file⟩ if the environment variable ``GHC_ENVIRONMENT`` is set to +- File ⟨file⟩ if the environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨file⟩. - File ``$HOME/.ghc/arch-os-version/environments/name`` if the - environment variable ``GHC_ENVIRONMENT`` is set to ⟨name⟩. + environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨name⟩. Additionally, unless ``-hide-all-packages`` is specified ``ghc`` will also look for the package environment in the following locations: ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -86,6 +86,12 @@ you can simply do: ./configure --prefix= [... other configure options ...] make install +In order to support @bin@ and @lib@ directories that don't sit next to each +other, the install script: + * installs programs into @LIBDIR/ghc-VERSION/bin@ + * installs libraries into @LIBDIR/ghc-VERSION/lib@ + * installs the wrappers scripts into @BINDIR@ directory + -} bindistRules :: Rules () @@ -268,6 +274,7 @@ bindistMakefile = unlines , "install: install_mingw update_package_db" , "" , "ActualBinsDir=${ghclibdir}/bin" + , "ActualLibsDir=${ghclibdir}/lib" , "WrapperBinsDir=${bindir}" , "" , "# We need to install binaries relative to libraries." @@ -288,10 +295,10 @@ bindistMakefile = unlines , "" , "LIBRARIES = $(wildcard ./lib/*)" , "install_lib:" - , "\t at echo \"Copying libraries to $(libdir)\"" - , "\t$(INSTALL_DIR) \"$(libdir)\"" + , "\t at echo \"Copying libraries to $(ActualLibsDir)\"" + , "\t$(INSTALL_DIR) \"$(ActualLibsDir)\"" , "\tfor i in $(LIBRARIES); do \\" - , "\t\tcp -R $$i \"$(libdir)/\"; \\" + , "\t\tcp -R $$i \"$(ActualLibsDir)/\"; \\" , "\tdone" , "" , "INCLUDES = $(wildcard ./include/*)" @@ -317,9 +324,9 @@ bindistMakefile = unlines , "\t$(foreach p, $(BINARY_NAMES),\\" , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ - "$(libdir),$(docdir),$(includedir)))" + "$(ActualLibsDir),$(docdir),$(includedir)))" , "" - , "PKG_CONFS = $(wildcard $(libdir)/package.conf.d/*)" + , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" , "\t at echo \"Updating the package DB\"" , "\t$(foreach p, $(PKG_CONFS),\\" ===================================== testsuite/driver/testlib.py ===================================== @@ -1131,9 +1131,9 @@ def do_compile(name, way, should_fail, top_mod, extra_mods, extra_hc_opts, **kwa # no problems found, this test passed return passed() -def compile_cmp_asm( name, way, extra_hc_opts ): +def compile_cmp_asm( name, way, ext, extra_hc_opts ): print('Compile only, extra args = ', extra_hc_opts) - result = simple_build(name + '.cmm', way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) + result = simple_build(name + '.' + ext, way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) if badResult(result): return result @@ -1153,6 +1153,24 @@ def compile_cmp_asm( name, way, extra_hc_opts ): # no problems found, this test passed return passed() +def compile_grep_asm( name, way, ext, is_substring, extra_hc_opts ): + print('Compile only, extra args = ', extra_hc_opts) + result = simple_build(name + '.' + ext, way, '-keep-s-files -O ' + extra_hc_opts, 0, '', 0, 0) + + if badResult(result): + return result + + expected_pat_file = find_expected_file(name, 'asm') + actual_asm_file = add_suffix(name, 's') + + if not grep_output(join_normalisers(normalise_errmsg), + expected_pat_file, actual_asm_file, + is_substring): + return failBecause('asm mismatch') + + # no problems found, this test passed + return passed() + # ----------------------------------------------------------------------------- # Compile-and-run tests @@ -1735,6 +1753,43 @@ def compare_outputs(way, kind, normaliser, expected_file, actual_file, diff_file else: return False +# Checks that each line from pattern_file is present in actual_file as +# a substring or regex pattern depending on is_substring. +def grep_output(normaliser, pattern_file, actual_file, is_substring=True): + expected_path = in_srcdir(pattern_file) + actual_path = in_testdir(actual_file) + + expected_patterns = read_no_crs(expected_path).strip().split('\n') + actual_raw = read_no_crs(actual_path) + actual_str = normaliser(actual_raw) + + success = True + failed_patterns = [] + + def regex_match(pat, actual): + return re.search(pat, actual) is not None + + def substring_match(pat, actual): + return pat in actual + + def is_match(pat, actual): + if is_substring: + return substring_match(pat, actual) + else: + return regex_match(pat, actual) + + for pat in expected_patterns: + if not is_match(pat, actual_str): + success = False + failed_patterns.append(pat) + + if not success: + print('Actual output does not contain the following patterns:') + for pat in failed_patterns: + print(pat) + + return success + # Note [Output comparison] # # We do two types of output comparison: ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -3,7 +3,8 @@ is_amd64_codegen = [ when(unregisterised(), skip), ] -test('memcpy', is_amd64_codegen, compile_cmp_asm, ['']) -test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['']) -test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['']) -test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['']) +test('memcpy', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) +test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.asm ===================================== @@ -0,0 +1,6 @@ +movq $72340172838076673,%rcx +movq %rcx,0(%rbx) +movq %rcx,8(%rbx) +movl $16843009,16(%rbx) +movw $257,20(%rbx) +movb $1,22(%rbx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memset-unroll.hs ===================================== @@ -0,0 +1,17 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module FillArray + ( fill + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +fill :: IO ByteArray +fill = IO $ \s0 -> case newByteArray# 24# s0 of + (# s1, m #) -> case setByteArray# m 0# 23# 1# s1 of + s2 -> case unsafeFreezeByteArray# m s2 of + (# s3, r #) -> (# s3, ByteArray r #) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/96eae4e70dc870abd0c56a49d0ff4e63aee186c1...9acdc4c0ea14f890045e973dabcb5ad3bb029505 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/96eae4e70dc870abd0c56a49d0ff4e63aee186c1...9acdc4c0ea14f890045e973dabcb5ad3bb029505 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 17:02:41 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 13:02:41 -0400 Subject: [Git][ghc/ghc][master] gitlab: Bump cabal-install version used by Windows builds to 2.4 Message-ID: <5cacd031d44c6_62b33fa2ef457eec2572620@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - 2 changed files: - .gitlab-ci.yml - .gitlab/win32-init.sh Changes: ===================================== .gitlab-ci.yml ===================================== @@ -4,6 +4,10 @@ variables: # Commit of ghc/ci-images repository from which to pull Docker images DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + # Sequential version number capturing the versions of all tools fetched by + # .gitlab/win32-init.sh. + WINDOWS_TOOLCHAIN_VERSION: 1 + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive @@ -525,7 +529,7 @@ validate-x86_64-windows-hadrian: variables: MSYSTEM: MINGW64 cache: - key: x86_64-windows-hadrian + key: "x86_64-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows-hadrian: extends: .build-windows-hadrian @@ -535,7 +539,7 @@ nightly-i386-windows-hadrian: variables: - $NIGHTLY cache: - key: i386-windows-hadrian + key: "i386-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" .build-windows-make: extends: .build-windows @@ -571,7 +575,7 @@ validate-x86_64-windows: MSYSTEM: MINGW64 CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" cache: - key: x86_64-windows + key: "x86_64-windows-$WINDOWS_TOOLCHAIN_VERSION" # Normal Windows validate builds are profiled; that won't do for releases. release-x86_64-windows: @@ -592,7 +596,7 @@ release-i386-windows: BUILD_FLAVOUR: "perf" CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows: extends: .build-windows-make @@ -603,7 +607,7 @@ nightly-i386-windows: MSYSTEM: MINGW32 CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" ############################################################ # Cleanup ===================================== .gitlab/win32-init.sh ===================================== @@ -27,7 +27,8 @@ if [ ! -e $toolchain/bin/ghc ]; then fi if [ ! -e $toolchain/bin/cabal ]; then - curl https://www.haskell.org/cabal/release/cabal-install-2.2.0.0/cabal-install-2.2.0.0-i386-unknown-mingw32.zip > /tmp/cabal.zip + url="https://downloads.haskell.org/~cabal/cabal-install-latest/cabal-install-2.4.1.0-x86_64-unknown-mingw32.zip" + curl $url > /tmp/cabal.zip unzip /tmp/cabal.zip mv cabal.exe $toolchain/bin fi View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9acdc4c0ea14f890045e973dabcb5ad3bb029505 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9acdc4c0ea14f890045e973dabcb5ad3bb029505 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 17:11:43 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 13:11:43 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/codeowners Message-ID: <5cacd24fb271f_62b33fa2ef294ee825747b8@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/codeowners at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/codeowners You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 19:29:24 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 15:29:24 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] 129 commits: gitlab-ci: Always build fedora27 Message-ID: <5cacf294bff52_62b33fa2ef294ee82604725@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: b8326897 by Ben Gamari at 2019-03-18T03:16:12Z gitlab-ci: Always build fedora27 This ends up being much easier to use than Debian 9 under NixOS. - - - - - acf2129d by Ben Gamari at 2019-03-18T03:17:36Z gitlab-ci: Implement head.hackage jobs - - - - - 71648c35 by Ben Gamari at 2019-03-20T03:04:18Z gitlab-ci: Implement support for i386/Windows bindists - - - - - d94ca74f by Tamar Christina at 2019-03-20T03:10:23Z err: clean up error handler - - - - - 398f2cbc by Ben Gamari at 2019-03-20T03:16:32Z Bump Cabal submodule to 3.0 Metric Increase: haddock.Cabal - - - - - 89a201e8 by Takenobu Tani at 2019-03-20T03:22:36Z users-guide: Update Wiki URLs to point to GitLab The user's guide uses the `ghc-wiki` macro, and substitution rules are complicated. So I manually edited `.rst` files without sed. I changed `Commentary/Latedmd` only to a different page. It is more appropriate as an example. [ci skip] - - - - - 98ff1a56 by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Replace nOfThem by replicate - - - - - 6a47414f by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Fix typos - - - - - 1e26e60d by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Simplify monadic code - - - - - c045bd7c by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Remove deprecated reinitializeGlobals - - - - - 6d19ad72 by Ben Gamari at 2019-03-20T03:34:49Z gitlab-ci: Bump docker images To install lndir and un-break the source distribution job. - - - - - c7a84a60 by Matthew Pickering at 2019-03-20T03:34:50Z Update .gitlab-ci.yml - - - - - db136237 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T16219 and cabal09 as broken on Windows See #16386. - - - - - 7cd8e330 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Fix expected output on Windows for various ghci tests Broke as -Wimplicit-kind-vars no longer exists. Specifically ghci024, ghci057 and T9293. - - - - - 23b639fd by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T5836 as broken on Windows See #16387. - - - - - a1bda08d by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T15904 as broken on Windows It seems to look for some sort of manifest file. See #16388. - - - - - b7f5d552 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T16190 as broken on Windows There seems to be some filepath funniness due to TH embedding going on here. See #16389. - - - - - a0c31f78 by Ben Gamari at 2019-03-20T22:41:32Z testsuite/plugins: Add multi_cpu_race modifier on Windows A few tests previously failed with various failure modes. For instance, `plugin-recomp-change` fails with: ``` Wrong exit code for plugin-recomp-change()(expected 0 , actual 2 ) Stderr ( plugin-recomp-change ): Simple Plugin Passes Queried Got options: Simple Plugin Pass Run C://GitLabRunner//builds//8fc0e283//0//ghc//ghc//inplace//mingw//bin/ld.exe: cannot find -lHSplugin-recompilation-0.1-CPeObcGoBuvHdwBnpK9jQq collect2.exe: error: ld returned 1 exit status `gcc.exe' failed in phase `Linker'. (Exit code: 1) make[2]: *** [Makefile:112: plugin-recomp-change] Error 1 *** unexpected failure for plugin-recomp-change(normal) ``` It's unclear whether the ghc-pkg concurrency issue mentioned in all.T is the culprit but the set of tests that fail overlaps strongly with the set of tests that lack the `multi_cpu_race` modifier. Let's see if adding it fixes them. - - - - - 88a6e9a4 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T10672 as broken This test, which is only run on Windows, seems to be reliably timing out. See #16390. - - - - - f4d3aaaf by Ben Gamari at 2019-03-20T22:41:32Z testsuite/plugins: Increase compile timeout on Windows I think the linker is routinely eating through the timeout, leading to many spurious failures. - - - - - ae382245 by Ben Gamari at 2019-03-20T22:41:32Z rts/RtsSymbols: Drop __mingw_vsnwprintf As described in #16387, this is already defined by mingw and consequently defining it in the RTS as well leads to multiple definition errors from the RTS linker at runtime. - - - - - f79f93e4 by Ben Gamari at 2019-03-20T22:41:32Z Don't mark cabal09 as broken It doesn't fail reliably. - - - - - d98cb763 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Don't mark T5836 as broken I believe removing __mingw_vsnwprintf from RtsSymbols fixed #16387. - - - - - 8c1a2743 by Ben Gamari at 2019-03-20T22:41:32Z Try again - - - - - 3394a7cd by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Display observed exit code on failure due to bad exit code - - - - - 36818759 by Artem Pyanykh at 2019-03-20T23:52:39Z Adjust section placement and relocation logic for Mach-O 1. Place each section on a separate page to ensure required alignment (wastes lots ot space, needs to be improved). 2. Unwire relocation logic from macho sections (the most fiddly part is adjusting internal relocations). Other todos: 0. Add a test for section alignment. 1. Investigate 32bit relocations! 2. Fix memory leak in ZEROPAGE section allocation. 3. Fix creating redundant jump islands for GOT. 4. Investigate more compact section placement. - - - - - 78c61acf by Artem Pyanykh at 2019-03-20T23:52:39Z Use segments for section layout - - - - - 7bbfb789 by Artem Pyanykh at 2019-03-20T23:52:39Z Address some todos and fixmes - - - - - 3cdcc0b5 by Artem Pyanykh at 2019-03-20T23:52:39Z Add a linker test re: section alignment - - - - - cb745c84 by Artem Pyanykh at 2019-03-20T23:52:39Z Add missing levels to SegmentProt enum - - - - - d950f11e by Artem Pyanykh at 2019-03-20T23:52:39Z Directly test section alignment, fix internal reloc probing length - - - - - 3fb10fcf by Artem Pyanykh at 2019-03-20T23:52:39Z Gracefully handle error condition in Mach-O relocateSection - - - - - dc713c71 by Ben Gamari at 2019-03-20T23:58:49Z ci: Move validate-x86_64-linux-deb9 to full-build stage The `build` stage is meant to be a minimal smoke test to weed out broken commits. The `validate-x86_64-linux-deb9` build will generally catch a subset of issues caught by `validate-x86_64-linux-deb9-debug` so only the latter should be in `build`. - - - - - 505c5ab2 by Ben Gamari at 2019-03-20T23:58:49Z ci: Add some descriptions of the stages - - - - - 646e3dc2 by Sebastian Graf at 2019-03-21T00:04:49Z Add a bench flavour to Hadrian - - - - - 8d18a873 by Ryan Scott at 2019-03-21T00:10:57Z Reject nested predicates in impredicativity checking When GHC attempts to unify a metavariable with a type containing foralls, it will be rejected as an occurrence of impredicativity. GHC was /not/ extending the same treatment to predicate types, such as in the following (erroneous) example from #11514: ```haskell foo :: forall a. (Show a => a -> a) -> () foo = undefined ``` This will attempt to instantiate `undefined` at `(Show a => a -> a) -> ()`, which is impredicative. This patch catches impredicativity arising from predicates in this fashion. Since GHC is pickier about impredicative instantiations, some test cases needed to be updated to be updated so as not to fall afoul of the new validity check. (There were a surprising number of impredicative uses of `undefined`!) Moreover, the `T14828` test case now has slightly less informative types shown with `:print`. This is due to a a much deeper issue with the GHCi debugger (see #14828). Fixes #11514. - - - - - 7b213b8d by Ömer Sinan Ağacan at 2019-03-21T00:17:05Z Print test suite results ("unexpected failures" etc.) in sorted order Fixes #16425 - - - - - f199a843 by Simon Jakobi at 2019-03-21T00:23:15Z Check.hs: Fix a few typos - - - - - 07d44ed1 by Ben Gamari at 2019-03-21T00:29:20Z base: Depend upon shlwapi on Windows As noted in #16466, `System.Environment.getExecutablePath` depends upon `PathFileExistsW` which is defined by `shlwapi`. Fixes #16466. - - - - - 1382d09e by Ryan Scott at 2019-03-21T00:35:28Z Remove unused XArrApp and XArrForm extension points !301 removed the `HsArrApp` and `HsArrForm` constructors, which renders the corresponding extension points `XArrApp` and `XArrForm` useless. This patch finally rips them out. - - - - - 3423664b by Peter Trommler at 2019-03-21T00:41:35Z Fix specification of load_load_barrier [skip-ci] - - - - - 84c77a67 by Alexandre Esteves at 2019-03-21T21:43:03Z Fix typo [skip ci] - - - - - 7092b2de by Matthew Pickering at 2019-03-22T03:38:58Z Only run check-makefiles.py linter in testsuite dir - - - - - 322239de by Matthew Pickering at 2019-03-22T03:38:58Z Run linters on merge requests It seems that it has failed to execute at all since it was implemented. We now run the linters on merge requests. - - - - - 8f8d532c by Ben Gamari at 2019-03-22T03:45:03Z gitlab-ci: Do full `perf` build when building Windows releases - - - - - 2ef72d3f by Ben Gamari at 2019-03-22T03:45:03Z gitlab-ci: Pass --target explicitly to configure on Windows Otherwise configure fails in the 32-bit case with ``` This GHC (c:/GitLabRunner/builds/8fc0e283/0/ghc/ghc/toolchain/bin/ghc) does not generate code for the build platform GHC target platform : x86_64-unknown-mingw32 Desired build platform : i386-unknown-mingw32 ``` - - - - - 8b14f536 by Ben Gamari at 2019-03-22T03:51:08Z Bump cabal submodule Due to https://github.com/haskell/cabal/issues/5953. - - - - - dbe4557f by Matthew Pickering at 2019-03-22T14:02:32Z CI: Allow failure in packaging step This depends on the windows build which is still allowed to fail. If that job fails then the packaging job will also fail. - - - - - 366f1c68 by Ben Gamari at 2019-03-22T14:08:38Z gitlab: Deploy documentation snapshot via GitLab Pages - - - - - d608d543 by Tamar Christina at 2019-03-22T14:14:45Z Force LF line ending for md5sum [skip-ci] - - - - - cd07086a by Ben Gamari at 2019-03-22T14:34:51Z gitlab-ci: Fix linters - - - - - ab51bee4 by Herbert Valerio Riedel at 2019-03-22T14:34:51Z base: Remove `Monad(fail)` method and reexport `MonadFail(fail)` instead As per https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail Coauthored-by: Ben Gamari <ben at well-typed.com> - - - - - 266b49ca by Ben Gamari at 2019-03-22T22:33:20Z gitlab-ci: Clean up linter I'm not sure why these steps were done but they seem counterproductive and unnecessary. - - - - - 44b08ede by Ben Gamari at 2019-03-22T22:38:11Z gitlab-ci: Fix YAML syntax - - - - - 971f4530 by Ben Gamari at 2019-03-22T22:49:34Z gitlab-ci: Compute merge base against remote tracking branch Previously we would use the local branch with the name `$CI_MERGE_REQUEST_TARGET_BRANCH_NAME` to compute the merge base when linting. However, this branch isn't necessarily up-to-date. We should rather use `origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME`. - - - - - 8d01b572 by Ben Gamari at 2019-03-23T16:37:56Z gitlab-ci: Explicitly fetch target branch `git fetch`, which we used previously, doesn't update the remote tracking branches. - - - - - cd85f8a7 by Ben Gamari at 2019-03-24T12:46:13Z gitlab-ci: Allow linters to fail for now They are broken and I don't have time to fix them at the moment. - - - - - d763b2e7 by Haskell-mouse at 2019-03-25T18:02:22Z User's Guide: extensions compatibility Adds the mention that extensions "AllowAmbiguousTypes" and "RankNTypes" are not always compatible with each other. Specifies the conditions and causes of failing in resolving of ambiguity. - - - - - 200d65ef by Matthew Pickering at 2019-03-25T18:02:25Z Check hadrian/ghci.sh script output to determine pass/fail ghci always exits with exit code 0 so you have to check the output to see if the modules loaded succesfully. - - - - - 8e07368f by Matthew Pickering at 2019-03-25T18:02:27Z Refactor ./hadrian/ghci.sh for better error messages By separating these two lines, if the first command fails then `ghci` is not loaded. Before it would still load ghci but display lots of errors about not being able to find modules. - - - - - 3769e3a8 by Takenobu Tani at 2019-03-25T18:02:29Z Update Wiki URLs to point to GitLab This moves all URL references to Trac Wiki to their corresponding GitLab counterparts. This substitution is classified as follows: 1. Automated substitution using sed with Ben's mapping rule [1] Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy... New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy... 2. Manual substitution for URLs containing `#` index Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz 3. Manual substitution for strings starting with `Commentary` Old: Commentary/XxxYyy... New: commentary/xxx-yyy... See also !539 [1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json - - - - - b9da2868 by Ryan Scott at 2019-03-25T18:02:33Z Correct duplicate 4.12.0.0 entry in base's changelog See #16490. [ci skip] - - - - - ab41c1b4 by Andrey Mokhov at 2019-03-27T11:20:03Z Hadrian: Bump Shake to 0.17.6 The new release of Shake comes with these relevant features: * use symlinks for --shared * add --compact for a Bazel/Buck style output - - - - - 646f2e79 by Andrey Mokhov at 2019-03-27T11:20:03Z Hadrian: trace the execution of expensive Cabal calls We use Cabal to parse, configure, register and copy packages, which are expensive operations that are currently not visible to Shake's profiling infrastructure. By using `traced` we tell Shake to add these IO actions to the profiling report, helping us to identify performance bottlenecks. We use short tracing keys, as recommended in Shake docs: the name of the current target is already available in the rest of the profiling information. - - - - - fb12f53c by Alp Mestanogullari at 2019-03-27T11:20:05Z Hadrian: introduce an easy way for users to build with -split-sections Any user can now trivially build any number of Haskell packages with `-split-sections` by using `splitSections`/`splitSectionsIf` on any existing or new flavour: -- build all packages but the ghc library with -split-sections splitSections :: Flavour -> Flavour -- build all packages that satisfy the given predicate -- with --split-sections splitSectionsIf :: (Package -> Bool) -> Flavour -> Flavour See the new section in `doc/user-settings.md`. - - - - - 3dec527a by David Eichmann at 2019-03-27T11:20:09Z Hadrian: don't use -zorigin on darwin. - - - - - 5730f863 by Ömer Sinan Ağacan at 2019-03-27T11:20:10Z Minor refactoring in copy array primops: - `emitCopySmallArray` now checks size before generating code and doesn't generate any code when size is 0. `emitCopyArray` already does this so this makes small/large array cases the same in argument checking. - In both `emitCopySmallArray` and `emitCopyArray` read the `dflags` after checking the argument. - - - - - 4acdb769 by Chaitanya Koparkar at 2019-03-27T11:20:11Z Fix a few broken Trac links [skip ci] This patch only attempts to fix links that don't automatically re-direct to the correct URL. - - - - - 97ad5cfb by Artem Pelenitsyn at 2019-03-29T18:18:12Z Add some tips to the Troubleshooting section of README - - - - - 8a20bfc2 by Michael Peyton Jones at 2019-03-29T18:18:14Z Visibility: handle multiple units with the same name Fixes #16228. The included test case is adapted from the reproduction in the issue, and fails without this patch. ------ We compute an initial visilibity mapping for units based on what is present in the package databases. To seed this, we compute a set of all the package configs to add visibilities for. However, this set was keyed off the unit's *package name*. This is correct, since we compare packages across databases by version. However, we would only ever consider a single, most-preferable unit from the database in which it was found. The effect of this was that only one of the libraries in a Cabal package would be added to this initial set. This would cause attempts to use modules from the omitted libraries to fail, claiming that the package was hidden (even though `ghc-pkg` would correctly show it as visible). A solution is to do the selection of the most preferable packages separately, and then be sure to consider exposing all units in the same package in the same package db. We can do this by picking a most-preferable unit for each package name, and then considering exposing all units that are equi-preferable with that unit. ------ Why wasn't this bug apparent to all people trying to use sub-libraries in Cabal? The answer is that Cabal explicitly passes `-package` and `-package-id` flags for all the packages it wants to use, rather than relying on the state of the package database. So this bug only really affects people who are trying to use package databases produced by Cabal outside of Cabal itself. One particular example of this is the way that the Nixpkgs Haskell infrastructure provides wrapped GHCs: typically these are equipped with a package database containing all the needed package dependencies, and the user is not expected to pass `-package` flags explicitly. - - - - - 754b5455 by Artem Pelenitsyn at 2019-03-29T18:18:20Z docs: make nfib compute the Fibonacci sequence [skipci] - - - - - 1a567133 by Ben Gamari at 2019-03-29T18:18:20Z ci: Check that changelogs don't contain "TBA" This ensures that the release dates in the library changelogs are properly set. - - - - - 6e15ca54 by Ben Gamari at 2019-03-29T18:18:22Z Bump transformers to 0.5.6.2 See #16199. - - - - - 6f7115df by Ben Gamari at 2019-03-30T11:42:38Z ci: Ensure index.html is preserved in documentation tarball - - - - - 33173a51 by Alexandre at 2019-04-01T07:32:28Z Add support for bitreverse primop This commit includes the necessary changes in code and documentation to support a primop that reverses a word's bits. It also includes a test. - - - - - a3971b4e by Alexandre at 2019-04-01T07:32:28Z Bump ghc-prim's version where needed - - - - - 061276ea by Michael Sloan at 2019-04-01T07:32:30Z Remove unnecessary uses of UnboxedTuples pragma (see #13101 / #15454) Also removes a couple unnecessary MagicHash pragmas - - - - - e468c613 by David Eichmann at 2019-04-01T07:32:34Z Support Shake's --lint-fsatrace feature. Using this feature requires fsatrace (e.g. https://github.com/jacereda/fsatrace). Simply use the `--lint-fsatrace` option when running hadrian. Shake version >= 0.17.7 is required to support linting out of tree build dirs. - - - - - 1e9e4197 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add merge request template for backports for 8.8 - - - - - 55650d14 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add some simply issue templates - - - - - 27b99ed8 by Takenobu Tani at 2019-04-01T07:32:36Z Clean up URLs to point to GitLab This moves URL references to old Trac to their corresponding GitLab counterparts. This patch does not update the submodule library, such as libraries/Cabal. See also !539, !606, !618 [ci skip] - - - - - 18d1555d by Adam Sandberg Eriksson at 2019-04-01T07:32:38Z configure: document the use of the LD variable - - - - - 10352efa by Ben Gamari at 2019-04-01T22:22:34Z gitlab: Add feature request MR template - - - - - 1e52054b by Ben Gamari at 2019-04-01T23:16:21Z gitlab: Move feature request template to issue_templates Whoops. - - - - - e5c21ca9 by Ben Gamari at 2019-04-01T23:16:25Z gitlab: Mention ~"user facing" label - - - - - 39282422 by Ryan Scott at 2019-04-02T00:01:38Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - 7cf5ba3d by Michal Terepeta at 2019-04-02T00:07:49Z Improve performance of newSmallArray# This: - Hoists part of the condition outside of the initialization loop in `stg_newSmallArrayzh`. - Annotates one of the unlikely branches as unlikely, also in `stg_newSmallArrayzh`. - Adds a couple of annotations to `allocateMightFail` indicating which branches are likely to be taken. Together this gives about 5% improvement. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - dd9c82ef by David Eichmann at 2019-04-02T00:13:55Z Hadrian: correct deps for ghc builder. Previously, when needing ghc as a builder, the ghcDeps (Files the GHC binary depends on) for the current stage were needed. This is incorrect as the previous stage's ghc is used for building. This commit fixes the issue, needing the previous stage's ghcDeps. - - - - - 345306d3 by Alexandre Baldé at 2019-04-02T16:34:30Z Fix formatting issue in ghc-prim's changelog [skip ci] - - - - - f54b5124 by David Eichmann at 2019-04-02T16:40:39Z Hadrian: traceAllow deep dependencies when compilling haskell object files. - - - - - d132b30a by David Eichmann at 2019-04-02T16:40:39Z Hadrian: lint ignore autom4te and ghc-pkg cache files. - - - - - bf734195 by Simon Marlow at 2019-04-02T16:46:46Z Add myself to libraries/ghci - - - - - 5a75ccd0 by klebinger.andreas at gmx.at at 2019-04-03T04:34:57Z Fix faulty substitutions in StgCse (#11532). `substBndr` should rename bindings which shadow existing ids. However while it was renaming the bindings it was not adding proper substitutions for renamed bindings. Instead of adding a substitution of the form `old -> new` for renamed bindings it mistakenly added `old -> old` if no replacement had taken place while adding none if `old` had been renamed. As a byproduct this should improve performance, as we no longer add useless substitutions for unshadowed bindings. - - - - - 2ec749b5 by Nathan Collins at 2019-04-03T04:41:05Z users-guide: Fix typo - - - - - ea192a09 by Andrew Martin at 2019-04-03T04:41:05Z base: Add documentation that liftA2 used to not be a typeclass method - - - - - 733f1b52 by Frank Steffahn at 2019-04-03T04:41:05Z users-guide: Typo in Users Guide, Glasgow Exts - - - - - 3364def0 by Ben Gamari at 2019-04-03T04:41:05Z integer-gmp: Write friendlier documentation for Integer - - - - - dd3a3d08 by Ben Gamari at 2019-04-03T04:41:05Z integer-simple: Add documentation for Integer type - - - - - 722fdddf by Chris Martin at 2019-04-03T04:41:05Z Correct two misspellings of "separately" - - - - - bf6dbe3d by Chris Martin at 2019-04-03T04:41:05Z Inline the definition of 'ap' in the Monad laws The law as it is currently written is meaningless, because nowhere have we defined the implementation of 'ap'. The reader of the Control.Monad documentation is provided with only a type signature, > ap :: Monad m => m (a -> b) -> m a -> m b an informal description, > In many situations, the liftM operations can be replaced by uses of > ap, which promotes function application. and a relationship between 'ap' and the 'liftM' functions > return f `ap` x1 `ap` ... `ap` xn > is equivalent to > liftMn f x1 x2 ... xn Without knowing how 'ap' is defined, a law involving 'ap' cannot provide any guidance for how to write a lawful Monad instance, nor can we conclude anything from the law. I suspect that a reader equipped with the understanding that 'ap' was defined prior to the invention of the Applicative class could deduce that 'ap' must be defined in terms of (>>=), but nowhere as far as I can tell have we written this down explicitly for readers without the benefit of historical context. If the law is meant to express a relationship among (<*>), (>>=), and 'return', it seems that it is better off making this statement directly, sidestepping 'ap' altogether. - - - - - 7b090b53 by Ben Gamari at 2019-04-03T07:57:40Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. - - - - - 4626cf21 by Sebastian Graf at 2019-04-03T08:03:47Z Fix Uncovered set of literal patterns Issues #16289 and #15713 are proof that the pattern match checker did an unsound job of estimating the value set abstraction corresponding to the uncovered set. The reason is that the fix from #11303 introducing `NLit` was incomplete: The `LitCon` case desugared to `Var` rather than `LitVar`, which would have done the necessary case splitting analogous to the `ConVar` case. This patch rectifies that by introducing the fresh unification variable in `LitCon` in value abstraction position rather than pattern postition, recording a constraint equating it to the constructor expression rather than the literal. Fixes #16289 and #15713. - - - - - 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fb648b5c by Carter Tazio Schonwald at 2019-04-09T19:26:07Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 30 changed files: - .circleci/config.yml - .ghcid - + .gitattributes - .gitignore - .gitlab-ci.yml - + .gitlab/issue_templates/bug.md - + .gitlab/issue_templates/feature_request.md - .gitlab/linters/check-makefiles.py - + .gitlab/merge_request_templates/backport-for-8.8.md - .gitlab/merge_request_templates/merge-request.md - + .gitlab/start-head.hackage.sh - .gitlab/win32-init.sh - .mailmap - ANNOUNCE - CODEOWNERS - HACKING.md - INSTALL.md - MAKEHELP.md - Makefile - README.md - aclocal.m4 - bindisttest/Makefile - bindisttest/ghc.mk - boot - compiler/Makefile - compiler/basicTypes/BasicTypes.hs - compiler/basicTypes/MkId.hs - compiler/basicTypes/Module.hs - compiler/basicTypes/RdrName.hs - compiler/basicTypes/Unique.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9d78aeaafa962119b2d26253103bae7023232bbf...fb648b5ccfa16c12f903fdc623f358a4f535d0e0 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9d78aeaafa962119b2d26253103bae7023232bbf...fb648b5ccfa16c12f903fdc623f358a4f535d0e0 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 19:56:04 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 15:56:04 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cacf8d4b5d25_62b33fa29229ef042612899@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: f4eed071 by Carter Tazio Schonwald at 2019-04-09T19:55:46Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -95,12 +94,16 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) w = typeWidth ty gcp | isGcPtrType ty = VGcPtr | otherwise = VNonGcPtr + -- basically this now means handle args one way on x86_64 and x86 + -- and do the registers differently on the other platforms + -- THIS should be cleanuped. passFloatInXmm = passFloatArgsInXmm dflags passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - _ -> False + ArchX86 -> False + _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't -- support vector registers in its calling convention. However, this has now ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,8 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1729,6 +1729,9 @@ vecElemProjectCast _ _ _ = Nothing -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,19 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u + instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +108,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +117,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,13 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this is + | X87Store Format AddrMode -- src(fpreg), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +394,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +544,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +668,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +687,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +741,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +759,9 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because float and double + -- use the same register class..on x86_64 _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +891,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,9 +25,10 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + FF64 -> VirtualRegD u _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc @@ -37,11 +38,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +51,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/f4eed0712c80e1a64c5f50a9ae03a4abff26cddc -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/f4eed0712c80e1a64c5f50a9ae03a4abff26cddc You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 20:20:22 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Tue, 09 Apr 2019 16:20:22 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] Allow unregisterised build to fail Message-ID: <5cacfe8626d4c_62b33fa2ccc0a81026192ae@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: 8f5f791b by Ben Gamari at 2019-04-09T20:20:08Z Allow unregisterised build to fail - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -388,6 +388,7 @@ nightly-x86_64-linux-deb9-integer-simple: validate-x86_64-linux-deb9-unreg: extends: .build-x86_64-linux-deb9 stage: full-build + allow_failure: true variables: CONFIGURE_ARGS: --enable-unregisterised TEST_ENV: "x86_64-linux-deb9-unreg" View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/8f5f791b79f33e0aebffadd0d824c7ae9ff5bc6e -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/8f5f791b79f33e0aebffadd0d824c7ae9ff5bc6e You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 20:27:13 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 16:27:13 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad00211a541_62b33fa2eb506e14262028a@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 714a02e3 by Carter Tazio Schonwald at 2019-04-09T20:26:53Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -95,12 +94,16 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) w = typeWidth ty gcp | isGcPtrType ty = VGcPtr | otherwise = VNonGcPtr + -- basically this now means handle args one way on x86_64 and x86 + -- and do the registers differently on the other platforms + -- THIS should be cleanuped. passFloatInXmm = passFloatArgsInXmm dflags passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - _ -> False + ArchX86 -> False + _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't -- support vector registers in its calling convention. However, this has now ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,8 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1729,6 +1729,22 @@ vecElemProjectCast _ _ _ = Nothing -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities. +-- Phrased differently, its worth experimenting/exploring supporting +-- other register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC. + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,19 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u + instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +108,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +117,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,13 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this is + | X87Store Format AddrMode -- src(fpreg), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +394,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +544,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +668,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +687,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +741,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +759,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +892,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/714a02e3358176ab1ef14a2288f50bf891083a37 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/714a02e3358176ab1ef14a2288f50bf891083a37 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 20:35:00 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 16:35:00 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad01f4373a3_62b34bdd54026234d7@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 6b109d01 by Carter Tazio Schonwald at 2019-04-09T20:34:20Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -95,12 +94,16 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) w = typeWidth ty gcp | isGcPtrType ty = VGcPtr | otherwise = VNonGcPtr + -- basically this now means handle args one way on x86_64 and x86 + -- and do the registers differently on the other platforms + -- THIS should be cleanuped. passFloatInXmm = passFloatArgsInXmm dflags passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - _ -> False + ArchX86 -> False + _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't -- support vector registers in its calling convention. However, this has now ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,27 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities. +-- Phrased differently, its worth experimenting/exploring supporting +-- other register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,19 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u + instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +108,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +117,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- src(fpreg), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +893,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/6b109d018cff2418fa2bfcb555b27329d9689729 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/6b109d018cff2418fa2bfcb555b27329d9689729 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 20:38:37 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 16:38:37 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad02cd8216d_62b33fa2bd1455d8262728b@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: ea0a7096 by Carter Tazio Schonwald at 2019-04-09T20:38:12Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,27 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities. +-- Phrased differently, its worth experimenting/exploring supporting +-- other register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,19 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u + instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +108,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +117,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- src(fpreg), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +893,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/ea0a7096d6e878da135fa6a1bdb2cb1dbda84a25 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/ea0a7096d6e878da135fa6a1bdb2cb1dbda84a25 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 20:40:44 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 16:40:44 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad034c782b6_62b33fa2ef294ee8262985c@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 1010e1aa by Carter Tazio Schonwald at 2019-04-09T20:40:30Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,27 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities. +-- Phrased differently, its worth experimenting/exploring supporting +-- other register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,19 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u + instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +108,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +117,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +893,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1010e1aa34b1ce1dc2c06a0294388834cf63c267 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1010e1aa34b1ce1dc2c06a0294388834cf63c267 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 20:42:15 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 16:42:15 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad03a7719f2_62b33fa2ccc0a81026357fe@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 94cf402b by Carter Tazio Schonwald at 2019-04-09T20:41:59Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,27 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities. +-- Phrased differently, its worth experimenting/exploring supporting +-- other register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +893,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/94cf402bd647e4240c2c8b1766f6e80d29f51c45 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/94cf402bd647e4240c2c8b1766f6e80d29f51c45 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 21:03:09 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 17:03:09 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad088ddc444_62b3e6102c026400ea@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: c6654de1 by Carter Tazio Schonwald at 2019-04-09T21:02:54Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,31 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -975,52 +893,8 @@ i386_insert_ffrees -> [GenBasicBlock Instr] i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False + -- ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprGInstr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/c6654de1ff2e22c24307471c2cbe75294d9da0f0 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/c6654de1ff2e22c24307471c2cbe75294d9da0f0 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 21:14:42 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 17:14:42 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad0b42adf65_62b33fa2bd1455d82646017@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: ca21949a by Carter Tazio Schonwald at 2019-04-09T21:14:23Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 29 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/Cabal - libraries/array - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc - libraries/transformers Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,31 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -10,7 +10,7 @@ module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..), getJumpDestBlockId, canShortcut, shortcutStatics, - shortcutJump, i386_insert_ffrees, allocMoreStack, + shortcutJump, allocMoreStack, maxSpillSlots, archWordFormat ) where @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -970,58 +888,6 @@ x86_mkStackDeallocInstr platform amount ArchX86_64 -> [ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)] _ -> panic "x86_mkStackDeallocInstr" -i386_insert_ffrees - :: [GenBasicBlock Instr] - -> [GenBasicBlock Instr] - -i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise - = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False - -- -- Note [extra spill slots] ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprX87Instr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/Cabal ===================================== @@ -1 +1 @@ -Subproject commit 5d258537b754005d2a1d170b44d764b63ff4fc75 +Subproject commit fd51946bbb3850165de5f7b394fa987d1f4bd28e ===================================== libraries/array ===================================== @@ -1 +1 @@ -Subproject commit 58a7ea0336363b29513164487190f6570b8ea834 +Subproject commit 8593a10f65020da3854b1c8478082d454b416118 ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 ===================================== libraries/transformers ===================================== @@ -1 +1 @@ -Subproject commit def8c55d0c47c1c40de985d83f052f3659b40cfd +Subproject commit 49655191d33912815a9389b764e2d89e92140938 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/ca21949aea82ef18539f7357b639c3277316529a -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/ca21949aea82ef18539f7357b639c3277316529a You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 21:34:12 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 17:34:12 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad0fd4239af_62b33fa2bd1455d826579fb@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: d406ac90 by Carter Tazio Schonwald at 2019-04-09T21:33:54Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 26 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs - libraries/hpc Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,31 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -10,7 +10,7 @@ module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..), getJumpDestBlockId, canShortcut, shortcutStatics, - shortcutJump, i386_insert_ffrees, allocMoreStack, + shortcutJump, allocMoreStack, maxSpillSlots, archWordFormat ) where @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -970,58 +888,6 @@ x86_mkStackDeallocInstr platform amount ArchX86_64 -> [ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)] _ -> panic "x86_mkStackDeallocInstr" -i386_insert_ffrees - :: [GenBasicBlock Instr] - -> [GenBasicBlock Instr] - -i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise - = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False - -- -- Note [extra spill slots] ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprX87Instr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 +Subproject commit 75f898badf40cddba7b3bcf149648e49095a52f9 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d406ac90af582edeb0278be8a02883a7630a0317 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d406ac90af582edeb0278be8a02883a7630a0317 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 21:40:47 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 17:40:47 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad115f2d8f4_62b33fa29229ef042659387@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 40515694 by Carter Tazio Schonwald at 2019-04-09T21:40:30Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 25 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,31 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -10,7 +10,7 @@ module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..), getJumpDestBlockId, canShortcut, shortcutStatics, - shortcutJump, i386_insert_ffrees, allocMoreStack, + shortcutJump, allocMoreStack, maxSpillSlots, archWordFormat ) where @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -970,58 +888,6 @@ x86_mkStackDeallocInstr platform amount ArchX86_64 -> [ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)] _ -> panic "x86_mkStackDeallocInstr" -i386_insert_ffrees - :: [GenBasicBlock Instr] - -> [GenBasicBlock Instr] - -i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise - = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False - -- -- Note [extra spill slots] ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprX87Instr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/40515694bf04b0dcc870495561745684ba59f894 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/40515694bf04b0dcc870495561745684ba59f894 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Tue Apr 9 22:04:58 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Tue, 09 Apr 2019 18:04:58 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cad170afc98_62b33fa2ef457eec266823@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 402feb2f by Carter Tazio Schonwald at 2019-04-09T22:04:36Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 25 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,38 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. +-- In some next generation CPU ISAs, notably RISC V, the SIMD extension +-- includes support for a sort of run time CPU dependent vectorization parameter, +-- where a loop may act upon a single scalar each iteration OR some 2,4,8 ... +-- element chunk! Time will tell if that direction sees wide adoption, +-- but it is from that context that unifying our handling of simd and scalars +-- may benefit. It is not likely to benefit current architectures, though +-- it may very well be a design perspective that helps guide improving the NCG. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -10,7 +10,7 @@ module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..), getJumpDestBlockId, canShortcut, shortcutStatics, - shortcutJump, i386_insert_ffrees, allocMoreStack, + shortcutJump, allocMoreStack, maxSpillSlots, archWordFormat ) where @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -970,58 +888,6 @@ x86_mkStackDeallocInstr platform amount ArchX86_64 -> [ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)] _ -> panic "x86_mkStackDeallocInstr" -i386_insert_ffrees - :: [GenBasicBlock Instr] - -> [GenBasicBlock Instr] - -i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise - = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False - -- -- Note [extra spill slots] ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprX87Instr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/402feb2fb40932003c09076f43a831ae7027fa46 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/402feb2fb40932003c09076f43a831ae7027fa46 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 03:07:38 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 23:07:38 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 4 commits: Hadrian: fix library install paths in bindist Makefile (#16498) Message-ID: <5cad5dfa41249_62b33fa2ccc0a810269351c@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - a240f145 by Joachim Breitner at 2019-04-10T03:07:30Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - 0f532329 by Ryan Scott at 2019-04-10T03:07:32Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 10 changed files: - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - ghc/ghc-bin.cabal.in - hadrian/src/Rules/BinaryDist.hs - libraries/base/base.cabal - libraries/ghc-compact/ghc-compact.cabal - libraries/ghc-heap/ghc-heap.cabal.in - libraries/integer-gmp/integer-gmp.cabal Changes: ===================================== .gitlab-ci.yml ===================================== @@ -4,6 +4,10 @@ variables: # Commit of ghc/ci-images repository from which to pull Docker images DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + # Sequential version number capturing the versions of all tools fetched by + # .gitlab/win32-init.sh. + WINDOWS_TOOLCHAIN_VERSION: 1 + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive @@ -525,7 +529,7 @@ validate-x86_64-windows-hadrian: variables: MSYSTEM: MINGW64 cache: - key: x86_64-windows-hadrian + key: "x86_64-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows-hadrian: extends: .build-windows-hadrian @@ -535,7 +539,7 @@ nightly-i386-windows-hadrian: variables: - $NIGHTLY cache: - key: i386-windows-hadrian + key: "i386-windows-hadrian-$WINDOWS_TOOLCHAIN_VERSION" .build-windows-make: extends: .build-windows @@ -571,7 +575,7 @@ validate-x86_64-windows: MSYSTEM: MINGW64 CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" cache: - key: x86_64-windows + key: "x86_64-windows-$WINDOWS_TOOLCHAIN_VERSION" # Normal Windows validate builds are profiled; that won't do for releases. release-x86_64-windows: @@ -592,7 +596,7 @@ release-i386-windows: BUILD_FLAVOUR: "perf" CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" nightly-i386-windows: extends: .build-windows-make @@ -603,7 +607,7 @@ nightly-i386-windows: MSYSTEM: MINGW32 CONFIGURE_ARGS: "--target=i386-unknown-mingw32" cache: - key: i386-windows + key: "i386-windows-$WINDOWS_TOOLCHAIN_VERSION" ############################################################ # Cleanup ===================================== .gitlab/win32-init.sh ===================================== @@ -27,7 +27,8 @@ if [ ! -e $toolchain/bin/ghc ]; then fi if [ ! -e $toolchain/bin/cabal ]; then - curl https://www.haskell.org/cabal/release/cabal-install-2.2.0.0/cabal-install-2.2.0.0-i386-unknown-mingw32.zip > /tmp/cabal.zip + url="https://downloads.haskell.org/~cabal/cabal-install-latest/cabal-install-2.4.1.0-x86_64-unknown-mingw32.zip" + curl $url > /tmp/cabal.zip unzip /tmp/cabal.zip mv cabal.exe $toolchain/bin fi ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,7 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, + tablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" ===================================== ghc/ghc-bin.cabal.in ===================================== @@ -58,7 +58,7 @@ Executable ghc Build-depends: containers >= 0.5 && < 0.7, deepseq == 1.4.*, - ghc-prim >= 0.5.0 && <= 0.6.1, + ghc-prim >= 0.5.0 && < 0.7, ghci == @ProjectVersionMunged@, haskeline == 0.7.*, time >= 1.8 && < 1.10, ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -86,6 +86,12 @@ you can simply do: ./configure --prefix= [... other configure options ...] make install +In order to support @bin@ and @lib@ directories that don't sit next to each +other, the install script: + * installs programs into @LIBDIR/ghc-VERSION/bin@ + * installs libraries into @LIBDIR/ghc-VERSION/lib@ + * installs the wrappers scripts into @BINDIR@ directory + -} bindistRules :: Rules () @@ -268,6 +274,7 @@ bindistMakefile = unlines , "install: install_mingw update_package_db" , "" , "ActualBinsDir=${ghclibdir}/bin" + , "ActualLibsDir=${ghclibdir}/lib" , "WrapperBinsDir=${bindir}" , "" , "# We need to install binaries relative to libraries." @@ -288,10 +295,10 @@ bindistMakefile = unlines , "" , "LIBRARIES = $(wildcard ./lib/*)" , "install_lib:" - , "\t at echo \"Copying libraries to $(libdir)\"" - , "\t$(INSTALL_DIR) \"$(libdir)\"" + , "\t at echo \"Copying libraries to $(ActualLibsDir)\"" + , "\t$(INSTALL_DIR) \"$(ActualLibsDir)\"" , "\tfor i in $(LIBRARIES); do \\" - , "\t\tcp -R $$i \"$(libdir)/\"; \\" + , "\t\tcp -R $$i \"$(ActualLibsDir)/\"; \\" , "\tdone" , "" , "INCLUDES = $(wildcard ./include/*)" @@ -317,9 +324,9 @@ bindistMakefile = unlines , "\t$(foreach p, $(BINARY_NAMES),\\" , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ - "$(libdir),$(docdir),$(includedir)))" + "$(ActualLibsDir),$(docdir),$(includedir)))" , "" - , "PKG_CONFS = $(wildcard $(libdir)/package.conf.d/*)" + , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" , "\t at echo \"Updating the package DB\"" , "\t$(foreach p, $(PKG_CONFS),\\" ===================================== libraries/base/base.cabal ===================================== @@ -95,7 +95,7 @@ Library UnliftedFFITypes Unsafe - build-depends: rts == 1.0, ghc-prim >= 0.5.1.0 && <= 0.6.1 + build-depends: rts == 1.0, ghc-prim >= 0.5.1.0 && < 0.7 -- sanity-check to ensure exactly one flag is set if !((flag(integer-gmp) && !flag(integer-simple)) || (!flag(integer-gmp) && flag(integer-simple))) ===================================== libraries/ghc-compact/ghc-compact.cabal ===================================== @@ -36,7 +36,7 @@ library UnboxedTuples CPP - build-depends: ghc-prim >= 0.5.3 && <= 0.6.1, + build-depends: ghc-prim >= 0.5.3 && < 0.7, base >= 4.9.0 && < 4.14, bytestring >= 0.10.6.0 ghc-options: -Wall ===================================== libraries/ghc-heap/ghc-heap.cabal.in ===================================== @@ -23,7 +23,7 @@ library default-language: Haskell2010 build-depends: base >= 4.9.0 && < 5.0 - , ghc-prim > 0.2 && <= 0.6.1 + , ghc-prim > 0.2 && < 0.7 , rts == 1.0.* ghc-options: -Wall ===================================== libraries/integer-gmp/integer-gmp.cabal ===================================== @@ -58,7 +58,7 @@ library StandaloneDeriving UnboxedTuples UnliftedFFITypes - build-depends: ghc-prim >= 0.5.1.0 && <= 0.6.1 + build-depends: ghc-prim >= 0.5.1.0 && < 0.7 hs-source-dirs: src/ -- We need to set the unit ID to integer-wired-in -- (without a version number) as it's magic. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8e81d0ca39594f1c9ab1fb00ac80ef6a2cd85cb9...0f53232902b383977541ad6ae0622934329219a7 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8e81d0ca39594f1c9ab1fb00ac80ef6a2cd85cb9...0f53232902b383977541ad6ae0622934329219a7 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 03:23:46 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 23:23:46 -0400 Subject: [Git][ghc/ghc][master] GHC no longer ever defines TABLES_NEXT_TO_CODE on its own Message-ID: <5cad61c2d2ad0_62b33fa2eedc3d242702813@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - 2 changed files: - compiler/main/DynFlags.hs - compiler/main/SysTools.hs Changes: ===================================== compiler/main/DynFlags.hs ===================================== @@ -58,7 +58,7 @@ module DynFlags ( fFlags, fLangFlags, xFlags, wWarningFlags, dynFlagDependencies, - tablesNextToCode, mkTablesNextToCode, + tablesNextToCode, makeDynFlagsConsistent, shouldUseColor, shouldUseHexWordLiterals, ===================================== compiler/main/SysTools.hs ===================================== @@ -199,15 +199,9 @@ initSysTools top_dir let unreg_gcc_args = if targetUnregisterised then ["-DNO_REGS", "-DUSE_MINIINTERPRETER"] else [] - -- TABLES_NEXT_TO_CODE affects the info table layout. - tntc_gcc_args - | mkTablesNextToCode targetUnregisterised - = ["-DTABLES_NEXT_TO_CODE"] - | otherwise = [] cpp_args= map Option (words cpp_args_str) gcc_args = map Option (words gcc_args_str - ++ unreg_gcc_args - ++ tntc_gcc_args) + ++ unreg_gcc_args) ldSupportsCompactUnwind <- getBooleanSetting "ld supports compact unwind" ldSupportsBuildId <- getBooleanSetting "ld supports build-id" ldSupportsFilelist <- getBooleanSetting "ld supports filelist" View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/fc3f421bd63cbf550cd0c8771aaf11e9c362f4d1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/fc3f421bd63cbf550cd0c8771aaf11e9c362f4d1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 03:29:54 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Tue, 09 Apr 2019 23:29:54 -0400 Subject: [Git][ghc/ghc][master] Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Message-ID: <5cad6332d594b_62b33fa2ccc0a810270503a@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 5 changed files: - ghc/ghc-bin.cabal.in - libraries/base/base.cabal - libraries/ghc-compact/ghc-compact.cabal - libraries/ghc-heap/ghc-heap.cabal.in - libraries/integer-gmp/integer-gmp.cabal Changes: ===================================== ghc/ghc-bin.cabal.in ===================================== @@ -58,7 +58,7 @@ Executable ghc Build-depends: containers >= 0.5 && < 0.7, deepseq == 1.4.*, - ghc-prim >= 0.5.0 && <= 0.6.1, + ghc-prim >= 0.5.0 && < 0.7, ghci == @ProjectVersionMunged@, haskeline == 0.7.*, time >= 1.8 && < 1.10, ===================================== libraries/base/base.cabal ===================================== @@ -95,7 +95,7 @@ Library UnliftedFFITypes Unsafe - build-depends: rts == 1.0, ghc-prim >= 0.5.1.0 && <= 0.6.1 + build-depends: rts == 1.0, ghc-prim >= 0.5.1.0 && < 0.7 -- sanity-check to ensure exactly one flag is set if !((flag(integer-gmp) && !flag(integer-simple)) || (!flag(integer-gmp) && flag(integer-simple))) ===================================== libraries/ghc-compact/ghc-compact.cabal ===================================== @@ -36,7 +36,7 @@ library UnboxedTuples CPP - build-depends: ghc-prim >= 0.5.3 && <= 0.6.1, + build-depends: ghc-prim >= 0.5.3 && < 0.7, base >= 4.9.0 && < 4.14, bytestring >= 0.10.6.0 ghc-options: -Wall ===================================== libraries/ghc-heap/ghc-heap.cabal.in ===================================== @@ -23,7 +23,7 @@ library default-language: Haskell2010 build-depends: base >= 4.9.0 && < 5.0 - , ghc-prim > 0.2 && <= 0.6.1 + , ghc-prim > 0.2 && < 0.7 , rts == 1.0.* ghc-options: -Wall ===================================== libraries/integer-gmp/integer-gmp.cabal ===================================== @@ -58,7 +58,7 @@ library StandaloneDeriving UnboxedTuples UnliftedFFITypes - build-depends: ghc-prim >= 0.5.1.0 && <= 0.6.1 + build-depends: ghc-prim >= 0.5.1.0 && < 0.7 hs-source-dirs: src/ -- We need to set the unit ID to integer-wired-in -- (without a version number) as it's magic. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/be0dde8e3c27ca56477d1d1801bb77621f3618e1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/be0dde8e3c27ca56477d1d1801bb77621f3618e1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 03:47:47 2019 From: gitlab at gitlab.haskell.org (John Ericson) Date: Tue, 09 Apr 2019 23:47:47 -0400 Subject: [Git][ghc/ghc][wip/D5082] 24 commits: testsuite: Add testcase for #16111 Message-ID: <5cad6763b2e3d_62b33fa2ef457eec27070e6@gitlab.haskell.org.mail> John Ericson pushed to branch wip/D5082 at Glasgow Haskell Compiler / GHC Commits: 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - ca475136 by Joachim Breitner at 2019-04-10T03:42:48Z Make tablesNextToCode a proper dynamic flag (#15548) Summary: There is no more use of the TABLES_NEXT_TO_CODE CPP macro in `compiler/`. GHCI_TABLES_NEXT_TO_CODE is also removed entirely. The default value of `tablesNextToCode` is calculated as before, but now users of the GHCI API can modify this flag. That said, GHC still is hardcoded to define TABLES_NEXT_TO_CODE based on that default value. This is bad, but neccessary until the remaining uses of TABLES_NEXT_TO_CODE get it from make/Hadrian. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 Differential Revision: https://phabricator.haskell.org/D5082 - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/ghc.mk - compiler/ghci/ByteCodeItbls.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Ppr.hs - compiler/simplCore/CoreMonad.hs - compiler/specialise/Specialise.hs - compiler/typecheck/TcTyClsDecls.hs - compiler/utils/Util.hs - docs/users_guide/8.10.1-notes.rst - docs/users_guide/packages.rst - docs/users_guide/phases.rst - docs/users_guide/separate_compilation.rst - ghc/ghc-bin.cabal.in - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Packages.hs - libraries/base/base.cabal The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/4a4467c3ffcbc931731044453df6a9f2db53406f...ca4751366b074e52ff8d06298593f9193d9f7080 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/4a4467c3ffcbc931731044453df6a9f2db53406f...ca4751366b074e52ff8d06298593f9193d9f7080 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 07:46:33 2019 From: gitlab at gitlab.haskell.org (=?UTF-8?B?w5ZtZXIgU2luYW4gQcSfYWNhbg==?=) Date: Wed, 10 Apr 2019 03:46:33 -0400 Subject: [Git][ghc/ghc][wip/gc/aligned-block-allocation] 208 commits: Be more careful when naming TyCon binders Message-ID: <5cad9f59a42de_62b33fa2ef294ee82723938@gitlab.haskell.org.mail> Ömer Sinan Ağacan pushed to branch wip/gc/aligned-block-allocation at Glasgow Haskell Compiler / GHC Commits: 80dfcee6 by Simon Peyton Jones at 2019-03-05T08:09:41Z Be more careful when naming TyCon binders This patch fixes two rather gnarly test cases: * Trac #16342 (mutual recursion) See Note [Tricky scoping in generaliseTcTyCon] * Trac #16221 (shadowing) See Note [Unification variables need fresh Names] The main changes are: * Substantial reworking of TcTyClsDecls.generaliseTcTyCon This is the big change, and involves the rather tricky function TcHsSyn.zonkRecTyVarBndrs. See Note [Inferring kinds for type declarations] and Note [Tricky scoping in generaliseTcTyCon] for the details. * bindExplicitTKBndrs_Tv and bindImplicitTKBndrs_Tv both now allocate /freshly-named/ unification variables. Indeed, more generally, unification variables are always fresh; see Note [Unification variables need fresh Names] in TcMType * Clarify the role of tcTyConScopedTyVars. See Note [Scoped tyvars in a TcTyCon] in TyCon As usual, this dragged in some more refactoring: * Renamed TcMType.zonkTyCoVarBndr to zonkAndSkolemise * I renamed checkValidTelescope to checkTyConTelescope; it's only used on TyCons, and indeed takes a TyCon as argument. * I folded the slightly-mysterious reportFloatingKvs into checkTyConTelescope. (Previously all its calls immediately followed a call to checkTyConTelescope.) It makes much more sense there. * I inlined some called-once functions to simplify checkValidTyFamEqn. It's less spaghetti-like now. * This patch also fixes Trac #16251. I'm not quite sure why #16251 went wrong in the first place, nor how this patch fixes it, but hey, it's good, and life is short. - - - - - 6c4e45b0 by David Eichmann at 2019-03-05T08:15:47Z Test Runner: don't show missing baseline warning for performance tests with expected changes on the current commit. Trac #16359 - - - - - 646b6dfb by Krzysztof Gogolewski at 2019-03-05T08:21:53Z Fix map/coerce rule for newtypes with wrappers This addresses Trac #16208 by marking newtype wrapper unfoldings as compulsory. Furthermore, we can remove the special case for newtypes in exprIsConApp_maybe (introduced in 7833cf407d1f). - - - - - 37f257af by Ben Gamari at 2019-03-06T03:22:40Z Rip out object splitting The splitter is an evil Perl script that processes assembler code. Its job can be done better by the linker's --gc-sections flag. GHC passes this flag to the linker whenever -split-sections is passed on the command line. This is based on @DemiMarie's D2768. Fixes Trac #11315 Fixes Trac #9832 Fixes Trac #8964 Fixes Trac #8685 Fixes Trac #8629 - - - - - 23342e1f by Ömer Sinan Ağacan at 2019-03-06T03:28:45Z rts/Printer: Introduce a few more printing utilities These include printLargeAndPinnedObjects, printWeakLists, and printStaticObjects. These are generally useful things to have. - - - - - c19a401d by Ömer Sinan Ağacan at 2019-03-06T03:28:45Z rts/Printer: Print forwarding pointers - - - - - db039a4a by Ryan Scott at 2019-03-06T03:40:54Z Add regression test for #15918 The test case in #15918 no longer triggers an `ASSERT` failure on GHC HEAD, likely due to commit 682783828275cca5fd8bf5be5b52054c75e0e22c (`Make a smart mkAppTyM`). This patch adds a regression test for #15918 to finally put it to rest. - - - - - 2ff77b98 by P.C. Shyamshankar at 2019-03-06T14:17:22Z Handle absolute paths to build roots in Hadrian. Fixes #16187. This patch fixes various path concatenation issues to allow functioning builds with hadrian when the build root location is specified with an absolute path. Remarks: - The path concatenation operator (-/-) now handles absolute second operands appropriately. Its behavior should match that of POSIX (</>) in this regard. - The `getDirectoryFiles*` family of functions only searches for matches under the directory tree rooted by its first argument; all of the results are also relative to this root. If the first argument is the empty string, the current working directory is used. This patch passes the appropriate directory (almost always either `top` or `root`), and subsequently attaches that directory prefix so that the paths refer to the appropriate files. - Windows `tar` does not like colons (':') in paths to archive files, it tries to resolve them as remote paths. The `--force-local` option remedies this, and is applied on windows builds. - - - - - 5aab1d9c by Ömer Sinan Ağacan at 2019-03-06T20:53:32Z rts: Unglobalize dead_weak_ptr_list and resurrected_threads In the concurrent nonmoving collector we will need the ability to call `traverseWeakPtrList` concurrently with minor generation collections. This global state stands in the way of this. However, refactoring it away is straightforward since this list only persists the length of a single GC. - - - - - a4944d8d by Ben Gamari at 2019-03-06T20:53:32Z Fix it - - - - - 78dd04f9 by Ryan Scott at 2019-03-06T21:05:45Z Fix #16385 by appending _maybe to a use of lookupGlobalOcc `instance forall c. c` claimed that `c` was out of scope because the renamer was invoking `lookupGlobalOcc` on `c` (in `RnNames.getLocalNonValBinders`) without binding `c` first. To avoid this, this patch changes GHC to invoke `lookupGlobalOcc_maybe` on `c` instead, and if that returns `Nothing`, then bail out, resulting in a better error message. - - - - - 3caeb443 by Zejun Wu at 2019-03-06T21:11:52Z Move reifyGHCi function into GhciMonad type class This was the suggested change in !176 but missed the batch merge (!263). - - - - - 4ca271d1 by Ben Gamari at 2019-03-07T02:48:10Z testsuite: Introduce fragile modifier Now since we have been a bit more stringent in testsuite cleanliness we have been marking a lot of tests as fragile using the `skip` modifier. However, this unfortunately means that we lose the association with the ticket number documenting the fragility. Here we introduce `fragile` and `fragile_for` to retain this information. - - - - - 910185a3 by Ben Gamari at 2019-03-07T02:48:10Z testsuite: Mark heapprof001 as fragile on i386 - - - - - a65bcbe7 by Ben Gamari at 2019-03-07T02:48:10Z testsuite: Use fragile modifier for more tests - - - - - f624dc15 by Ben Gamari at 2019-03-07T02:48:10Z gitlab-ci: Don't allow i386-deb9 to fail Also account for testsuite metric drift. Metric Increase: haddock.Cabal haddock.base T14683 - - - - - 07f378ce by Simon Peyton Jones at 2019-03-07T02:54:17Z Add tests for Trac #16221 and #16342 - - - - - 25c3dd39 by Simon Peyton Jones at 2019-03-07T02:54:17Z Test Trac #16263 - - - - - 7a68254a by Phuong Trinh at 2019-03-07T19:01:42Z Fix #16392: revertCAFs in external interpreter when necessary We revert CAFs when loading/adding modules in ghci (presumably to refresh execution states and to allow for object code to be unloaded from the runtime). However, with `-fexternal-interpreter` enabled, we are only doing it in the ghci process instead of the external interpreter process where the cafs are allocated and computed. This makes sure that revertCAFs is done in the appropriate process no matter if that flag is present or not. - - - - - 068b7e98 by Ryan Scott at 2019-03-07T19:07:49Z Fix #16391 by using occCheckExpand in TcValidity The type-variables-escaping-their-scope-via-kinds check in `TcValidity` was failing to properly expand type synonyms, which led to #16391. This is easily fixed by using `occCheckExpand` before performing the validity check. Along the way, I refactored this check out into its own function, and sprinkled references to Notes to better explain all of the moving parts. Many thanks to @simonpj for the suggestions. Bumps the haddock submodule. - - - - - 1675d40a by Sebastian Graf at 2019-03-08T01:44:08Z Always do the worker/wrapper split for NOINLINEs Trac #10069 revealed that small NOINLINE functions didn't get split into worker and wrapper. This was due to `certainlyWillInline` saying that any unfoldings with a guidance of `UnfWhen` inline unconditionally. That isn't the case for NOINLINE functions, so we catch this case earlier now. Nofib results: -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- fannkuch-redux -0.3% 0.0% gg +0.0% +0.1% maillist -0.2% -0.2% minimax 0.0% -0.8% -------------------------------------------------------------------------------- Min -0.3% -0.8% Max +0.0% +0.1% Geometric Mean -0.0% -0.0% Fixes #10069. ------------------------- Metric Increase: T9233 ------------------------- - - - - - 48927a9a by Alp Mestanogullari at 2019-03-08T10:50:26Z Hadrian: various improvements around the 'test' rule - introduce a -k/--keep-test-files flag to prevent cleanup - add -dstg-lint to the options that are always passed to tests - infer library ways from the compiler to be tested instead of getting them from the flavour (like make) - likewise for figuring out whether the compiler to be tested is "debugged" - specify config.exeext - correctly specify config.in_tree_compiler, instead of always passing True - fix formatting of how we pass a few test options - add (potential) extensions to check-* program names - build check-* programs with the compiler to be tested - set TEST_HC_OPTS_INTERACTIVE and PYTHON env vars when running tests - - - - - 5d744143 by Andrey Mokhov at 2019-03-08T10:56:32Z Hadrian: Drop remaining symlink traversal code from build scripts This partly resolves #16325 (https://ghc.haskell.org/trac/ghc/ticket/16325). As previously discussed in https://github.com/snowleopard/hadrian/issues/667, we do not need the symlink traversal code in build scripts. However, it appears we forgot to delete this code from our Stack-based build scripts, which led to placing all build artefacts in an unexpected location when using Hadrian in combination with symlink trees. This commit fixes this. - - - - - 82628254 by Vladislav Zavialov at 2019-03-08T11:02:37Z Testsuite: use 'fragile' instead of 'skip' for T3424, T14697 Also, replace some tabs with spaces to avoid a "mixed indent" warning that vim gives me. - - - - - 5be7ad78 by Simon Peyton Jones at 2019-03-08T11:08:41Z Use captureTopConstraints in TcRnDriver calls Trac #16376 showed the danger of failing to report an error that exists only in the unsolved constraints, if an exception is raised (via failM). Well, the commit 5c1f268e (Fail fast in solveLocalEqualities) did just that -- i.e. it found errors in the constraints, and called failM to avoid a misleading cascade. So we need to be sure to call captureTopConstraints to report those insolubles. This was wrong in TcRnDriver.tcRnExpr and in TcRnDriver.tcRnType. As a result the error messages from test T13466 improved slightly, a happy outcome. - - - - - 224a6b86 by Sylvain Henry at 2019-03-08T19:05:10Z TH: support raw bytes literals (#14741) GHC represents String literals as ByteString internally for efficiency reasons. However, until now it wasn't possible to efficiently create large string literals with TH (e.g. to embed a file in a binary, cf #14741): TH code had to unpack the bytes into a [Word8] that GHC then had to re-pack into a ByteString. This patch adds the possibility to efficiently create a "string" literal from raw bytes. We get the following compile times for different sizes of TH created literals: || Size || Before || After || Gain || || 30K || 2.307s || 2.299 || 0% || || 3M || 3.073s || 2.400s || 21% || || 30M || 8.517s || 3.390s || 60% || Ticket #14741 can be fixed if the original code uses this new TH feature. - - - - - 2762f94d by Roland Senn at 2019-03-08T19:11:19Z Fix #13839: GHCi warnings do not respect the default module header - - - - - 1f5cc9dc by Simon Peyton Jones at 2019-03-09T07:07:53Z Stop inferring over-polymorphic kinds Before this patch GHC was trying to be too clever (Trac #16344); it succeeded in kind-checking this polymorphic-recursive declaration data T ka (a::ka) b = MkT (T Type Int Bool) (T (Type -> Type) Maybe Bool) As Note [No polymorphic recursion] discusses, the "solution" was horribly fragile. So this patch deletes the key lines in TcHsType, and a wodge of supporting stuff in the renamer. There were two regressions, both the same: a closed type family decl like this (T12785b) does not have a CUSK: type family Payload (n :: Peano) (s :: HTree n x) where Payload Z (Point a) = a Payload (S n) (a `Branch` stru) = a To kind-check the equations we need a dependent kind for Payload, and we don't get that any more. Solution: make it a CUSK by giving the result kind -- probably a good thing anyway. The other case (T12442) was very similar: a close type family declaration without a CUSK. - - - - - cfbedf17 by Niklas Hambüchen at 2019-03-09T07:14:13Z compiler: Write .o files atomically. See #14533 This issue was reproduced with, and the fix confirmed with, the `hatrace` tool for syscall-based fault injection: https://github.com/nh2/hatrace The concrete test case for GHC is at https://github.com/nh2/hatrace/blob/e23d35a2d2c79e8bf49e9e2266b3ff7094267f29/test/HatraceSpec.hs#L185 A previous, nondeterministic reproducer for the issue was provided by Alexey Kuleshevich in https://github.com/lehins/exec-kill-loop Signed-off-by: Niklas Hambüchen <niklas at fpcomplete.com> Reviewed-by: Alexey Kuleshevich <alexey at fpcomplete.com> - - - - - 08ad38a9 by Niklas Hambüchen at 2019-03-09T07:14:13Z compiler: Refactor: extract `withAtomicRename` - - - - - e76ee675 by Ben Gamari at 2019-03-09T12:30:17Z rts: Factor out large bitmap walking This will be needed by the mark phase of the non-moving collector so let's factor it out. - - - - - 6e3e537e by Edward Z. Yang at 2019-03-09T12:36:26Z Make bkpcabal01 test compatible with new ordering requirements. Previously, our test did something like this: 1. Typecheck p 2. Typecheck q (which made use of an instantiated p) 3. Build instantiated p 4. Build instantiated q Cabal previously permitted this, under the reasoning that during typechecking there's no harm in using the instantiated p even if we haven't build it yet; we'll just instantiate it on the fly with p. However, this is not true! If q makes use of a Template Haskell splice from p, we absolutely must have built the instantiated p before we typecheck q, since this typechecking will need to run some splices. Cabal now complains that you haven't done it correctly, which we indeed have not! Reordering so that we do this: 1. Typecheck p 3. Build instantiated p 2. Typecheck q (which made use of an instantiated p) 4. Build instantiated q Fixes the problem. If Cabal had managed the ordering itself, it would have gotten it right. Signed-off-by: Edward Z. Yang <ezyang at fb.com> - - - - - 6b2f0991 by Sylvain Henry at 2019-03-09T12:42:34Z NCG: correctly escape path strings on Windows (#16389) GHC native code generator generates .incbin and .file directives. We need to escape those strings correctly on Windows (see #16389). - - - - - b760269c by Ben Gamari at 2019-03-09T12:48:38Z Rip out perl dependency The object splitter was the last major user of perl. There remain a few uses in nofib but we can just rely on the system's perl for this since it's not critical to the build. - - - - - 0cd98957 by Ben Gamari at 2019-03-09T12:48:38Z Drop utils/count_lines This doesn't appear to be used anywhere in the build system and it relies on perl. Drop it. - - - - - bcb6769c by Alec Theriault at 2019-03-11T22:11:59Z Ignore more version numbers in the testsuite Prevents some tests from failing just due to mismatched version numbers. These version numbers shouldn't cause tests to fail, especially since we *expect* them to be regularly incremented. The motivation for this particular set of changes came from the changes that came along with the `base` version bump in 8f19ecc95fbaf2cc977531d721085d8441dc09b7. - - - - - 60b03ade by Krzysztof Gogolewski at 2019-03-11T22:18:06Z Change the warning in substTy back to an assertion We'd like to enforce the substitution invariant (Trac #11371). In a492af06d326453 the assertion was downgraded to a warning; I'm restoring the assertion and making the calls that don't maintain the invariant as unchecked. - - - - - 2f453414 by Krzysztof Gogolewski at 2019-03-11T22:18:06Z Add a test for Trac #13951 It no longer gives a warning. - - - - - b2322310 by Matthew Pickering at 2019-03-12T13:04:52Z Hadrian: Allow passing CABFLAGS into build.cabal.sh Setting `CABFLAGS=args` will pass the additional arguments to cabal when it is invoked. - - - - - 61264556 by Matthew Pickering at 2019-03-12T13:04:52Z Hadrian: Make libsuf and distDir stage aware The version suffix needs to be the version of the stage 0 compiler when building shared libraries with the stage 0 compiler. - - - - - 705fa21d by Matthew Pickering at 2019-03-12T13:04:52Z Hadrian: Make makeRelativeNoSysLink total makeRelativeNoSysLink would previously crash for no reason if the first argument as `./` due to the call to `head`. This refactoring keeps the behaviour the same but doesn't crash in this corner case. - - - - - 4cf2160a by Matthew Pickering at 2019-03-12T13:04:52Z Hadrian: Fix rpath so shared objects work after being copied After being copied all the shared objects end up in the same directory. Therefore the correct rpath is `$ORIGIN` rather than the computed path which is relative to the directory where it is built. - - - - - 2d7dd028 by Matthew Pickering at 2019-03-12T13:04:52Z Hadrian: Add ./hadrian/ghci.sh script for fast development feedback Running the `./hadrian/ghci` target will load the main compiler into a ghci session. This is intended for fast development feedback, modules are only typechecked so it isn't possible to run any functions in the repl. You can also use this target with `ghcid`. The first time this command is run hadrian will need to compile a few dependencies which will take 1-2 minutes. Loading GHC into GHCi itself takes about 30 seconds. Internally this works by calling a new hadrian target called `tool-args`. This target prints out the package and include flags which are necessary to load files into ghci. The same target is intended to be used by other tooling which uses the GHC API in order to set up the correct GHC API session. For example, using this target it is also possible to use HIE when developing on GHC. - - - - - bb684e65 by Matthew Pickering at 2019-03-12T13:04:52Z Remove training whitespace - - - - - 72c455a4 by Matthew Pickering at 2019-03-12T13:04:52Z CI: Add ghc-in-ghci build job This is a separate build job to the other hadrian jobs as it only takes about 2-3 minutes to run from cold. The CI tests that the `./hadrian/ghci` script loads `ghc/Main.hs` successfully. - - - - - 5165378d by Matthew Pickering at 2019-03-12T13:04:52Z Remove trailing whitespace - - - - - 50249a9f by Simon Peyton Jones at 2019-03-12T13:13:28Z Use transSuperClasses in TcErrors Code in TcErrors was recursively using immSuperClasses, which loops in the presence of UndecidableSuperClasses. Better to use transSuperClasses instead, which has a loop-breaker mechanism built in. Fixes issue #16414. - - - - - 62db9295 by Ömer Sinan Ağacan at 2019-03-12T13:19:29Z Remove duplicate functions in StgCmmUtils, use functions from CgUtils Also remove unused arg from get_Regtable_addr_from_offset - - - - - 4db9bdd9 by Ryan Scott at 2019-03-12T13:25:39Z Add regression test for #16347 Commit 1f5cc9dc8aeeafa439d6d12c3c4565ada524b926 ended up fixing #16347. Let's add a regression test to ensure that it stays fixed. - - - - - 02ddf947 by Matthew Pickering at 2019-03-12T13:42:53Z CI: Update ci-images commit - - - - - a0cab873 by Matthew Pickering at 2019-03-12T13:44:45Z Revert: Update ci-images commit - - - - - 23fc6156 by Ben Gamari at 2019-03-13T19:03:53Z testsuite: Mark heapprof001 as fragile on all platforms See #15382. - - - - - cb17c2da by Alp Mestanogullari at 2019-03-13T19:10:01Z Hadrian: build (and retrieve) binary distributions in CI With all the recent fixes to the binary-dist rule in Hadrian, we can now run that rule in CI and keep the bindists around in gitlab as artifacts, just like we do for the make CI jobs. To get 'autoreconf' to work in the Windows CI, we have to run it through the shell interpreter, so this commit does that along the way. - - - - - 36546a43 by Ryan Scott at 2019-03-13T19:16:08Z Fix #16411 by making dataConCannotMatch aware of (~~) The `dataConCannotMatch` function (which powers the `-Wpartial-fields` warning, among other things) had special reasoning for explicit equality constraints of the form `a ~ b`, but it did not extend that reasoning to `a ~~ b` constraints, leading to #16411. Easily fixed. - - - - - 10a97120 by Ben Gamari at 2019-03-14T16:20:50Z testsuite: Add testcase for #16394 - - - - - 8162eab2 by Ryan Scott at 2019-03-15T13:59:30Z Remove the GHCi debugger's panicking isUnliftedType check The GHCi debugger has never been that robust in the face of higher-rank types, or even types that are _interally_ higher-rank, such as the types of many class methods (e.g., `fmap`). In GHC 8.2, however, things became even worse, as the debugger would start to _panic_ when a user tries passing the name of a higher-rank thing to `:print`. This all ties back to a strange `isUnliftedType` check in `Debugger` that was mysteriously added 11 years ago (in commit 4d71f5ee6dbbfedb4a55767e4375f4c0aadf70bb) with no explanation whatsoever. After some experimentation, no one is quite sure what this `isUnliftedType` check is actually accomplishing. The test suite still passes if it's removed, and I am unable to observe any differences in debugger before even with data types that _do_ have fields of unlifted types (e.g., `data T = MkT Int#`). Given that this is actively causing problems (see #14828), the prudent thing to do seems to be just removing this `isUnliftedType` check, and waiting to see if anyone shouts about it. This patch accomplishes just that. Note that this patch fix the underlying issues behind #14828, as the debugger will still print unhelpful info if you try this: ``` λ> f :: (forall a. a -> a) -> b -> b; f g x = g x λ> :print f f = (_t1::t1) ``` But fixing this will require much more work, so let's start with the simple stuff for now. - - - - - d10e2368 by David Eichmann at 2019-03-15T14:05:38Z Hadrian: remove unneeded imports. - - - - - 4df75772 by David Eichmann at 2019-03-15T14:05:38Z Hadrian: remove unneeded rpaths. Issue #12770 - - - - - afc80730 by David Eichmann at 2019-03-15T14:11:47Z Git ignore .hadrian_ghci (generated by the ./hadrian/ghci.sh) [skip ci] - - - - - 610ec224 by Ryan Scott at 2019-03-15T14:17:54Z Update Trac ticket URLs to point to GitLab This moves all URL references to Trac tickets to their corresponding GitLab counterparts. - - - - - 97032ed9 by Simon Peyton Jones at 2019-03-15T14:24:01Z Report better suggestion for GADT data constructor This addresses issue #16427. An easy fix. - - - - - 83e09d3c by Peter Trommler at 2019-03-15T14:30:08Z PPC NCG: Use liveness information in CmmCall We make liveness information for global registers available on `JMP` and `BCTR`, which were the last instructions missing. With complete liveness information we do not need to reserve global registers in `freeReg` anymore. Moreover we assign R9 and R10 to callee saves registers. Cleanup by removing `Reg_Su`, which was unused, from `freeReg` and removing unused register definitions. The calculation of the number of floating point registers is too conservative. Just follow X86 and specify the constants directly. Overall on PowerPC this results in 0.3 % smaller code size in nofib while runtime is slightly better in some tests. - - - - - 57201beb by Simon Peyton Jones at 2019-03-15T14:36:14Z Add flavours link - - - - - 4927117c by Simon Peyton Jones at 2019-03-16T12:08:25Z Improve error recovery in the typechecker Issue #16418 showed that we were carrying on too eagerly after a bogus type signature was identified (a bad telescope in fact), leading to a subsequent crash. This led me in to a maze of twisty little passages in the typechecker's error recovery, and I ended up doing some refactoring in TcRnMonad. Some specfifics * TcRnMonad.try_m is now called attemptM. * I switched the order of the result pair in tryTc, to make it consistent with other similar functions. * The actual exception used in the Tc monad is irrelevant so, to avoid polluting type signatures, I made tcTryM, a simple wrapper around tryM, and used it. The more important changes are in * TcSimplify.captureTopConstraints, where we should have been calling simplifyTop rather than reportUnsolved, so that levity defaulting takes place properly. * TcUnify.emitResidualTvConstraint, where we need to set the correct status for a new implication constraint. (Previously we ended up with an Insoluble constraint wrapped in an Unsolved implication, which meant that insolubleWC gave the wrong answer. - - - - - 600a1ac3 by Simon Peyton Jones at 2019-03-16T12:08:25Z Add location to the extra-constraints wildcard The extra-constraints wildcard had lost its location (issue #16431). Happily this is easy to fix. Lots of error improvements. - - - - - 1c1b63d6 by Ben Gamari at 2019-03-16T23:13:36Z compiler: Disable atomic renaming on Windows As discussed in #16450, this feature regresses CI on Windows, causing non-deterministic failures due to missing files. - - - - - 6764da43 by Ben Gamari at 2019-03-16T23:16:56Z gitlab-ci: Explicitly set bindist tarball name - - - - - ad79ccd9 by Ben Gamari at 2019-03-16T23:17:46Z gitlab-ci: Generate documentation tarball - - - - - 3f2291e4 by Ben Gamari at 2019-03-16T23:17:46Z gitlab-ci: Generate source tarballs - - - - - cb61371e by Ben Gamari at 2019-03-17T09:05:10Z ghc-heap: Introduce closureSize This function allows the user to compute the (non-transitive) size of a heap object in words. The "closure" in the name is admittedly confusing but we are stuck with this nomenclature at this point. - - - - - c01d5af3 by Michael Sloan at 2019-03-18T02:23:19Z Extract out use of UnboxedTuples from GHCi.Leak See #13101 + #15454 for motivation. This change reduces the number of modules that need to be compiled to object code when loading GHC into GHCi. - - - - - 6113d0d4 by Radosław Rowicki at 2019-03-18T02:29:25Z Update bug tracker link to point to gitlab instead of deprecated trac - - - - - b8326897 by Ben Gamari at 2019-03-18T03:16:12Z gitlab-ci: Always build fedora27 This ends up being much easier to use than Debian 9 under NixOS. - - - - - acf2129d by Ben Gamari at 2019-03-18T03:17:36Z gitlab-ci: Implement head.hackage jobs - - - - - 71648c35 by Ben Gamari at 2019-03-20T03:04:18Z gitlab-ci: Implement support for i386/Windows bindists - - - - - d94ca74f by Tamar Christina at 2019-03-20T03:10:23Z err: clean up error handler - - - - - 398f2cbc by Ben Gamari at 2019-03-20T03:16:32Z Bump Cabal submodule to 3.0 Metric Increase: haddock.Cabal - - - - - 89a201e8 by Takenobu Tani at 2019-03-20T03:22:36Z users-guide: Update Wiki URLs to point to GitLab The user's guide uses the `ghc-wiki` macro, and substitution rules are complicated. So I manually edited `.rst` files without sed. I changed `Commentary/Latedmd` only to a different page. It is more appropriate as an example. [ci skip] - - - - - 98ff1a56 by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Replace nOfThem by replicate - - - - - 6a47414f by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Fix typos - - - - - 1e26e60d by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Simplify monadic code - - - - - c045bd7c by Krzysztof Gogolewski at 2019-03-20T03:28:42Z Remove deprecated reinitializeGlobals - - - - - 6d19ad72 by Ben Gamari at 2019-03-20T03:34:49Z gitlab-ci: Bump docker images To install lndir and un-break the source distribution job. - - - - - c7a84a60 by Matthew Pickering at 2019-03-20T03:34:50Z Update .gitlab-ci.yml - - - - - db136237 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T16219 and cabal09 as broken on Windows See #16386. - - - - - 7cd8e330 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Fix expected output on Windows for various ghci tests Broke as -Wimplicit-kind-vars no longer exists. Specifically ghci024, ghci057 and T9293. - - - - - 23b639fd by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T5836 as broken on Windows See #16387. - - - - - a1bda08d by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T15904 as broken on Windows It seems to look for some sort of manifest file. See #16388. - - - - - b7f5d552 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T16190 as broken on Windows There seems to be some filepath funniness due to TH embedding going on here. See #16389. - - - - - a0c31f78 by Ben Gamari at 2019-03-20T22:41:32Z testsuite/plugins: Add multi_cpu_race modifier on Windows A few tests previously failed with various failure modes. For instance, `plugin-recomp-change` fails with: ``` Wrong exit code for plugin-recomp-change()(expected 0 , actual 2 ) Stderr ( plugin-recomp-change ): Simple Plugin Passes Queried Got options: Simple Plugin Pass Run C://GitLabRunner//builds//8fc0e283//0//ghc//ghc//inplace//mingw//bin/ld.exe: cannot find -lHSplugin-recompilation-0.1-CPeObcGoBuvHdwBnpK9jQq collect2.exe: error: ld returned 1 exit status `gcc.exe' failed in phase `Linker'. (Exit code: 1) make[2]: *** [Makefile:112: plugin-recomp-change] Error 1 *** unexpected failure for plugin-recomp-change(normal) ``` It's unclear whether the ghc-pkg concurrency issue mentioned in all.T is the culprit but the set of tests that fail overlaps strongly with the set of tests that lack the `multi_cpu_race` modifier. Let's see if adding it fixes them. - - - - - 88a6e9a4 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Mark T10672 as broken This test, which is only run on Windows, seems to be reliably timing out. See #16390. - - - - - f4d3aaaf by Ben Gamari at 2019-03-20T22:41:32Z testsuite/plugins: Increase compile timeout on Windows I think the linker is routinely eating through the timeout, leading to many spurious failures. - - - - - ae382245 by Ben Gamari at 2019-03-20T22:41:32Z rts/RtsSymbols: Drop __mingw_vsnwprintf As described in #16387, this is already defined by mingw and consequently defining it in the RTS as well leads to multiple definition errors from the RTS linker at runtime. - - - - - f79f93e4 by Ben Gamari at 2019-03-20T22:41:32Z Don't mark cabal09 as broken It doesn't fail reliably. - - - - - d98cb763 by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Don't mark T5836 as broken I believe removing __mingw_vsnwprintf from RtsSymbols fixed #16387. - - - - - 8c1a2743 by Ben Gamari at 2019-03-20T22:41:32Z Try again - - - - - 3394a7cd by Ben Gamari at 2019-03-20T22:41:32Z testsuite: Display observed exit code on failure due to bad exit code - - - - - 36818759 by Artem Pyanykh at 2019-03-20T23:52:39Z Adjust section placement and relocation logic for Mach-O 1. Place each section on a separate page to ensure required alignment (wastes lots ot space, needs to be improved). 2. Unwire relocation logic from macho sections (the most fiddly part is adjusting internal relocations). Other todos: 0. Add a test for section alignment. 1. Investigate 32bit relocations! 2. Fix memory leak in ZEROPAGE section allocation. 3. Fix creating redundant jump islands for GOT. 4. Investigate more compact section placement. - - - - - 78c61acf by Artem Pyanykh at 2019-03-20T23:52:39Z Use segments for section layout - - - - - 7bbfb789 by Artem Pyanykh at 2019-03-20T23:52:39Z Address some todos and fixmes - - - - - 3cdcc0b5 by Artem Pyanykh at 2019-03-20T23:52:39Z Add a linker test re: section alignment - - - - - cb745c84 by Artem Pyanykh at 2019-03-20T23:52:39Z Add missing levels to SegmentProt enum - - - - - d950f11e by Artem Pyanykh at 2019-03-20T23:52:39Z Directly test section alignment, fix internal reloc probing length - - - - - 3fb10fcf by Artem Pyanykh at 2019-03-20T23:52:39Z Gracefully handle error condition in Mach-O relocateSection - - - - - dc713c71 by Ben Gamari at 2019-03-20T23:58:49Z ci: Move validate-x86_64-linux-deb9 to full-build stage The `build` stage is meant to be a minimal smoke test to weed out broken commits. The `validate-x86_64-linux-deb9` build will generally catch a subset of issues caught by `validate-x86_64-linux-deb9-debug` so only the latter should be in `build`. - - - - - 505c5ab2 by Ben Gamari at 2019-03-20T23:58:49Z ci: Add some descriptions of the stages - - - - - 646e3dc2 by Sebastian Graf at 2019-03-21T00:04:49Z Add a bench flavour to Hadrian - - - - - 8d18a873 by Ryan Scott at 2019-03-21T00:10:57Z Reject nested predicates in impredicativity checking When GHC attempts to unify a metavariable with a type containing foralls, it will be rejected as an occurrence of impredicativity. GHC was /not/ extending the same treatment to predicate types, such as in the following (erroneous) example from #11514: ```haskell foo :: forall a. (Show a => a -> a) -> () foo = undefined ``` This will attempt to instantiate `undefined` at `(Show a => a -> a) -> ()`, which is impredicative. This patch catches impredicativity arising from predicates in this fashion. Since GHC is pickier about impredicative instantiations, some test cases needed to be updated to be updated so as not to fall afoul of the new validity check. (There were a surprising number of impredicative uses of `undefined`!) Moreover, the `T14828` test case now has slightly less informative types shown with `:print`. This is due to a a much deeper issue with the GHCi debugger (see #14828). Fixes #11514. - - - - - 7b213b8d by Ömer Sinan Ağacan at 2019-03-21T00:17:05Z Print test suite results ("unexpected failures" etc.) in sorted order Fixes #16425 - - - - - f199a843 by Simon Jakobi at 2019-03-21T00:23:15Z Check.hs: Fix a few typos - - - - - 07d44ed1 by Ben Gamari at 2019-03-21T00:29:20Z base: Depend upon shlwapi on Windows As noted in #16466, `System.Environment.getExecutablePath` depends upon `PathFileExistsW` which is defined by `shlwapi`. Fixes #16466. - - - - - 1382d09e by Ryan Scott at 2019-03-21T00:35:28Z Remove unused XArrApp and XArrForm extension points !301 removed the `HsArrApp` and `HsArrForm` constructors, which renders the corresponding extension points `XArrApp` and `XArrForm` useless. This patch finally rips them out. - - - - - 3423664b by Peter Trommler at 2019-03-21T00:41:35Z Fix specification of load_load_barrier [skip-ci] - - - - - 84c77a67 by Alexandre Esteves at 2019-03-21T21:43:03Z Fix typo [skip ci] - - - - - 7092b2de by Matthew Pickering at 2019-03-22T03:38:58Z Only run check-makefiles.py linter in testsuite dir - - - - - 322239de by Matthew Pickering at 2019-03-22T03:38:58Z Run linters on merge requests It seems that it has failed to execute at all since it was implemented. We now run the linters on merge requests. - - - - - 8f8d532c by Ben Gamari at 2019-03-22T03:45:03Z gitlab-ci: Do full `perf` build when building Windows releases - - - - - 2ef72d3f by Ben Gamari at 2019-03-22T03:45:03Z gitlab-ci: Pass --target explicitly to configure on Windows Otherwise configure fails in the 32-bit case with ``` This GHC (c:/GitLabRunner/builds/8fc0e283/0/ghc/ghc/toolchain/bin/ghc) does not generate code for the build platform GHC target platform : x86_64-unknown-mingw32 Desired build platform : i386-unknown-mingw32 ``` - - - - - 8b14f536 by Ben Gamari at 2019-03-22T03:51:08Z Bump cabal submodule Due to https://github.com/haskell/cabal/issues/5953. - - - - - dbe4557f by Matthew Pickering at 2019-03-22T14:02:32Z CI: Allow failure in packaging step This depends on the windows build which is still allowed to fail. If that job fails then the packaging job will also fail. - - - - - 366f1c68 by Ben Gamari at 2019-03-22T14:08:38Z gitlab: Deploy documentation snapshot via GitLab Pages - - - - - d608d543 by Tamar Christina at 2019-03-22T14:14:45Z Force LF line ending for md5sum [skip-ci] - - - - - cd07086a by Ben Gamari at 2019-03-22T14:34:51Z gitlab-ci: Fix linters - - - - - ab51bee4 by Herbert Valerio Riedel at 2019-03-22T14:34:51Z base: Remove `Monad(fail)` method and reexport `MonadFail(fail)` instead As per https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail Coauthored-by: Ben Gamari <ben at well-typed.com> - - - - - 266b49ca by Ben Gamari at 2019-03-22T22:33:20Z gitlab-ci: Clean up linter I'm not sure why these steps were done but they seem counterproductive and unnecessary. - - - - - 44b08ede by Ben Gamari at 2019-03-22T22:38:11Z gitlab-ci: Fix YAML syntax - - - - - 971f4530 by Ben Gamari at 2019-03-22T22:49:34Z gitlab-ci: Compute merge base against remote tracking branch Previously we would use the local branch with the name `$CI_MERGE_REQUEST_TARGET_BRANCH_NAME` to compute the merge base when linting. However, this branch isn't necessarily up-to-date. We should rather use `origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME`. - - - - - 8d01b572 by Ben Gamari at 2019-03-23T16:37:56Z gitlab-ci: Explicitly fetch target branch `git fetch`, which we used previously, doesn't update the remote tracking branches. - - - - - cd85f8a7 by Ben Gamari at 2019-03-24T12:46:13Z gitlab-ci: Allow linters to fail for now They are broken and I don't have time to fix them at the moment. - - - - - d763b2e7 by Haskell-mouse at 2019-03-25T18:02:22Z User's Guide: extensions compatibility Adds the mention that extensions "AllowAmbiguousTypes" and "RankNTypes" are not always compatible with each other. Specifies the conditions and causes of failing in resolving of ambiguity. - - - - - 200d65ef by Matthew Pickering at 2019-03-25T18:02:25Z Check hadrian/ghci.sh script output to determine pass/fail ghci always exits with exit code 0 so you have to check the output to see if the modules loaded succesfully. - - - - - 8e07368f by Matthew Pickering at 2019-03-25T18:02:27Z Refactor ./hadrian/ghci.sh for better error messages By separating these two lines, if the first command fails then `ghci` is not loaded. Before it would still load ghci but display lots of errors about not being able to find modules. - - - - - 3769e3a8 by Takenobu Tani at 2019-03-25T18:02:29Z Update Wiki URLs to point to GitLab This moves all URL references to Trac Wiki to their corresponding GitLab counterparts. This substitution is classified as follows: 1. Automated substitution using sed with Ben's mapping rule [1] Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy... New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy... 2. Manual substitution for URLs containing `#` index Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz 3. Manual substitution for strings starting with `Commentary` Old: Commentary/XxxYyy... New: commentary/xxx-yyy... See also !539 [1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json - - - - - b9da2868 by Ryan Scott at 2019-03-25T18:02:33Z Correct duplicate 4.12.0.0 entry in base's changelog See #16490. [ci skip] - - - - - ab41c1b4 by Andrey Mokhov at 2019-03-27T11:20:03Z Hadrian: Bump Shake to 0.17.6 The new release of Shake comes with these relevant features: * use symlinks for --shared * add --compact for a Bazel/Buck style output - - - - - 646f2e79 by Andrey Mokhov at 2019-03-27T11:20:03Z Hadrian: trace the execution of expensive Cabal calls We use Cabal to parse, configure, register and copy packages, which are expensive operations that are currently not visible to Shake's profiling infrastructure. By using `traced` we tell Shake to add these IO actions to the profiling report, helping us to identify performance bottlenecks. We use short tracing keys, as recommended in Shake docs: the name of the current target is already available in the rest of the profiling information. - - - - - fb12f53c by Alp Mestanogullari at 2019-03-27T11:20:05Z Hadrian: introduce an easy way for users to build with -split-sections Any user can now trivially build any number of Haskell packages with `-split-sections` by using `splitSections`/`splitSectionsIf` on any existing or new flavour: -- build all packages but the ghc library with -split-sections splitSections :: Flavour -> Flavour -- build all packages that satisfy the given predicate -- with --split-sections splitSectionsIf :: (Package -> Bool) -> Flavour -> Flavour See the new section in `doc/user-settings.md`. - - - - - 3dec527a by David Eichmann at 2019-03-27T11:20:09Z Hadrian: don't use -zorigin on darwin. - - - - - 5730f863 by Ömer Sinan Ağacan at 2019-03-27T11:20:10Z Minor refactoring in copy array primops: - `emitCopySmallArray` now checks size before generating code and doesn't generate any code when size is 0. `emitCopyArray` already does this so this makes small/large array cases the same in argument checking. - In both `emitCopySmallArray` and `emitCopyArray` read the `dflags` after checking the argument. - - - - - 4acdb769 by Chaitanya Koparkar at 2019-03-27T11:20:11Z Fix a few broken Trac links [skip ci] This patch only attempts to fix links that don't automatically re-direct to the correct URL. - - - - - 97ad5cfb by Artem Pelenitsyn at 2019-03-29T18:18:12Z Add some tips to the Troubleshooting section of README - - - - - 8a20bfc2 by Michael Peyton Jones at 2019-03-29T18:18:14Z Visibility: handle multiple units with the same name Fixes #16228. The included test case is adapted from the reproduction in the issue, and fails without this patch. ------ We compute an initial visilibity mapping for units based on what is present in the package databases. To seed this, we compute a set of all the package configs to add visibilities for. However, this set was keyed off the unit's *package name*. This is correct, since we compare packages across databases by version. However, we would only ever consider a single, most-preferable unit from the database in which it was found. The effect of this was that only one of the libraries in a Cabal package would be added to this initial set. This would cause attempts to use modules from the omitted libraries to fail, claiming that the package was hidden (even though `ghc-pkg` would correctly show it as visible). A solution is to do the selection of the most preferable packages separately, and then be sure to consider exposing all units in the same package in the same package db. We can do this by picking a most-preferable unit for each package name, and then considering exposing all units that are equi-preferable with that unit. ------ Why wasn't this bug apparent to all people trying to use sub-libraries in Cabal? The answer is that Cabal explicitly passes `-package` and `-package-id` flags for all the packages it wants to use, rather than relying on the state of the package database. So this bug only really affects people who are trying to use package databases produced by Cabal outside of Cabal itself. One particular example of this is the way that the Nixpkgs Haskell infrastructure provides wrapped GHCs: typically these are equipped with a package database containing all the needed package dependencies, and the user is not expected to pass `-package` flags explicitly. - - - - - 754b5455 by Artem Pelenitsyn at 2019-03-29T18:18:20Z docs: make nfib compute the Fibonacci sequence [skipci] - - - - - 1a567133 by Ben Gamari at 2019-03-29T18:18:20Z ci: Check that changelogs don't contain "TBA" This ensures that the release dates in the library changelogs are properly set. - - - - - 6e15ca54 by Ben Gamari at 2019-03-29T18:18:22Z Bump transformers to 0.5.6.2 See #16199. - - - - - 6f7115df by Ben Gamari at 2019-03-30T11:42:38Z ci: Ensure index.html is preserved in documentation tarball - - - - - 33173a51 by Alexandre at 2019-04-01T07:32:28Z Add support for bitreverse primop This commit includes the necessary changes in code and documentation to support a primop that reverses a word's bits. It also includes a test. - - - - - a3971b4e by Alexandre at 2019-04-01T07:32:28Z Bump ghc-prim's version where needed - - - - - 061276ea by Michael Sloan at 2019-04-01T07:32:30Z Remove unnecessary uses of UnboxedTuples pragma (see #13101 / #15454) Also removes a couple unnecessary MagicHash pragmas - - - - - e468c613 by David Eichmann at 2019-04-01T07:32:34Z Support Shake's --lint-fsatrace feature. Using this feature requires fsatrace (e.g. https://github.com/jacereda/fsatrace). Simply use the `--lint-fsatrace` option when running hadrian. Shake version >= 0.17.7 is required to support linting out of tree build dirs. - - - - - 1e9e4197 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add merge request template for backports for 8.8 - - - - - 55650d14 by Ben Gamari at 2019-04-01T07:32:34Z gitlab: Add some simply issue templates - - - - - 27b99ed8 by Takenobu Tani at 2019-04-01T07:32:36Z Clean up URLs to point to GitLab This moves URL references to old Trac to their corresponding GitLab counterparts. This patch does not update the submodule library, such as libraries/Cabal. See also !539, !606, !618 [ci skip] - - - - - 18d1555d by Adam Sandberg Eriksson at 2019-04-01T07:32:38Z configure: document the use of the LD variable - - - - - 10352efa by Ben Gamari at 2019-04-01T22:22:34Z gitlab: Add feature request MR template - - - - - 1e52054b by Ben Gamari at 2019-04-01T23:16:21Z gitlab: Move feature request template to issue_templates Whoops. - - - - - e5c21ca9 by Ben Gamari at 2019-04-01T23:16:25Z gitlab: Mention ~"user facing" label - - - - - 39282422 by Ryan Scott at 2019-04-02T00:01:38Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - 7cf5ba3d by Michal Terepeta at 2019-04-02T00:07:49Z Improve performance of newSmallArray# This: - Hoists part of the condition outside of the initialization loop in `stg_newSmallArrayzh`. - Annotates one of the unlikely branches as unlikely, also in `stg_newSmallArrayzh`. - Adds a couple of annotations to `allocateMightFail` indicating which branches are likely to be taken. Together this gives about 5% improvement. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - dd9c82ef by David Eichmann at 2019-04-02T00:13:55Z Hadrian: correct deps for ghc builder. Previously, when needing ghc as a builder, the ghcDeps (Files the GHC binary depends on) for the current stage were needed. This is incorrect as the previous stage's ghc is used for building. This commit fixes the issue, needing the previous stage's ghcDeps. - - - - - 345306d3 by Alexandre Baldé at 2019-04-02T16:34:30Z Fix formatting issue in ghc-prim's changelog [skip ci] - - - - - f54b5124 by David Eichmann at 2019-04-02T16:40:39Z Hadrian: traceAllow deep dependencies when compilling haskell object files. - - - - - d132b30a by David Eichmann at 2019-04-02T16:40:39Z Hadrian: lint ignore autom4te and ghc-pkg cache files. - - - - - bf734195 by Simon Marlow at 2019-04-02T16:46:46Z Add myself to libraries/ghci - - - - - 5a75ccd0 by klebinger.andreas at gmx.at at 2019-04-03T04:34:57Z Fix faulty substitutions in StgCse (#11532). `substBndr` should rename bindings which shadow existing ids. However while it was renaming the bindings it was not adding proper substitutions for renamed bindings. Instead of adding a substitution of the form `old -> new` for renamed bindings it mistakenly added `old -> old` if no replacement had taken place while adding none if `old` had been renamed. As a byproduct this should improve performance, as we no longer add useless substitutions for unshadowed bindings. - - - - - 2ec749b5 by Nathan Collins at 2019-04-03T04:41:05Z users-guide: Fix typo - - - - - ea192a09 by Andrew Martin at 2019-04-03T04:41:05Z base: Add documentation that liftA2 used to not be a typeclass method - - - - - 733f1b52 by Frank Steffahn at 2019-04-03T04:41:05Z users-guide: Typo in Users Guide, Glasgow Exts - - - - - 3364def0 by Ben Gamari at 2019-04-03T04:41:05Z integer-gmp: Write friendlier documentation for Integer - - - - - dd3a3d08 by Ben Gamari at 2019-04-03T04:41:05Z integer-simple: Add documentation for Integer type - - - - - 722fdddf by Chris Martin at 2019-04-03T04:41:05Z Correct two misspellings of "separately" - - - - - bf6dbe3d by Chris Martin at 2019-04-03T04:41:05Z Inline the definition of 'ap' in the Monad laws The law as it is currently written is meaningless, because nowhere have we defined the implementation of 'ap'. The reader of the Control.Monad documentation is provided with only a type signature, > ap :: Monad m => m (a -> b) -> m a -> m b an informal description, > In many situations, the liftM operations can be replaced by uses of > ap, which promotes function application. and a relationship between 'ap' and the 'liftM' functions > return f `ap` x1 `ap` ... `ap` xn > is equivalent to > liftMn f x1 x2 ... xn Without knowing how 'ap' is defined, a law involving 'ap' cannot provide any guidance for how to write a lawful Monad instance, nor can we conclude anything from the law. I suspect that a reader equipped with the understanding that 'ap' was defined prior to the invention of the Applicative class could deduce that 'ap' must be defined in terms of (>>=), but nowhere as far as I can tell have we written this down explicitly for readers without the benefit of historical context. If the law is meant to express a relationship among (<*>), (>>=), and 'return', it seems that it is better off making this statement directly, sidestepping 'ap' altogether. - - - - - 7b090b53 by Ben Gamari at 2019-04-03T07:57:40Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. - - - - - 4626cf21 by Sebastian Graf at 2019-04-03T08:03:47Z Fix Uncovered set of literal patterns Issues #16289 and #15713 are proof that the pattern match checker did an unsound job of estimating the value set abstraction corresponding to the uncovered set. The reason is that the fix from #11303 introducing `NLit` was incomplete: The `LitCon` case desugared to `Var` rather than `LitVar`, which would have done the necessary case splitting analogous to the `ConVar` case. This patch rectifies that by introducing the fresh unification variable in `LitCon` in value abstraction position rather than pattern postition, recording a constraint equating it to the constructor expression rather than the literal. Fixes #16289 and #15713. - - - - - 6f13e7b1 by Ben Gamari at 2019-04-03T12:12:26Z gitlab-ci: Build hyperlinked sources for releases Fixes #16445. - - - - - 895394c2 by Ben Gamari at 2019-04-03T12:15:06Z gitlab: Fix label names in issue templates - - - - - 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 1930de8f by Ömer Sinan Ağacan at 2019-04-10T07:46:08Z rts/BlockAlloc: Allow aligned allocation requests This implements support for block group allocations which are aligned to an integral number of blocks. This will be used by the nonmoving garbage collector, which uses the block allocator to allocate the segments which back its heap. These segments are a fixed number of blocks in size, with each segment being aligned to the segment size boundary. This allows us to easily find the segment metadata stored at the beginning of the segment. - - - - - 6fd24a46 by Ben Gamari at 2019-04-10T07:46:08Z testsuite/testblockalloc: A bit of refactoring - - - - - 5519e0ac by Ben Gamari at 2019-04-10T07:46:08Z testsuite/testblockalloc: Test aligned block group allocation - - - - - 380088bb by Ben Gamari at 2019-04-10T07:46:08Z rts/BlockAlloc: Wibbles - - - - - 7b0d21b3 by Ben Gamari at 2019-04-10T07:46:08Z rts/BlockAlloc: Use allocLargeChunk in aligned block allocation - - - - - 5fbe5f0d by Ömer Sinan Ağacan at 2019-04-10T07:46:08Z Disallow allocating megablocks, update tests - - - - - 30 changed files: - .circleci/config.yml - .ghcid - + .gitattributes - .gitignore - .gitlab-ci.yml - + .gitlab/issue_templates/bug.md - + .gitlab/issue_templates/feature_request.md - .gitlab/linters/check-makefiles.py - + .gitlab/merge_request_templates/backport-for-8.8.md - .gitlab/merge_request_templates/merge-request.md - + .gitlab/start-head.hackage.sh - .gitlab/win32-init.sh - .mailmap - ANNOUNCE - CODEOWNERS - HACKING.md - INSTALL.md - MAKEHELP.md - Makefile - README.md - aclocal.m4 - bindisttest/Makefile - bindisttest/ghc.mk - boot - compiler/Makefile - compiler/basicTypes/BasicTypes.hs - compiler/basicTypes/DataCon.hs - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/Lexeme.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/b2e2fa647e5a005c83a50fae71777a3187583c23...5fbe5f0dacd5c0c10919815470f41ffbd2cb2727 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/b2e2fa647e5a005c83a50fae71777a3187583c23...5fbe5f0dacd5c0c10919815470f41ffbd2cb2727 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 12:12:54 2019 From: gitlab at gitlab.haskell.org (=?UTF-8?B?w5ZtZXIgU2luYW4gQcSfYWNhbg==?=) Date: Wed, 10 Apr 2019 08:12:54 -0400 Subject: [Git][ghc/ghc][wip/gc/aligned-block-allocation] Fix lint errors Message-ID: <5cadddc6f1dbd_62b33fa2e3ea83442744682@gitlab.haskell.org.mail> Ömer Sinan Ağacan pushed to branch wip/gc/aligned-block-allocation at Glasgow Haskell Compiler / GHC Commits: 9ba77dd0 by Ömer Sinan Ağacan at 2019-04-10T12:12:32Z Fix lint errors - - - - - 1 changed file: - rts/sm/BlockAlloc.c Changes: ===================================== rts/sm/BlockAlloc.c ===================================== @@ -535,7 +535,7 @@ allocAlignedGroupOnNode (uint32_t node, W_ n) ASSERT(slop_low_blocks + slop_high_blocks + n == num_blocks); -#ifdef DEBUG +#if defined(DEBUG) checkFreeListSanity(); W_ free_before = countFreeList(); #endif @@ -545,7 +545,7 @@ allocAlignedGroupOnNode (uint32_t node, W_ n) ASSERT(countBlocks(bd) == num_blocks - slop_low_blocks); } -#ifdef DEBUG +#if defined(DEBUG) ASSERT(countFreeList() == free_before + slop_low_blocks); checkFreeListSanity(); #endif @@ -553,7 +553,7 @@ allocAlignedGroupOnNode (uint32_t node, W_ n) // At this point the bd should be aligned, but we may have slop on the high side ASSERT((uintptr_t)bd->start % group_size == 0); -#ifdef DEBUG +#if defined(DEBUG) free_before = countFreeList(); #endif @@ -562,7 +562,7 @@ allocAlignedGroupOnNode (uint32_t node, W_ n) ASSERT(bd->blocks == n); } -#ifdef DEBUG +#if defined(DEBUG) ASSERT(countFreeList() == free_before + slop_high_blocks); checkFreeListSanity(); #endif View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9ba77dd0ffd5724c1302e8550750a413589257d2 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9ba77dd0ffd5724c1302e8550750a413589257d2 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 18:58:56 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Wed, 10 Apr 2019 14:58:56 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] removing x87 register support from native code gen Message-ID: <5cae3cf010e75_62b33fa287e5f1502805130@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: 61e0a1e4 by Carter Tazio Schonwald at 2019-04-10T18:58:38Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 27 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,38 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. +-- In some next generation CPU ISAs, notably RISC V, the SIMD extension +-- includes support for a sort of run time CPU dependent vectorization parameter, +-- where a loop may act upon a single scalar each iteration OR some 2,4,8 ... +-- element chunk! Time will tell if that direction sees wide adoption, +-- but it is from that context that unifying our handling of simd and scalars +-- may benefit. It is not likely to benefit current architectures, though +-- it may very well be a design perspective that helps guide improving the NCG. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/AsmCodeGen.hs ===================================== @@ -179,7 +179,7 @@ nativeCodeGen dflags this_mod modLoc h us cmms x86NcgImpl :: DynFlags -> NcgImpl (Alignment, CmmStatics) X86.Instr.Instr X86.Instr.JumpDest x86NcgImpl dflags - = (x86_64NcgImpl dflags) { ncg_x86fp_kludge = map x86fp_kludge } + = (x86_64NcgImpl dflags) x86_64NcgImpl :: DynFlags -> NcgImpl (Alignment, CmmStatics) X86.Instr.Instr X86.Instr.JumpDest @@ -194,7 +194,6 @@ x86_64NcgImpl dflags ,pprNatCmmDecl = X86.Ppr.pprNatCmmDecl ,maxSpillSlots = X86.Instr.maxSpillSlots dflags ,allocatableRegs = X86.Regs.allocatableRegs platform - ,ncg_x86fp_kludge = id ,ncgAllocMoreStack = X86.Instr.allocMoreStack platform ,ncgExpandTop = id ,ncgMakeFarBranches = const id @@ -215,7 +214,6 @@ ppcNcgImpl dflags ,pprNatCmmDecl = PPC.Ppr.pprNatCmmDecl ,maxSpillSlots = PPC.Instr.maxSpillSlots dflags ,allocatableRegs = PPC.Regs.allocatableRegs platform - ,ncg_x86fp_kludge = id ,ncgAllocMoreStack = PPC.Instr.allocMoreStack platform ,ncgExpandTop = id ,ncgMakeFarBranches = PPC.Instr.makeFarBranches @@ -236,7 +234,6 @@ sparcNcgImpl dflags ,pprNatCmmDecl = SPARC.Ppr.pprNatCmmDecl ,maxSpillSlots = SPARC.Instr.maxSpillSlots dflags ,allocatableRegs = SPARC.Regs.allocatableRegs - ,ncg_x86fp_kludge = id ,ncgAllocMoreStack = noAllocMoreStack ,ncgExpandTop = map SPARC.CodeGen.Expand.expandTop ,ncgMakeFarBranches = const id @@ -680,19 +677,10 @@ cmmNativeGen dflags this_mod modLoc ncgImpl us fileIds dbgMap cmm count foldl' (\m (from,to) -> addImmediateSuccessor from to m ) cfgWithFixupBlks stack_updt_blks - ---- x86fp_kludge. This pass inserts ffree instructions to clear - ---- the FPU stack on x86. The x86 ABI requires that the FPU stack - ---- is clear, and library functions can return odd results if it - ---- isn't. - ---- - ---- NB. must happen before shortcutBranches, because that - ---- generates JXX_GBLs which we can't fix up in x86fp_kludge. - let kludged = {-# SCC "x86fp_kludge" #-} ncg_x86fp_kludge ncgImpl alloced - ---- generate jump tables let tabled = {-# SCC "generateJumpTables" #-} - generateJumpTables ncgImpl kludged + generateJumpTables ncgImpl alloced dumpIfSet_dyn dflags Opt_D_dump_cfg_weights "CFG Update information" @@ -787,12 +775,6 @@ checkLayout procsUnsequenced procsSequenced = getBlockIds (CmmProc _ _ _ (ListGraph blocks)) = setFromList $ map blockId blocks - -x86fp_kludge :: NatCmmDecl (Alignment, CmmStatics) X86.Instr.Instr -> NatCmmDecl (Alignment, CmmStatics) X86.Instr.Instr -x86fp_kludge top@(CmmData _ _) = top -x86fp_kludge (CmmProc info lbl live (ListGraph code)) = - CmmProc info lbl live (ListGraph $ X86.Instr.i386_insert_ffrees code) - -- | Compute unwinding tables for the blocks of a procedure computeUnwinding :: Instruction instr => DynFlags -> NcgImpl statics instr jumpDest ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/NCGMonad.hs ===================================== @@ -76,7 +76,6 @@ data NcgImpl statics instr jumpDest = NcgImpl { pprNatCmmDecl :: NatCmmDecl statics instr -> SDoc, maxSpillSlots :: Int, allocatableRegs :: [RealReg], - ncg_x86fp_kludge :: [NatCmmDecl statics instr] -> [NatCmmDecl statics instr], ncgExpandTop :: [NatCmmDecl statics instr] -> [NatCmmDecl statics instr], ncgAllocMoreStack :: Int -> NatCmmDecl statics instr -> UniqSM (NatCmmDecl statics instr, [(BlockId,BlockId)]), ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -10,7 +10,7 @@ module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..), getJumpDestBlockId, canShortcut, shortcutStatics, - shortcutJump, i386_insert_ffrees, allocMoreStack, + shortcutJump, allocMoreStack, maxSpillSlots, archWordFormat ) where @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -970,58 +888,6 @@ x86_mkStackDeallocInstr platform amount ArchX86_64 -> [ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)] _ -> panic "x86_mkStackDeallocInstr" -i386_insert_ffrees - :: [GenBasicBlock Instr] - -> [GenBasicBlock Instr] - -i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise - = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False - -- -- Note [extra spill slots] ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprX87Instr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/61e0a1e476add298618b2372f7a192563ee4fb09 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/61e0a1e476add298618b2372f7a192563ee4fb09 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Wed Apr 10 19:02:21 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Wed, 10 Apr 2019 15:02:21 -0400 Subject: [Git][ghc/ghc][wip/carter/remove_x87Registers] 3 commits: GHC no longer ever defines TABLES_NEXT_TO_CODE on its own Message-ID: <5cae3dbd635f6_62b33fa2cdfae44828069dc@gitlab.haskell.org.mail> Carter Schonwald pushed to branch wip/carter/remove_x87Registers at Glasgow Haskell Compiler / GHC Commits: fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - fbf7331b by Carter Tazio Schonwald at 2019-04-10T19:02:11Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 30 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - ghc/ghc-bin.cabal.in - includes/CodeGen.Platform.hs - libraries/base/base.cabal - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/61e0a1e476add298618b2372f7a192563ee4fb09...fbf7331bfe6f3422c331cc7d30803d8c1d9abfe5 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/61e0a1e476add298618b2372f7a192563ee4fb09...fbf7331bfe6f3422c331cc7d30803d8c1d9abfe5 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 00:28:59 2019 From: gitlab at gitlab.haskell.org (Carter Schonwald) Date: Wed, 10 Apr 2019 20:28:59 -0400 Subject: [Git][ghc/ghc][master] removing x87 register support from native code gen Message-ID: <5cae8a4b17b38_62b33fa2e55a12d028230f4@gitlab.haskell.org.mail> Carter Schonwald pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 27 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - includes/CodeGen.Platform.hs - libraries/base/tests/Numeric/all.T - libraries/base/tests/Numeric/num009.hs Changes: ===================================== compiler/cmm/CmmCallConv.hs ===================================== @@ -81,7 +81,6 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) | passFloatInXmm -> k (RegisterParam (DoubleReg s), (vs, fs, ds, ls, ss)) (W64, (vs, fs, d:ds, ls, ss)) | not passFloatInXmm -> k (RegisterParam d, (vs, fs, ds, ls, ss)) - (W80, _) -> panic "F80 unsupported register type" _ -> (assts, (r:rs)) int = case (w, regs) of (W128, _) -> panic "W128 unsupported register type" @@ -100,6 +99,7 @@ assignArgumentsPos dflags off conv arg_ty reps = (stk_off, assignments) passFloatArgsInXmm :: DynFlags -> Bool passFloatArgsInXmm dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True + ArchX86 -> False _ -> False -- We used to spill vector registers to the stack since the LLVM backend didn't ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -474,6 +474,9 @@ instance Eq GlobalReg where FloatReg i == FloatReg j = i==j DoubleReg i == DoubleReg j = i==j LongReg i == LongReg j = i==j + -- NOTE: XMM, YMM, ZMM registers actually are the same registers + -- at least with respect to store at YMM i and then read from XMM i + -- and similarly for ZMM etc. XmmReg i == XmmReg j = i==j YmmReg i == YmmReg j = i==j ZmmReg i == ZmmReg j = i==j @@ -584,6 +587,9 @@ globalRegType dflags (VanillaReg _ VNonGcPtr) = bWord dflags globalRegType _ (FloatReg _) = cmmFloat W32 globalRegType _ (DoubleReg _) = cmmFloat W64 globalRegType _ (LongReg _) = cmmBits W64 +-- TODO: improve the internal model of SIMD/vectorized registers +-- the right design SHOULd improve handling of float and double code too. +-- see remarks in "NOTE [SIMD Design for the future]"" in StgCmmPrim globalRegType _ (XmmReg _) = cmmVec 4 (cmmBits W32) globalRegType _ (YmmReg _) = cmmVec 8 (cmmBits W32) globalRegType _ (ZmmReg _) = cmmVec 16 (cmmBits W32) ===================================== compiler/cmm/CmmType.hs ===================================== @@ -166,9 +166,6 @@ isFloat64 _other = False ----------------------------------------------------------------------------- data Width = W8 | W16 | W32 | W64 - | W80 -- Extended double-precision float, - -- used in x86 native codegen only. - -- (we use Ord, so it'd better be in this order) | W128 | W256 | W512 @@ -185,7 +182,7 @@ mrStr W64 = sLit("W64") mrStr W128 = sLit("W128") mrStr W256 = sLit("W256") mrStr W512 = sLit("W512") -mrStr W80 = sLit("W80") + -------- Common Widths ------------ @@ -222,7 +219,7 @@ widthInBits W64 = 64 widthInBits W128 = 128 widthInBits W256 = 256 widthInBits W512 = 512 -widthInBits W80 = 80 + widthInBytes :: Width -> Int widthInBytes W8 = 1 @@ -232,7 +229,7 @@ widthInBytes W64 = 8 widthInBytes W128 = 16 widthInBytes W256 = 32 widthInBytes W512 = 64 -widthInBytes W80 = 10 + widthFromBytes :: Int -> Width widthFromBytes 1 = W8 @@ -242,7 +239,7 @@ widthFromBytes 8 = W64 widthFromBytes 16 = W128 widthFromBytes 32 = W256 widthFromBytes 64 = W512 -widthFromBytes 10 = W80 + widthFromBytes n = pprPanic "no width for given number of bytes" (ppr n) -- log_2 of the width in bytes, useful for generating shifts. @@ -254,7 +251,7 @@ widthInLog W64 = 3 widthInLog W128 = 4 widthInLog W256 = 5 widthInLog W512 = 6 -widthInLog W80 = panic "widthInLog: F80" + -- widening / narrowing ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -1727,8 +1727,38 @@ vecElemProjectCast dflags WordVec W32 = Just (mo_u_32ToWord dflags) vecElemProjectCast _ WordVec W64 = Nothing vecElemProjectCast _ _ _ = Nothing + +-- NOTE [SIMD Design for the future] -- Check to make sure that we can generate code for the specified vector type -- given the current set of dynamic flags. +-- Currently these checks are specific to x86 and x86_64 architecture. +-- This should be fixed! +-- In particular, +-- 1) Add better support for other architectures! (this may require a redesign) +-- 2) Decouple design choices from LLVM's pseudo SIMD model! +-- The high level LLVM naive rep makes per CPU family SIMD generation is own +-- optimization problem, and hides important differences in eg ARM vs x86_64 simd +-- 3) Depending on the architecture, the SIMD registers may also support general +-- computations on Float/Double/Word/Int scalars, but currently on +-- for example x86_64, we always put Word/Int (or sized) in GPR +-- (general purpose) registers. Would relaxing that allow for +-- useful optimization opportunities? +-- Phrased differently, it is worth experimenting with supporting +-- different register mapping strategies than we currently have, especially if +-- someday we want SIMD to be a first class denizen in GHC along with scalar +-- values! +-- The current design with respect to register mapping of scalars could +-- very well be the best,but exploring the design space and doing careful +-- measurments is the only only way to validate that. +-- In some next generation CPU ISAs, notably RISC V, the SIMD extension +-- includes support for a sort of run time CPU dependent vectorization parameter, +-- where a loop may act upon a single scalar each iteration OR some 2,4,8 ... +-- element chunk! Time will tell if that direction sees wide adoption, +-- but it is from that context that unifying our handling of simd and scalars +-- may benefit. It is not likely to benefit current architectures, though +-- it may very well be a design perspective that helps guide improving the NCG. + + checkVecCompatibility :: DynFlags -> PrimOpVecCat -> Length -> Width -> FCode () checkVecCompatibility dflags vcat l w = do when (hscTarget dflags /= HscLlvm) $ do ===================================== compiler/llvmGen/LlvmCodeGen/Base.hs ===================================== @@ -97,7 +97,6 @@ cmmToLlvmType ty | isVecType ty = LMVector (vecLength ty) (cmmToLlvmType (vecE widthToLlvmFloat :: Width -> LlvmType widthToLlvmFloat W32 = LMFloat widthToLlvmFloat W64 = LMDouble -widthToLlvmFloat W80 = LMFloat80 widthToLlvmFloat W128 = LMFloat128 widthToLlvmFloat w = panic $ "widthToLlvmFloat: Bad float size: " ++ show w ===================================== compiler/main/DynFlags.hs ===================================== @@ -5833,20 +5833,24 @@ data SseVersion = SSE1 isSseEnabled :: DynFlags -> Bool isSseEnabled dflags = case platformArch (targetPlatform dflags) of ArchX86_64 -> True - ArchX86 -> sseVersion dflags >= Just SSE1 + ArchX86 -> True _ -> False isSse2Enabled :: DynFlags -> Bool isSse2Enabled dflags = case platformArch (targetPlatform dflags) of - ArchX86_64 -> -- SSE2 is fixed on for x86_64. It would be - -- possible to make it optional, but we'd need to - -- fix at least the foreign call code where the - -- calling convention specifies the use of xmm regs, - -- and possibly other places. - True - ArchX86 -> sseVersion dflags >= Just SSE2 + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> True + ArchX86 -> True _ -> False + isSse4_2Enabled :: DynFlags -> Bool isSse4_2Enabled dflags = sseVersion dflags >= Just SSE42 ===================================== compiler/nativeGen/AsmCodeGen.hs ===================================== @@ -179,7 +179,7 @@ nativeCodeGen dflags this_mod modLoc h us cmms x86NcgImpl :: DynFlags -> NcgImpl (Alignment, CmmStatics) X86.Instr.Instr X86.Instr.JumpDest x86NcgImpl dflags - = (x86_64NcgImpl dflags) { ncg_x86fp_kludge = map x86fp_kludge } + = (x86_64NcgImpl dflags) x86_64NcgImpl :: DynFlags -> NcgImpl (Alignment, CmmStatics) X86.Instr.Instr X86.Instr.JumpDest @@ -194,7 +194,6 @@ x86_64NcgImpl dflags ,pprNatCmmDecl = X86.Ppr.pprNatCmmDecl ,maxSpillSlots = X86.Instr.maxSpillSlots dflags ,allocatableRegs = X86.Regs.allocatableRegs platform - ,ncg_x86fp_kludge = id ,ncgAllocMoreStack = X86.Instr.allocMoreStack platform ,ncgExpandTop = id ,ncgMakeFarBranches = const id @@ -215,7 +214,6 @@ ppcNcgImpl dflags ,pprNatCmmDecl = PPC.Ppr.pprNatCmmDecl ,maxSpillSlots = PPC.Instr.maxSpillSlots dflags ,allocatableRegs = PPC.Regs.allocatableRegs platform - ,ncg_x86fp_kludge = id ,ncgAllocMoreStack = PPC.Instr.allocMoreStack platform ,ncgExpandTop = id ,ncgMakeFarBranches = PPC.Instr.makeFarBranches @@ -236,7 +234,6 @@ sparcNcgImpl dflags ,pprNatCmmDecl = SPARC.Ppr.pprNatCmmDecl ,maxSpillSlots = SPARC.Instr.maxSpillSlots dflags ,allocatableRegs = SPARC.Regs.allocatableRegs - ,ncg_x86fp_kludge = id ,ncgAllocMoreStack = noAllocMoreStack ,ncgExpandTop = map SPARC.CodeGen.Expand.expandTop ,ncgMakeFarBranches = const id @@ -680,19 +677,10 @@ cmmNativeGen dflags this_mod modLoc ncgImpl us fileIds dbgMap cmm count foldl' (\m (from,to) -> addImmediateSuccessor from to m ) cfgWithFixupBlks stack_updt_blks - ---- x86fp_kludge. This pass inserts ffree instructions to clear - ---- the FPU stack on x86. The x86 ABI requires that the FPU stack - ---- is clear, and library functions can return odd results if it - ---- isn't. - ---- - ---- NB. must happen before shortcutBranches, because that - ---- generates JXX_GBLs which we can't fix up in x86fp_kludge. - let kludged = {-# SCC "x86fp_kludge" #-} ncg_x86fp_kludge ncgImpl alloced - ---- generate jump tables let tabled = {-# SCC "generateJumpTables" #-} - generateJumpTables ncgImpl kludged + generateJumpTables ncgImpl alloced dumpIfSet_dyn dflags Opt_D_dump_cfg_weights "CFG Update information" @@ -787,12 +775,6 @@ checkLayout procsUnsequenced procsSequenced = getBlockIds (CmmProc _ _ _ (ListGraph blocks)) = setFromList $ map blockId blocks - -x86fp_kludge :: NatCmmDecl (Alignment, CmmStatics) X86.Instr.Instr -> NatCmmDecl (Alignment, CmmStatics) X86.Instr.Instr -x86fp_kludge top@(CmmData _ _) = top -x86fp_kludge (CmmProc info lbl live (ListGraph code)) = - CmmProc info lbl live (ListGraph $ X86.Instr.i386_insert_ffrees code) - -- | Compute unwinding tables for the blocks of a procedure computeUnwinding :: Instruction instr => DynFlags -> NcgImpl statics instr jumpDest ===================================== compiler/nativeGen/Format.hs ===================================== @@ -47,7 +47,6 @@ data Format | II64 | FF32 | FF64 - | FF80 deriving (Show, Eq) @@ -70,7 +69,7 @@ floatFormat width = case width of W32 -> FF32 W64 -> FF64 - W80 -> FF80 + other -> pprPanic "Format.floatFormat" (ppr other) @@ -80,7 +79,6 @@ isFloatFormat format = case format of FF32 -> True FF64 -> True - FF80 -> True _ -> False @@ -101,7 +99,7 @@ formatToWidth format II64 -> W64 FF32 -> W32 FF64 -> W64 - FF80 -> W80 + formatInBytes :: Format -> Int formatInBytes = widthInBytes . formatToWidth ===================================== compiler/nativeGen/NCGMonad.hs ===================================== @@ -76,7 +76,6 @@ data NcgImpl statics instr jumpDest = NcgImpl { pprNatCmmDecl :: NatCmmDecl statics instr -> SDoc, maxSpillSlots :: Int, allocatableRegs :: [RealReg], - ncg_x86fp_kludge :: [NatCmmDecl statics instr] -> [NatCmmDecl statics instr], ncgExpandTop :: [NatCmmDecl statics instr] -> [NatCmmDecl statics instr], ncgAllocMoreStack :: Int -> NatCmmDecl statics instr -> UniqSM (NatCmmDecl statics instr, [(BlockId,BlockId)]), ===================================== compiler/nativeGen/PPC/CodeGen.hs ===================================== @@ -1593,7 +1593,7 @@ genCCall' -> [CmmActual] -- arguments (of mixed type) -> NatM InstrBlock -{- +{- PowerPC Linux uses the System V Release 4 Calling Convention for PowerPC. It is described in the "System V Application Binary Interface PowerPC Processor Supplement". @@ -1906,7 +1906,7 @@ genCCall' dflags gcp target dest_regs args FF32 -> (1, 1, 4, fprs) FF64 -> (2, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" + GCP32ELF -> case cmmTypeFormat rep of II8 -> (1, 0, 4, gprs) @@ -1916,7 +1916,6 @@ genCCall' dflags gcp target dest_regs args FF32 -> (0, 1, 4, fprs) FF64 -> (0, 1, 8, fprs) II64 -> panic "genCCall' passArguments II64" - FF80 -> panic "genCCall' passArguments FF80" GCP64ELF _ -> case cmmTypeFormat rep of II8 -> (1, 0, 8, gprs) @@ -1928,7 +1927,6 @@ genCCall' dflags gcp target dest_regs args -- the FPRs. FF32 -> (1, 1, 8, fprs) FF64 -> (1, 1, 8, fprs) - FF80 -> panic "genCCall' passArguments FF80" moveResult reduceToFF32 = case dest_regs of ===================================== compiler/nativeGen/PPC/Ppr.hs ===================================== @@ -161,7 +161,7 @@ pprReg r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr_reg_no :: Int -> SDoc ppr_reg_no i @@ -179,8 +179,7 @@ pprFormat x II32 -> sLit "w" II64 -> sLit "d" FF32 -> sLit "fs" - FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprFormat: no match") + FF64 -> sLit "fd") pprCond :: Cond -> SDoc @@ -365,7 +364,6 @@ pprInstr (LD fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', @@ -405,7 +403,6 @@ pprInstr (LA fmt reg addr) = hcat [ II64 -> sLit "d" FF32 -> sLit "fs" FF64 -> sLit "fd" - _ -> panic "PPC.Ppr.pprInstr: no match" ), case addr of AddrRegImm _ _ -> empty AddrRegReg _ _ -> char 'x', ===================================== compiler/nativeGen/PPC/Regs.hs ===================================== @@ -131,7 +131,7 @@ regDotColor reg RcInteger -> text "blue" RcFloat -> text "red" RcDouble -> text "green" - RcDoubleSSE -> text "yellow" + -- immediates ------------------------------------------------------------------ ===================================== compiler/nativeGen/Reg.hs ===================================== @@ -56,7 +56,7 @@ data VirtualReg | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register | VirtualRegF {-# UNPACK #-} !Unique | VirtualRegD {-# UNPACK #-} !Unique - | VirtualRegSSE {-# UNPACK #-} !Unique + deriving (Eq, Show) -- This is laborious, but necessary. We can't derive Ord because @@ -69,15 +69,14 @@ instance Ord VirtualReg where compare (VirtualRegHi a) (VirtualRegHi b) = nonDetCmpUnique a b compare (VirtualRegF a) (VirtualRegF b) = nonDetCmpUnique a b compare (VirtualRegD a) (VirtualRegD b) = nonDetCmpUnique a b - compare (VirtualRegSSE a) (VirtualRegSSE b) = nonDetCmpUnique a b + compare VirtualRegI{} _ = LT compare _ VirtualRegI{} = GT compare VirtualRegHi{} _ = LT compare _ VirtualRegHi{} = GT compare VirtualRegF{} _ = LT compare _ VirtualRegF{} = GT - compare VirtualRegD{} _ = LT - compare _ VirtualRegD{} = GT + instance Uniquable VirtualReg where @@ -87,16 +86,18 @@ instance Uniquable VirtualReg where VirtualRegHi u -> u VirtualRegF u -> u VirtualRegD u -> u - VirtualRegSSE u -> u instance Outputable VirtualReg where ppr reg = case reg of VirtualRegI u -> text "%vI_" <> pprUniqueAlways u VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u - VirtualRegF u -> text "%vF_" <> pprUniqueAlways u - VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + -- this code is kinda wrong on x86 + -- because float and double occupy the same register set + -- namely SSE2 register xmm0 .. xmm15 + VirtualRegF u -> text "%vFloat_" <> pprUniqueAlways u + VirtualRegD u -> text "%vDouble_" <> pprUniqueAlways u + renameVirtualReg :: Unique -> VirtualReg -> VirtualReg @@ -106,7 +107,6 @@ renameVirtualReg u r VirtualRegHi _ -> VirtualRegHi u VirtualRegF _ -> VirtualRegF u VirtualRegD _ -> VirtualRegD u - VirtualRegSSE _ -> VirtualRegSSE u classOfVirtualReg :: VirtualReg -> RegClass @@ -116,7 +116,7 @@ classOfVirtualReg vr VirtualRegHi{} -> RcInteger VirtualRegF{} -> RcFloat VirtualRegD{} -> RcDouble - VirtualRegSSE{} -> RcDoubleSSE + -- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform ===================================== compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs ===================================== @@ -134,6 +134,10 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcInteger conflicts excl trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclusions | let cALLOCATABLE_REGS_FLOAT = (case platformArch platform of + -- On x86_64 and x86, Float and RcDouble + -- use the same registers, + -- so we only use RcDouble to represent the + -- register allocation problem on those types. ArchX86 -> 0 ArchX86_64 -> 0 ArchPPC -> 0 @@ -160,8 +164,14 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcFloat conflicts exclus trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclusions | let cALLOCATABLE_REGS_DOUBLE = (case platformArch platform of - ArchX86 -> 6 - ArchX86_64 -> 0 + ArchX86 -> 8 + -- in x86 32bit mode sse2 there are only + -- 8 XMM registers xmm0 ... xmm7 + ArchX86_64 -> 10 + -- in x86_64 there are 16 XMM registers + -- xmm0 .. xmm15, here 10 is a + -- "dont need to solve conflicts" count that + -- was chosen at some point in the past. ArchPPC -> 26 ArchSPARC -> 11 ArchSPARC64 -> panic "trivColorable ArchSPARC64" @@ -183,31 +193,7 @@ trivColorable platform virtualRegSqueeze realRegSqueeze RcDouble conflicts exclu = count3 < cALLOCATABLE_REGS_DOUBLE -trivColorable platform virtualRegSqueeze realRegSqueeze RcDoubleSSE conflicts exclusions - | let cALLOCATABLE_REGS_SSE - = (case platformArch platform of - ArchX86 -> 8 - ArchX86_64 -> 10 - ArchPPC -> 0 - ArchSPARC -> 0 - ArchSPARC64 -> panic "trivColorable ArchSPARC64" - ArchPPC_64 _ -> 0 - ArchARM _ _ _ -> panic "trivColorable ArchARM" - ArchARM64 -> panic "trivColorable ArchARM64" - ArchAlpha -> panic "trivColorable ArchAlpha" - ArchMipseb -> panic "trivColorable ArchMipseb" - ArchMipsel -> panic "trivColorable ArchMipsel" - ArchJavaScript-> panic "trivColorable ArchJavaScript" - ArchUnknown -> panic "trivColorable ArchUnknown") - , count2 <- accSqueeze 0 cALLOCATABLE_REGS_SSE - (virtualRegSqueeze RcDoubleSSE) - conflicts - - , count3 <- accSqueeze count2 cALLOCATABLE_REGS_SSE - (realRegSqueeze RcDoubleSSE) - exclusions - = count3 < cALLOCATABLE_REGS_SSE -- Specification Code ---------------------------------------------------------- ===================================== compiler/nativeGen/RegClass.hs ===================================== @@ -18,7 +18,6 @@ data RegClass = RcInteger | RcFloat | RcDouble - | RcDoubleSSE -- x86 only: the SSE regs are a separate class deriving Eq @@ -26,10 +25,8 @@ instance Uniquable RegClass where getUnique RcInteger = mkRegClassUnique 0 getUnique RcFloat = mkRegClassUnique 1 getUnique RcDouble = mkRegClassUnique 2 - getUnique RcDoubleSSE = mkRegClassUnique 3 instance Outputable RegClass where ppr RcInteger = Outputable.text "I" ppr RcFloat = Outputable.text "F" ppr RcDouble = Outputable.text "D" - ppr RcDoubleSSE = Outputable.text "S" ===================================== compiler/nativeGen/SPARC/Instr.hs ===================================== @@ -384,7 +384,6 @@ sparc_mkSpillInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkSpillInstr" in ST fmt reg (fpRel (negate off_w)) @@ -405,7 +404,6 @@ sparc_mkLoadInstr dflags reg _ slot RcInteger -> II32 RcFloat -> FF32 RcDouble -> FF64 - _ -> panic "sparc_mkLoadInstr" in LD fmt (fpRel (- off_w)) reg @@ -454,7 +452,6 @@ sparc_mkRegRegMoveInstr platform src dst RcInteger -> ADD False False src (RIReg g0) dst RcDouble -> FMOV FF64 src dst RcFloat -> FMOV FF32 src dst - _ -> panic "sparc_mkRegRegMoveInstr" | otherwise = panic "SPARC.Instr.mkRegRegMoveInstr: classes of src and dest not the same" ===================================== compiler/nativeGen/SPARC/Ppr.hs ===================================== @@ -143,7 +143,7 @@ pprReg reg VirtualRegHi u -> text "%vHi_" <> pprUniqueAlways u VirtualRegF u -> text "%vF_" <> pprUniqueAlways u VirtualRegD u -> text "%vD_" <> pprUniqueAlways u - VirtualRegSSE u -> text "%vSSE_" <> pprUniqueAlways u + RegReal rr -> case rr of @@ -211,8 +211,7 @@ pprFormat x II32 -> sLit "" II64 -> sLit "d" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") -- | Pretty print a format for an instruction suffix. @@ -226,8 +225,8 @@ pprStFormat x II32 -> sLit "" II64 -> sLit "x" FF32 -> sLit "" - FF64 -> sLit "d" - _ -> panic "SPARC.Ppr.pprFormat: no match") + FF64 -> sLit "d") + -- | Pretty print a condition code. ===================================== compiler/nativeGen/SPARC/Regs.hs ===================================== @@ -104,7 +104,6 @@ virtualRegSqueeze cls vr VirtualRegD{} -> 1 _other -> 0 - _other -> 0 {-# INLINE realRegSqueeze #-} realRegSqueeze :: RegClass -> RealReg -> Int @@ -135,7 +134,6 @@ realRegSqueeze cls rr RealRegPair{} -> 1 - _other -> 0 -- | All the allocatable registers in the machine, -- including register pairs. ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -98,17 +98,25 @@ is32BitPlatform = do sse2Enabled :: NatM Bool sse2Enabled = do dflags <- getDynFlags - return (isSse2Enabled dflags) + case platformArch (targetPlatform dflags) of + -- We Assume SSE1 and SSE2 operations are available on both + -- x86 and x86_64. Historically we didn't default to SSE2 and + -- SSE1 on x86, which results in defacto nondeterminism for how + -- rounding behaves in the associated x87 floating point instructions + -- because variations in the spill/fpu stack placement of arguments for + -- operations would change the precision and final result of what + -- would otherwise be the same expressions with respect to single or + -- double precision IEEE floating point computations. + ArchX86_64 -> return True + ArchX86 -> return True + _ -> panic "trying to generate x86/x86_64 on the wrong platform" + sse4_2Enabled :: NatM Bool sse4_2Enabled = do dflags <- getDynFlags return (isSse4_2Enabled dflags) -if_sse2 :: NatM a -> NatM a -> NatM a -if_sse2 sse2 x87 = do - b <- sse2Enabled - if b then sse2 else x87 cmmTopCodeGen :: RawCmmDecl @@ -284,15 +292,14 @@ swizzleRegisterRep (Any _ codefn) format = Any format codefn -- | Grab the Reg for a CmmReg -getRegisterReg :: Platform -> Bool -> CmmReg -> Reg +getRegisterReg :: Platform -> CmmReg -> Reg -getRegisterReg _ use_sse2 (CmmLocal (LocalReg u pk)) - = let fmt = cmmTypeFormat pk in - if isFloatFormat fmt && not use_sse2 - then RegVirtual (mkVirtualReg u FF80) - else RegVirtual (mkVirtualReg u fmt) +getRegisterReg _ (CmmLocal (LocalReg u pk)) + = -- by Assuming SSE2, Int,Word,Float,Double all can be register allocated + let fmt = cmmTypeFormat pk in + RegVirtual (mkVirtualReg u fmt) -getRegisterReg platform _ (CmmGlobal mid) +getRegisterReg platform (CmmGlobal mid) = case globalRegMaybe platform mid of Just reg -> RegReal $ reg Nothing -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) @@ -513,15 +520,14 @@ getRegister' dflags is32Bit (CmmReg reg) do reg' <- getPicBaseNat (archWordFormat is32Bit) return (Fixed (archWordFormat is32Bit) reg' nilOL) _ -> - do use_sse2 <- sse2Enabled + do let fmt = cmmTypeFormat (cmmRegType dflags reg) - format | not use_sse2 && isFloatFormat fmt = FF80 - | otherwise = fmt + format = fmt -- let platform = targetPlatform dflags return (Fixed format - (getRegisterReg platform use_sse2 reg) + (getRegisterReg platform reg) nilOL) @@ -557,8 +563,7 @@ getRegister' _ is32Bit (CmmMachOp (MO_SS_Conv W64 W32) [x]) return $ Fixed II32 rlo code getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = - if_sse2 float_const_sse2 float_const_x87 - where + float_const_sse2 where float_const_sse2 | f == 0.0 = do let @@ -570,21 +575,7 @@ getRegister' _ _ (CmmLit lit@(CmmFloat f w)) = | otherwise = do Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode True w addr code - - float_const_x87 = case w of - W64 - | f == 0.0 -> - let code dst = unitOL (GLDZ dst) - in return (Any FF80 code) - - | f == 1.0 -> - let code dst = unitOL (GLD1 dst) - in return (Any FF80 code) - - _otherwise -> do - Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit - loadFloatAmode False w addr code + loadFloatAmode w addr code -- catch simple cases of zero- or sign-extended load getRegister' _ _ (CmmMachOp (MO_UU_Conv W8 W32) [CmmLoad addr _]) = do @@ -641,11 +632,9 @@ getRegister' _ is32Bit (CmmMachOp (MO_Add W64) [CmmReg (CmmGlobal PicBaseReg), LEA II64 (OpAddr (ripRel (litToImm displacement))) (OpReg dst)) getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps - sse2 <- sse2Enabled case mop of - MO_F_Neg w - | sse2 -> sse2NegCode w x - | otherwise -> trivialUFCode FF80 (GNEG FF80) x + MO_F_Neg w -> sse2NegCode w x + MO_S_Neg w -> triv_ucode NEGI (intFormat w) MO_Not w -> triv_ucode NOT (intFormat w) @@ -711,9 +700,8 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps MO_XX_Conv W16 W64 | not is32Bit -> integerExtend W16 W64 MOV x MO_XX_Conv W32 W64 | not is32Bit -> integerExtend W32 W64 MOV x - MO_FF_Conv W32 W64 - | sse2 -> coerceFP2FP W64 x - | otherwise -> conversionNop FF80 x + MO_FF_Conv W32 W64 -> coerceFP2FP W64 x + MO_FF_Conv W64 W32 -> coerceFP2FP W32 x @@ -776,7 +764,6 @@ getRegister' dflags is32Bit (CmmMachOp mop [x]) = do -- unary MachOps getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps - sse2 <- sse2Enabled case mop of MO_F_Eq _ -> condFltReg is32Bit EQQ x y MO_F_Ne _ -> condFltReg is32Bit NE x y @@ -800,14 +787,14 @@ getRegister' _ is32Bit (CmmMachOp mop [x, y]) = do -- dyadic MachOps MO_U_Lt _ -> condIntReg LU x y MO_U_Le _ -> condIntReg LEU x y - MO_F_Add w | sse2 -> trivialFCode_sse2 w ADD x y - | otherwise -> trivialFCode_x87 GADD x y - MO_F_Sub w | sse2 -> trivialFCode_sse2 w SUB x y - | otherwise -> trivialFCode_x87 GSUB x y - MO_F_Quot w | sse2 -> trivialFCode_sse2 w FDIV x y - | otherwise -> trivialFCode_x87 GDIV x y - MO_F_Mul w | sse2 -> trivialFCode_sse2 w MUL x y - | otherwise -> trivialFCode_x87 GMUL x y + MO_F_Add w -> trivialFCode_sse2 w ADD x y + + MO_F_Sub w -> trivialFCode_sse2 w SUB x y + + MO_F_Quot w -> trivialFCode_sse2 w FDIV x y + + MO_F_Mul w -> trivialFCode_sse2 w MUL x y + MO_Add rep -> add_code rep x y MO_Sub rep -> sub_code rep x y @@ -1001,8 +988,7 @@ getRegister' _ _ (CmmLoad mem pk) | isFloatType pk = do Amode addr mem_code <- getAmode mem - use_sse2 <- sse2Enabled - loadFloatAmode use_sse2 (typeWidth pk) addr mem_code + loadFloatAmode (typeWidth pk) addr mem_code getRegister' _ is32Bit (CmmLoad mem pk) | is32Bit && not (isWord64 pk) @@ -1132,9 +1118,7 @@ getNonClobberedReg expr = do return (reg, code) reg2reg :: Format -> Reg -> Reg -> Instr -reg2reg format src dst - | format == FF80 = GMOV src dst - | otherwise = MOV format (OpReg src) (OpReg dst) +reg2reg format src dst = MOV format (OpReg src) (OpReg dst) -------------------------------------------------------------------------------- @@ -1243,8 +1227,7 @@ x86_complex_amode base index shift offset getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) getNonClobberedOperand (CmmLit lit) = do - use_sse2 <- sse2Enabled - if use_sse2 && isSuitableFloatingPointLit lit + if isSuitableFloatingPointLit lit then do let CmmFloat _ w = lit Amode addr code <- memConstant (mkAlignment $ widthInBytes w) lit @@ -1259,9 +1242,12 @@ getNonClobberedOperand (CmmLit lit) = do getNonClobberedOperand (CmmLoad mem pk) = do is32Bit <- is32BitPlatform - use_sse2 <- sse2Enabled - if (not (isFloatType pk) || use_sse2) - && (if is32Bit then not (isWord64 pk) else True) + -- this logic could be simplified + -- TODO FIXME + if (if is32Bit then not (isWord64 pk) else True) + -- if 32bit and pk is at float/double/simd value + -- or if 64bit + -- this could use some eyeballs or i'll need to stare at it more later then do dflags <- getDynFlags let platform = targetPlatform dflags @@ -1278,6 +1264,7 @@ getNonClobberedOperand (CmmLoad mem pk) = do return (src, nilOL) return (OpAddr src', mem_code `appOL` save_code) else do + -- if its a word or gcptr on 32bit? getNonClobberedOperand_generic (CmmLoad mem pk) getNonClobberedOperand e = getNonClobberedOperand_generic e @@ -1370,14 +1357,13 @@ memConstant align lit = do return (Amode addr code) -loadFloatAmode :: Bool -> Width -> AddrMode -> InstrBlock -> NatM Register -loadFloatAmode use_sse2 w addr addr_code = do +loadFloatAmode :: Width -> AddrMode -> InstrBlock -> NatM Register +loadFloatAmode w addr addr_code = do let format = floatFormat w code dst = addr_code `snocOL` - if use_sse2 - then MOV format (OpAddr addr) (OpReg dst) - else GLD format addr dst - return (Any (if use_sse2 then format else FF80) code) + MOV format (OpAddr addr) (OpReg dst) + + return (Any format code) -- if we want a floating-point literal as an operand, we can @@ -1538,19 +1524,9 @@ condIntCode' _ cond x y = do condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode condFltCode cond x y - = if_sse2 condFltCode_sse2 condFltCode_x87 + = condFltCode_sse2 where - condFltCode_x87 - = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do - (x_reg, x_code) <- getNonClobberedReg x - (y_reg, y_code) <- getSomeReg y - let - code = x_code `appOL` y_code `snocOL` - GCMP cond x_reg y_reg - -- The GCMP insn does the test and sets the zero flag if comparable - -- and true. Hence we always supply EQQ as the condition to test. - return (CondCode True EQQ code) -- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be -- an operand, but the right must be a reg. We can probably do better @@ -1634,35 +1610,33 @@ assignReg_IntCode pk reg (CmmLoad src _) = do load_code <- intLoadCode (MOV pk) src dflags <- getDynFlags let platform = targetPlatform dflags - return (load_code (getRegisterReg platform False{-no sse2-} reg)) + return (load_code (getRegisterReg platform reg)) -- dst is a reg, but src could be anything assignReg_IntCode _ reg src = do dflags <- getDynFlags let platform = targetPlatform dflags code <- getAnyReg src - return (code (getRegisterReg platform False{-no sse2-} reg)) + return (code (getRegisterReg platform reg)) -- Floating point assignment to memory assignMem_FltCode pk addr src = do (src_reg, src_code) <- getNonClobberedReg src Amode addr addr_code <- getAmode addr - use_sse2 <- sse2Enabled let code = src_code `appOL` addr_code `snocOL` - if use_sse2 then MOV pk (OpReg src_reg) (OpAddr addr) - else GST pk src_reg addr + MOV pk (OpReg src_reg) (OpAddr addr) + return code -- Floating point assignment to a register/temporary assignReg_FltCode _ reg src = do - use_sse2 <- sse2Enabled src_code <- getAnyReg src dflags <- getDynFlags let platform = targetPlatform dflags - return (src_code (getRegisterReg platform use_sse2 reg)) + return (src_code (getRegisterReg platform reg)) genJump :: CmmExpr{-the branch target-} -> [Reg] -> NatM InstrBlock @@ -1945,7 +1919,7 @@ genCCall _ is32bit (PrimTarget (MO_Prefetch_Data n )) _ [src] _ = genCCall dflags is32Bit (PrimTarget (MO_BSwap width)) [dst] [src] _ = do let platform = targetPlatform dflags - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) case width of W64 | is32Bit -> do ChildCode64 vcode rlo <- iselExpr64 src @@ -1972,7 +1946,7 @@ genCCall dflags is32Bit (PrimTarget (MO_PopCnt width)) dest_regs@[dst] if sse4_2 then do code_src <- getAnyReg src src_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` (if width == W8 then -- The POPCNT instruction doesn't take a r/m8 @@ -2004,7 +1978,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pdep width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PDEP instruction doesn't take a r/m8 @@ -2037,7 +2011,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Pext width)) dest_regs@[dst] code_mask <- getAnyReg mask src_r <- getNewRegNat format mask_r <- getNewRegNat format - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) return $ code_src src_r `appOL` code_mask mask_r `appOL` (if width == W8 then -- The PEXT instruction doesn't take a r/m8 @@ -2073,7 +2047,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Clz width)) dest_regs@[dst] args@[src] b | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do src_r <- getNewRegNat (intFormat width) @@ -2110,7 +2084,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | is32Bit, width == W64 = do ChildCode64 vcode rlo <- iselExpr64 src let rhi = getHiVRegFromLo rlo - dst_r = getRegisterReg platform False (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) lbl1 <- getBlockIdNat lbl2 <- getBlockIdNat let format = if width == W8 then II16 else intFormat width @@ -2150,7 +2124,7 @@ genCCall dflags is32Bit (PrimTarget (MO_Ctz width)) [dst] [src] bid | otherwise = do code_src <- getAnyReg src - let dst_r = getRegisterReg platform False (CmmLocal dst) + let dst_r = getRegisterReg platform (CmmLocal dst) if isBmi2Enabled dflags then do @@ -2201,9 +2175,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) else getSimpleAmode dflags is32Bit addr -- See genCCall for MO_Cmpxchg arg <- getNewRegNat format arg_code <- getAnyReg n - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code <- op_code dst_r arg amode return $ addr_code `appOL` arg_code arg `appOL` code where @@ -2260,8 +2233,8 @@ genCCall dflags is32Bit (PrimTarget (MO_AtomicRMW width amop)) genCCall dflags _ (PrimTarget (MO_AtomicRead width)) [dst] [addr] _ = do load_code <- intLoadCode (MOV (intFormat width)) addr let platform = targetPlatform dflags - use_sse2 <- sse2Enabled - return (load_code (getRegisterReg platform use_sse2 (CmmLocal dst))) + + return (load_code (getRegisterReg platform (CmmLocal dst))) genCCall _ _ (PrimTarget (MO_AtomicWrite width)) [] [addr, val] _ = do code <- assignMem_IntCode (intFormat width) addr val @@ -2276,9 +2249,8 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ newval_code <- getAnyReg new oldval <- getNewRegNat format oldval_code <- getAnyReg old - use_sse2 <- sse2Enabled let platform = targetPlatform dflags - dst_r = getRegisterReg platform use_sse2 (CmmLocal dst) + dst_r = getRegisterReg platform (CmmLocal dst) code = toOL [ MOV format (OpReg oldval) (OpReg eax) , LOCK (CMPXCHG format (OpReg newval) (OpAddr amode)) @@ -2292,14 +2264,12 @@ genCCall dflags is32Bit (PrimTarget (MO_Cmpxchg width)) [dst] [addr, old, new] _ genCCall _ is32Bit target dest_regs args bid = do dflags <- getDynFlags let platform = targetPlatform dflags - sse2 = isSse2Enabled dflags case (target, dest_regs) of -- void return type prim op (PrimTarget op, []) -> outOfLineCmmOp bid op Nothing args -- we only cope with a single result for foreign calls - (PrimTarget op, [r]) - | sse2 -> case op of + (PrimTarget op, [r]) -> case op of MO_F32_Fabs -> case args of [x] -> sse2FabsCode W32 x _ -> panic "genCCall: Wrong number of arguments for fabs" @@ -2310,36 +2280,16 @@ genCCall _ is32Bit target dest_regs args bid = do MO_F32_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF32 args MO_F64_Sqrt -> actuallyInlineSSE2Op (\fmt r -> SQRT fmt (OpReg r)) FF64 args _other_op -> outOfLineCmmOp bid op (Just r) args - | otherwise -> do - l1 <- getNewLabelNat - l2 <- getNewLabelNat - if sse2 - then outOfLineCmmOp bid op (Just r) args - else case op of - MO_F32_Sqrt -> actuallyInlineFloatOp GSQRT FF32 args - MO_F64_Sqrt -> actuallyInlineFloatOp GSQRT FF64 args - - MO_F32_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF32 args - MO_F64_Sin -> actuallyInlineFloatOp (\s -> GSIN s l1 l2) FF64 args - - MO_F32_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF32 args - MO_F64_Cos -> actuallyInlineFloatOp (\s -> GCOS s l1 l2) FF64 args - - MO_F32_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF32 args - MO_F64_Tan -> actuallyInlineFloatOp (\s -> GTAN s l1 l2) FF64 args - - _other_op -> outOfLineCmmOp bid op (Just r) args where - actuallyInlineFloatOp = actuallyInlineFloatOp' False - actuallyInlineSSE2Op = actuallyInlineFloatOp' True + actuallyInlineSSE2Op = actuallyInlineFloatOp' - actuallyInlineFloatOp' usesSSE instr format [x] + actuallyInlineFloatOp' instr format [x] = do res <- trivialUFCode format (instr format) x any <- anyReg res - return (any (getRegisterReg platform usesSSE (CmmLocal r))) + return (any (getRegisterReg platform (CmmLocal r))) - actuallyInlineFloatOp' _ _ _ args + actuallyInlineFloatOp' _ _ args = panic $ "genCCall.actuallyInlineFloatOp': bad number of arguments! (" ++ show (length args) ++ ")" @@ -2358,7 +2308,7 @@ genCCall _ is32Bit target dest_regs args bid = do AND fmt (OpReg tmp) (OpReg dst) ] - return $ code (getRegisterReg platform True (CmmLocal r)) + return $ code (getRegisterReg platform (CmmLocal r)) (PrimTarget (MO_S_QuotRem width), _) -> divOp1 platform True width dest_regs args (PrimTarget (MO_U_QuotRem width), _) -> divOp1 platform False width dest_regs args @@ -2370,8 +2320,8 @@ genCCall _ is32Bit target dest_regs args bid = do let format = intFormat width lCode <- anyReg =<< trivialCode width (ADD_CC format) (Just (ADD_CC format)) arg_x arg_y - let reg_l = getRegisterReg platform True (CmmLocal res_l) - reg_h = getRegisterReg platform True (CmmLocal res_h) + let reg_l = getRegisterReg platform (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) code = hCode reg_h `appOL` lCode reg_l `snocOL` ADC format (OpImm (ImmInteger 0)) (OpReg reg_h) @@ -2391,8 +2341,8 @@ genCCall _ is32Bit target dest_regs args bid = do do (y_reg, y_code) <- getRegOrMem arg_y x_code <- getAnyReg arg_x let format = intFormat width - reg_h = getRegisterReg platform True (CmmLocal res_h) - reg_l = getRegisterReg platform True (CmmLocal res_l) + reg_h = getRegisterReg platform (CmmLocal res_h) + reg_l = getRegisterReg platform (CmmLocal res_l) code = y_code `appOL` x_code rax `appOL` toOL [MUL2 format y_reg, @@ -2428,8 +2378,8 @@ genCCall _ is32Bit target dest_regs args bid = do divOp platform signed width [res_q, res_r] m_arg_x_high arg_x_low arg_y = do let format = intFormat width - reg_q = getRegisterReg platform True (CmmLocal res_q) - reg_r = getRegisterReg platform True (CmmLocal res_r) + reg_q = getRegisterReg platform (CmmLocal res_q) + reg_r = getRegisterReg platform (CmmLocal res_r) widen | signed = CLTD format | otherwise = XOR format (OpReg rdx) (OpReg rdx) instr | signed = IDIV @@ -2456,8 +2406,8 @@ genCCall _ is32Bit target dest_regs args bid = do rCode <- anyReg =<< trivialCode width (instr format) (mrevinstr format) arg_x arg_y reg_tmp <- getNewRegNat II8 - let reg_c = getRegisterReg platform True (CmmLocal res_c) - reg_r = getRegisterReg platform True (CmmLocal res_r) + let reg_c = getRegisterReg platform (CmmLocal res_c) + reg_r = getRegisterReg platform (CmmLocal res_r) code = rCode reg_r `snocOL` SETCC cond (OpReg reg_tmp) `snocOL` MOVZxL II8 (OpReg reg_tmp) (OpReg reg_c) @@ -2501,8 +2451,7 @@ genCCall32' dflags target dest_regs args = do delta0 <- getDeltaNat setDeltaNat (delta0 - arg_pad_size) - use_sse2 <- sse2Enabled - push_codes <- mapM (push_arg use_sse2) (reverse prom_args) + push_codes <- mapM push_arg (reverse prom_args) delta <- getDeltaNat MASSERT(delta == delta0 - tot_arg_size) @@ -2555,18 +2504,21 @@ genCCall32' dflags target dest_regs args = do assign_code [] = nilOL assign_code [dest] | isFloatType ty = - if use_sse2 - then let tmp_amode = AddrBaseIndex (EABaseReg esp) + -- we assume SSE2 + let tmp_amode = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt 0) - fmt = floatFormat w + fmt = floatFormat w in toOL [ SUB II32 (OpImm (ImmInt b)) (OpReg esp), DELTA (delta0 - b), - GST fmt fake0 tmp_amode, + X87Store fmt tmp_amode, + -- X87Store only supported for the CDECL ABI + -- NB: This code will need to be + -- revisted once GHC does more work around + -- SIGFPE f MOV fmt (OpAddr tmp_amode) (OpReg r_dest), ADD II32 (OpImm (ImmInt b)) (OpReg esp), DELTA delta0] - else unitOL (GMOV fake0 r_dest) | isWord64 ty = toOL [MOV II32 (OpReg eax) (OpReg r_dest), MOV II32 (OpReg edx) (OpReg r_dest_hi)] | otherwise = unitOL (MOV (intFormat w) @@ -2577,7 +2529,7 @@ genCCall32' dflags target dest_regs args = do w = typeWidth ty b = widthInBytes w r_dest_hi = getHiVRegFromLo r_dest - r_dest = getRegisterReg platform use_sse2 (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code many = pprPanic "genCCall.assign_code - too many return values:" (ppr many) return (push_code `appOL` @@ -2592,10 +2544,10 @@ genCCall32' dflags target dest_regs args = do roundTo a x | x `mod` a == 0 = x | otherwise = x + a - (x `mod` a) - push_arg :: Bool -> CmmActual {-current argument-} + push_arg :: CmmActual {-current argument-} -> NatM InstrBlock -- code - push_arg use_sse2 arg -- we don't need the hints on x86 + push_arg arg -- we don't need the hints on x86 | isWord64 arg_ty = do ChildCode64 code r_lo <- iselExpr64 arg delta <- getDeltaNat @@ -2619,9 +2571,10 @@ genCCall32' dflags target dest_regs args = do (ImmInt 0) format = floatFormat (typeWidth arg_ty) in - if use_sse2 - then MOV format (OpReg reg) (OpAddr addr) - else GST format reg addr + + -- assume SSE2 + MOV format (OpReg reg) (OpAddr addr) + ] ) @@ -2749,7 +2702,7 @@ genCCall64' dflags target dest_regs args = do _ -> unitOL (MOV (cmmTypeFormat rep) (OpReg rax) (OpReg r_dest)) where rep = localRegType dest - r_dest = getRegisterReg platform True (CmmLocal dest) + r_dest = getRegisterReg platform (CmmLocal dest) assign_code _many = panic "genCCall.assign_code many" return (adjust_rsp `appOL` @@ -3162,17 +3115,9 @@ condIntReg cond x y = do -- and plays better with the uOP cache. condFltReg :: Bool -> Cond -> CmmExpr -> CmmExpr -> NatM Register -condFltReg is32Bit cond x y = if_sse2 condFltReg_sse2 condFltReg_x87 +condFltReg is32Bit cond x y = condFltReg_sse2 where - condFltReg_x87 = do - CondCode _ cond cond_code <- condFltCode cond x y - tmp <- getNewRegNat II8 - let - code dst = cond_code `appOL` toOL [ - SETCC cond (OpReg tmp), - MOVZxL II8 (OpReg tmp) (OpReg dst) - ] - return (Any II32 code) + condFltReg_sse2 = do CondCode _ cond cond_code <- condFltCode cond x y @@ -3336,18 +3281,6 @@ trivialUCode rep instr x = do ----------- -trivialFCode_x87 :: (Format -> Reg -> Reg -> Reg -> Instr) - -> CmmExpr -> CmmExpr -> NatM Register -trivialFCode_x87 instr x y = do - (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too - (y_reg, y_code) <- getSomeReg y - let - format = FF80 -- always, on x87 - code dst = - x_code `appOL` - y_code `snocOL` - instr format x_reg y_reg dst - return (Any format code) trivialFCode_sse2 :: Width -> (Format -> Operand -> Operand -> Instr) -> CmmExpr -> CmmExpr -> NatM Register @@ -3368,17 +3301,8 @@ trivialUFCode format instr x = do -------------------------------------------------------------------------------- coerceInt2FP :: Width -> Width -> CmmExpr -> NatM Register -coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 +coerceInt2FP from to x = coerce_sse2 where - coerce_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case to of W32 -> GITOF; W64 -> GITOD; - n -> panic $ "coerceInt2FP.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any FF80 code) coerce_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand @@ -3392,18 +3316,8 @@ coerceInt2FP from to x = if_sse2 coerce_sse2 coerce_x87 -------------------------------------------------------------------------------- coerceFP2Int :: Width -> Width -> CmmExpr -> NatM Register -coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 +coerceFP2Int from to x = coerceFP2Int_sse2 where - coerceFP2Int_x87 = do - (x_reg, x_code) <- getSomeReg x - let - opc = case from of W32 -> GFTOI; W64 -> GDTOI - n -> panic $ "coerceFP2Int.x87: unhandled width (" - ++ show n ++ ")" - code dst = x_code `snocOL` opc x_reg dst - -- ToDo: works for non-II32 reps? - return (Any (intFormat to) code) - coerceFP2Int_sse2 = do (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand let @@ -3418,15 +3332,13 @@ coerceFP2Int from to x = if_sse2 coerceFP2Int_sse2 coerceFP2Int_x87 -------------------------------------------------------------------------------- coerceFP2FP :: Width -> CmmExpr -> NatM Register coerceFP2FP to x = do - use_sse2 <- sse2Enabled (x_reg, x_code) <- getSomeReg x let - opc | use_sse2 = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; + opc = case to of W32 -> CVTSD2SS; W64 -> CVTSS2SD; n -> panic $ "coerceFP2FP: unhandled width (" ++ show n ++ ")" - | otherwise = GDTOF code dst = x_code `snocOL` opc x_reg dst - return (Any (if use_sse2 then floatFormat to else FF80) code) + return (Any ( floatFormat to) code) -------------------------------------------------------------------------------- @@ -3443,7 +3355,7 @@ sse2NegCode w x = do x at II16 -> wrongFmt x x at II32 -> wrongFmt x x at II64 -> wrongFmt x - x at FF80 -> wrongFmt x + where wrongFmt x = panic $ "sse2NegCode: " ++ show x Amode amode amode_code <- memConstant (mkAlignment $ widthInBytes w) const ===================================== compiler/nativeGen/X86/Instr.hs ===================================== @@ -10,7 +10,7 @@ module X86.Instr (Instr(..), Operand(..), PrefetchVariant(..), JumpDest(..), getJumpDestBlockId, canShortcut, shortcutStatics, - shortcutJump, i386_insert_ffrees, allocMoreStack, + shortcutJump, allocMoreStack, maxSpillSlots, archWordFormat ) where @@ -240,46 +240,14 @@ data Instr | BT Format Imm Operand | NOP - -- x86 Float Arithmetic. - -- Note that we cheat by treating G{ABS,MOV,NEG} of doubles - -- as single instructions right up until we spit them out. - -- all the 3-operand fake fp insns are src1 src2 dst - -- and furthermore are constrained to be fp regs only. - -- IMPORTANT: keep is_G_insn up to date with any changes here - | GMOV Reg Reg -- src(fpreg), dst(fpreg) - | GLD Format AddrMode Reg -- src, dst(fpreg) - | GST Format Reg AddrMode -- src(fpreg), dst - | GLDZ Reg -- dst(fpreg) - | GLD1 Reg -- dst(fpreg) - - | GFTOI Reg Reg -- src(fpreg), dst(intreg) - | GDTOI Reg Reg -- src(fpreg), dst(intreg) - - | GITOF Reg Reg -- src(intreg), dst(fpreg) - | GITOD Reg Reg -- src(intreg), dst(fpreg) - - | GDTOF Reg Reg -- src(fpreg), dst(fpreg) - - | GADD Format Reg Reg Reg -- src1, src2, dst - | GDIV Format Reg Reg Reg -- src1, src2, dst - | GSUB Format Reg Reg Reg -- src1, src2, dst - | GMUL Format Reg Reg Reg -- src1, src2, dst - - -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] - -- Compare src1 with src2; set the Zero flag iff the numbers are - -- comparable and the comparison is True. Subsequent code must - -- test the %eflags zero flag regardless of the supplied Cond. - | GCMP Cond Reg Reg -- src1, src2 - - | GABS Format Reg Reg -- src, dst - | GNEG Format Reg Reg -- src, dst - | GSQRT Format Reg Reg -- src, dst - | GSIN Format CLabel CLabel Reg Reg -- src, dst - | GCOS Format CLabel CLabel Reg Reg -- src, dst - | GTAN Format CLabel CLabel Reg Reg -- src, dst - - | GFREE -- do ffree on all x86 regs; an ugly hack + -- We need to support the FSTP (x87 store and pop) instruction + -- so that we can correctly read off the return value of an + -- x86 CDECL C function call when its floating point. + -- so we dont include a register argument, and just use st(0) + -- this instruction is used ONLY for return values of C ffi calls + -- in x86_32 abi + | X87Store Format AddrMode -- st(0), dst -- SSE2 floating point: we use a restricted set of the available SSE2 @@ -427,33 +395,7 @@ x86_regUsageOfInstr platform instr CLTD _ -> mkRU [eax] [edx] NOP -> mkRU [] [] - GMOV src dst -> mkRU [src] [dst] - GLD _ src dst -> mkRU (use_EA src []) [dst] - GST _ src dst -> mkRUR (src : use_EA dst []) - - GLDZ dst -> mkRU [] [dst] - GLD1 dst -> mkRU [] [dst] - - GFTOI src dst -> mkRU [src] [dst] - GDTOI src dst -> mkRU [src] [dst] - - GITOF src dst -> mkRU [src] [dst] - GITOD src dst -> mkRU [src] [dst] - - GDTOF src dst -> mkRU [src] [dst] - - GADD _ s1 s2 dst -> mkRU [s1,s2] [dst] - GSUB _ s1 s2 dst -> mkRU [s1,s2] [dst] - GMUL _ s1 s2 dst -> mkRU [s1,s2] [dst] - GDIV _ s1 s2 dst -> mkRU [s1,s2] [dst] - - GCMP _ src1 src2 -> mkRUR [src1,src2] - GABS _ src dst -> mkRU [src] [dst] - GNEG _ src dst -> mkRU [src] [dst] - GSQRT _ src dst -> mkRU [src] [dst] - GSIN _ _ _ src dst -> mkRU [src] [dst] - GCOS _ _ _ src dst -> mkRU [src] [dst] - GTAN _ _ _ src dst -> mkRU [src] [dst] + X87Store _ dst -> mkRUR ( use_EA dst []) CVTSS2SD src dst -> mkRU [src] [dst] CVTSD2SS src dst -> mkRU [src] [dst] @@ -603,33 +545,8 @@ x86_patchRegsOfInstr instr env JMP op regs -> JMP (patchOp op) regs JMP_TBL op ids s lbl -> JMP_TBL (patchOp op) ids s lbl - GMOV src dst -> GMOV (env src) (env dst) - GLD fmt src dst -> GLD fmt (lookupAddr src) (env dst) - GST fmt src dst -> GST fmt (env src) (lookupAddr dst) - - GLDZ dst -> GLDZ (env dst) - GLD1 dst -> GLD1 (env dst) - - GFTOI src dst -> GFTOI (env src) (env dst) - GDTOI src dst -> GDTOI (env src) (env dst) - - GITOF src dst -> GITOF (env src) (env dst) - GITOD src dst -> GITOD (env src) (env dst) - - GDTOF src dst -> GDTOF (env src) (env dst) - - GADD fmt s1 s2 dst -> GADD fmt (env s1) (env s2) (env dst) - GSUB fmt s1 s2 dst -> GSUB fmt (env s1) (env s2) (env dst) - GMUL fmt s1 s2 dst -> GMUL fmt (env s1) (env s2) (env dst) - GDIV fmt s1 s2 dst -> GDIV fmt (env s1) (env s2) (env dst) - - GCMP fmt src1 src2 -> GCMP fmt (env src1) (env src2) - GABS fmt src dst -> GABS fmt (env src) (env dst) - GNEG fmt src dst -> GNEG fmt (env src) (env dst) - GSQRT fmt src dst -> GSQRT fmt (env src) (env dst) - GSIN fmt l1 l2 src dst -> GSIN fmt l1 l2 (env src) (env dst) - GCOS fmt l1 l2 src dst -> GCOS fmt l1 l2 (env src) (env dst) - GTAN fmt l1 l2 src dst -> GTAN fmt l1 l2 (env src) (env dst) + -- literally only support storing the top x87 stack value st(0) + X87Store fmt dst -> X87Store fmt (lookupAddr dst) CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) @@ -752,8 +669,7 @@ x86_mkSpillInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpReg reg) (OpAddr (spRel dflags off)) - RcDouble -> GST FF80 reg (spRel dflags off) {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) + RcDouble -> MOV FF64 (OpReg reg) (OpAddr (spRel dflags off)) _ -> panic "X86.mkSpillInstr: no match" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -772,8 +688,7 @@ x86_mkLoadInstr dflags reg delta slot case targetClassOfReg platform reg of RcInteger -> MOV (archWordFormat is32Bit) (OpAddr (spRel dflags off)) (OpReg reg) - RcDouble -> GLD FF80 (spRel dflags off) reg {- RcFloat/RcDouble -} - RcDoubleSSE -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) + RcDouble -> MOV FF64 (OpAddr (spRel dflags off)) (OpReg reg) _ -> panic "X86.x86_mkLoadInstr" where platform = targetPlatform dflags is32Bit = target32Bit platform @@ -827,6 +742,7 @@ x86_isMetaInstr instr +--- TODO: why is there -- | Make a reg-reg move instruction. -- On SPARC v8 there are no instructions to move directly between -- floating point and integer regs. If we need to do that then we @@ -844,8 +760,10 @@ x86_mkRegRegMoveInstr platform src dst ArchX86 -> MOV II32 (OpReg src) (OpReg dst) ArchX86_64 -> MOV II64 (OpReg src) (OpReg dst) _ -> panic "x86_mkRegRegMoveInstr: Bad arch" - RcDouble -> GMOV src dst - RcDoubleSSE -> MOV FF64 (OpReg src) (OpReg dst) + RcDouble -> MOV FF64 (OpReg src) (OpReg dst) + -- this code is the lie we tell ourselves because both float and double + -- use the same register class.on x86_64 and x86 32bit with SSE2, + -- more plainly, both use the XMM registers _ -> panic "X86.RegInfo.mkRegRegMoveInstr: no match" -- | Check whether an instruction represents a reg-reg move. @@ -970,58 +888,6 @@ x86_mkStackDeallocInstr platform amount ArchX86_64 -> [ADD II64 (OpImm (ImmInt amount)) (OpReg rsp)] _ -> panic "x86_mkStackDeallocInstr" -i386_insert_ffrees - :: [GenBasicBlock Instr] - -> [GenBasicBlock Instr] - -i386_insert_ffrees blocks - | any (any is_G_instr) [ instrs | BasicBlock _ instrs <- blocks ] - = map insertGFREEs blocks - | otherwise - = blocks - where - insertGFREEs (BasicBlock id insns) - = BasicBlock id (insertBeforeNonlocalTransfers GFREE insns) - -insertBeforeNonlocalTransfers :: Instr -> [Instr] -> [Instr] -insertBeforeNonlocalTransfers insert insns - = foldr p [] insns - where p insn r = case insn of - CALL _ _ -> insert : insn : r - JMP _ _ -> insert : insn : r - JXX_GBL _ _ -> panic "insertBeforeNonlocalTransfers: cannot handle JXX_GBL" - _ -> insn : r - - --- if you ever add a new FP insn to the fake x86 FP insn set, --- you must update this too -is_G_instr :: Instr -> Bool -is_G_instr instr - = case instr of - GMOV{} -> True - GLD{} -> True - GST{} -> True - GLDZ{} -> True - GLD1{} -> True - GFTOI{} -> True - GDTOI{} -> True - GITOF{} -> True - GITOD{} -> True - GDTOF{} -> True - GADD{} -> True - GDIV{} -> True - GSUB{} -> True - GMUL{} -> True - GCMP{} -> True - GABS{} -> True - GNEG{} -> True - GSQRT{} -> True - GSIN{} -> True - GCOS{} -> True - GTAN{} -> True - GFREE -> panic "is_G_instr: GFREE (!)" - _ -> False - -- -- Note [extra spill slots] ===================================== compiler/nativeGen/X86/Ppr.hs ===================================== @@ -272,7 +272,7 @@ pprReg f r RegVirtual (VirtualRegHi u) -> text "%vHi_" <> pprUniqueAlways u RegVirtual (VirtualRegF u) -> text "%vF_" <> pprUniqueAlways u RegVirtual (VirtualRegD u) -> text "%vD_" <> pprUniqueAlways u - RegVirtual (VirtualRegSSE u) -> text "%vSSE_" <> pprUniqueAlways u + where ppr32_reg_no :: Format -> Int -> SDoc ppr32_reg_no II8 = ppr32_reg_byte @@ -364,17 +364,14 @@ pprReg f r ppr_reg_float :: Int -> PtrString ppr_reg_float i = case i of - 16 -> sLit "%fake0"; 17 -> sLit "%fake1" - 18 -> sLit "%fake2"; 19 -> sLit "%fake3" - 20 -> sLit "%fake4"; 21 -> sLit "%fake5" - 24 -> sLit "%xmm0"; 25 -> sLit "%xmm1" - 26 -> sLit "%xmm2"; 27 -> sLit "%xmm3" - 28 -> sLit "%xmm4"; 29 -> sLit "%xmm5" - 30 -> sLit "%xmm6"; 31 -> sLit "%xmm7" - 32 -> sLit "%xmm8"; 33 -> sLit "%xmm9" - 34 -> sLit "%xmm10"; 35 -> sLit "%xmm11" - 36 -> sLit "%xmm12"; 37 -> sLit "%xmm13" - 38 -> sLit "%xmm14"; 39 -> sLit "%xmm15" + 16 -> sLit "%xmm0" ; 17 -> sLit "%xmm1" + 18 -> sLit "%xmm2" ; 19 -> sLit "%xmm3" + 20 -> sLit "%xmm4" ; 21 -> sLit "%xmm5" + 22 -> sLit "%xmm6" ; 23 -> sLit "%xmm7" + 24 -> sLit "%xmm8" ; 25 -> sLit "%xmm9" + 26 -> sLit "%xmm10"; 27 -> sLit "%xmm11" + 28 -> sLit "%xmm12"; 29 -> sLit "%xmm13" + 30 -> sLit "%xmm14"; 31 -> sLit "%xmm15" _ -> sLit "very naughty x86 register" pprFormat :: Format -> SDoc @@ -386,7 +383,6 @@ pprFormat x II64 -> sLit "q" FF32 -> sLit "ss" -- "scalar single-precision float" (SSE2) FF64 -> sLit "sd" -- "scalar double-precision float" (SSE2) - FF80 -> sLit "t" ) pprFormat_x87 :: Format -> SDoc @@ -394,9 +390,9 @@ pprFormat_x87 x = ptext $ case x of FF32 -> sLit "s" FF64 -> sLit "l" - FF80 -> sLit "t" _ -> panic "X86.Ppr.pprFormat_x87" + pprCond :: Cond -> SDoc pprCond c = ptext (case c of { @@ -807,224 +803,12 @@ pprInstr (FETCHPC reg) ] --- ----------------------------------------------------------------------------- --- i386 floating-point - --- Simulating a flat register set on the x86 FP stack is tricky. --- you have to free %st(7) before pushing anything on the FP reg stack --- so as to preclude the possibility of a FP stack overflow exception. -pprInstr g@(GMOV src dst) - | src == dst - = empty - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) - --- GLD fmt addr dst ==> FLDsz addr ; FSTP (dst+1) -pprInstr g@(GLD fmt addr dst) - = pprG g (hcat [gtab, text "fld", pprFormat_x87 fmt, gsp, - pprAddr addr, gsemi, gpop dst 1]) - +-- the -- GST fmt src addr ==> FLD dst ; FSTPsz addr -pprInstr g@(GST fmt src addr) - | src == fake0 && fmt /= FF80 -- fstt instruction doesn't exist - = pprG g (hcat [gtab, - text "fst", pprFormat_x87 fmt, gsp, pprAddr addr]) - | otherwise - = pprG g (hcat [gtab, gpush src 0, gsemi, +pprInstr g@(X87Store fmt addr) + = pprX87 g (hcat [gtab, text "fstp", pprFormat_x87 fmt, gsp, pprAddr addr]) -pprInstr g@(GLDZ dst) - = pprG g (hcat [gtab, text "fldz ; ", gpop dst 1]) -pprInstr g@(GLD1 dst) - = pprG g (hcat [gtab, text "fld1 ; ", gpop dst 1]) - -pprInstr (GFTOI src dst) - = pprInstr (GDTOI src dst) - -pprInstr g@(GDTOI src dst) - = pprG g (vcat [ - hcat [gtab, text "subl $8, %esp ; fnstcw 4(%esp)"], - hcat [gtab, gpush src 0], - hcat [gtab, text "movzwl 4(%esp), ", reg, - text " ; orl $0xC00, ", reg], - hcat [gtab, text "movl ", reg, text ", 0(%esp) ; fldcw 0(%esp)"], - hcat [gtab, text "fistpl 0(%esp)"], - hcat [gtab, text "fldcw 4(%esp) ; movl 0(%esp), ", reg], - hcat [gtab, text "addl $8, %esp"] - ]) - where - reg = pprReg II32 dst - -pprInstr (GITOF src dst) - = pprInstr (GITOD src dst) - -pprInstr g@(GITOD src dst) - = pprG g (hcat [gtab, text "pushl ", pprReg II32 src, - text " ; fildl (%esp) ; ", - gpop dst 1, text " ; addl $4,%esp"]) - -pprInstr g@(GDTOF src dst) - = pprG g (vcat [gtab <> gpush src 0, - gtab <> text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ;", - gtab <> gpop dst 1]) - -{- Gruesome swamp follows. If you're unfortunate enough to have ventured - this far into the jungle AND you give a Rat's Ass (tm) what's going - on, here's the deal. Generate code to do a floating point comparison - of src1 and src2, of kind cond, and set the Zero flag if true. - - The complications are to do with handling NaNs correctly. We want the - property that if either argument is NaN, then the result of the - comparison is False ... except if we're comparing for inequality, - in which case the answer is True. - - Here's how the general (non-inequality) case works. As an - example, consider generating the an equality test: - - pushl %eax -- we need to mess with this - - fcomp and pop pushed src1 - -- Result of comparison is in FPU Status Register bits - -- C3 C2 and C0 - fstsw %ax -- Move FPU Status Reg to %ax - sahf -- move C3 C2 C0 from %ax to integer flag reg - -- now the serious magic begins - setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 - sete %al -- %al = if arg1 == arg2 then 1 else 0 - andb %ah,%al -- %al &= %ah - -- so %al == 1 iff (comparable && same); else it holds 0 - decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); - else %al == 0xFF, ZeroFlag=0 - -- the zero flag is now set as we desire. - popl %eax - - The special case of inequality differs thusly: - - setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 - setne %al -- %al = if arg1 /= arg2 then 1 else 0 - orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 - decb %al -- if (incomparable || different) then (%al == 0, ZF=1) - else (%al == 0xFF, ZF=0) --} -pprInstr g@(GCMP cond src1 src2) - | case cond of { NE -> True; _ -> False } - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpe %ah"], - hcat [gtab, text "setne %al ; ", - text "orb %ah,%al ; decb %al ; popl %eax"] - ]) - | otherwise - = pprG g (vcat [ - hcat [gtab, text "pushl %eax ; ",gpush src1 0], - hcat [gtab, text "fcomp ", greg src2 1, - text "; fstsw %ax ; sahf ; setpo %ah"], - hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", - text "andb %ah,%al ; decb %al ; popl %eax"] - ]) - where - {- On the 486, the flags set by FP compare are the unsigned ones! - (This looks like a HACK to me. WDP 96/03) - -} - fix_FP_cond :: Cond -> Cond - fix_FP_cond GE = GEU - fix_FP_cond GTT = GU - fix_FP_cond LTT = LU - fix_FP_cond LE = LEU - fix_FP_cond EQQ = EQQ - fix_FP_cond NE = NE - fix_FP_cond _ = panic "X86.Ppr.fix_FP_cond: no match" - -- there should be no others - - -pprInstr g@(GABS _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) - -pprInstr g@(GNEG _ src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) - -pprInstr g@(GSQRT fmt src dst) - = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ - hcat [gtab, gcoerceto fmt, gpop dst 1]) - -pprInstr g@(GSIN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fsin" False l1 l2 src dst fmt) - -pprInstr g@(GCOS fmt l1 l2 src dst) - = pprG g (pprTrigOp "fcos" False l1 l2 src dst fmt) - -pprInstr g@(GTAN fmt l1 l2 src dst) - = pprG g (pprTrigOp "fptan" True l1 l2 src dst fmt) - --- In the translations for GADD, GMUL, GSUB and GDIV, --- the first two cases are mere optimisations. The otherwise clause --- generates correct code under all circumstances. - -pprInstr g@(GADD _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GADD-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; faddp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GADD-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; faddp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fadd ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GMUL _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GMUL-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fmulp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GMUL-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fmulp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fmul ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GSUB _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GSUB-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fsubrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GSUB-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fsubp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fsub ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr g@(GDIV _ src1 src2 dst) - | src1 == dst - = pprG g (text "\t#GDIV-xxxcase1" $$ - hcat [gtab, gpush src2 0, - text " ; fdivrp %st(0),", greg src1 1]) - | src2 == dst - = pprG g (text "\t#GDIV-xxxcase2" $$ - hcat [gtab, gpush src1 0, - text " ; fdivp %st(0),", greg src2 1]) - | otherwise - = pprG g (hcat [gtab, gpush src1 0, - text " ; fdiv ", greg src2 1, text ",%st(0)", - gsemi, gpop dst 1]) - - -pprInstr GFREE - = vcat [ text "\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)", - text "\tffree %st(4) ;ffree %st(5)" - ] -- Atomics @@ -1038,70 +822,11 @@ pprInstr (CMPXCHG format src dst) = pprFormatOpOp (sLit "cmpxchg") format src dst -pprTrigOp :: String -> Bool -> CLabel -> CLabel - -> Reg -> Reg -> Format -> SDoc -pprTrigOp op -- fsin, fcos or fptan - isTan -- we need a couple of extra steps if we're doing tan - l1 l2 -- internal labels for us to use - src dst fmt - = -- We'll be needing %eax later on - hcat [gtab, text "pushl %eax;"] $$ - -- tan is going to use an extra space on the FP stack - (if isTan then hcat [gtab, text "ffree %st(6)"] else empty) $$ - -- First put the value in %st(0) and try to apply the op to it - hcat [gpush src 0, text ("; " ++ op)] $$ - -- Now look to see if C2 was set (overflow, |value| >= 2^63) - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - -- If we were in bounds then jump to the end - hcat [gtab, text "je " <> ppr l1] $$ - -- Otherwise we need to shrink the value. Start by - -- loading pi, doubleing it (by adding it to itself), - -- and then swapping pi with the value, so the value we - -- want to apply op to is in %st(0) again - hcat [gtab, text "ffree %st(7); fldpi"] $$ - hcat [gtab, text "fadd %st(0),%st"] $$ - hcat [gtab, text "fxch %st(1)"] $$ - -- Now we have a loop in which we make the value smaller, - -- see if it's small enough, and loop if not - (ppr l2 <> char ':') $$ - hcat [gtab, text "fprem1"] $$ - -- My Debian libc uses fstsw here for the tan code, but I can't - -- see any reason why it should need to be different for tan. - hcat [gtab, text "fnstsw %ax"] $$ - hcat [gtab, text "test $0x400,%eax"] $$ - hcat [gtab, text "jne " <> ppr l2] $$ - hcat [gtab, text "fstp %st(1)"] $$ - hcat [gtab, text op] $$ - (ppr l1 <> char ':') $$ - -- Pop the 1.0 tan gave us - (if isTan then hcat [gtab, text "fstp %st(0)"] else empty) $$ - -- Restore %eax - hcat [gtab, text "popl %eax;"] $$ - -- And finally make the result the right size - hcat [gtab, gcoerceto fmt, gpop dst 1] -------------------------- +-- some left over --- coerce %st(0) to the specified size -gcoerceto :: Format -> SDoc -gcoerceto FF64 = empty -gcoerceto FF32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " -gcoerceto _ = panic "X86.Ppr.gcoerceto: no match" -gpush :: Reg -> RegNo -> SDoc -gpush reg offset - = hcat [text "fld ", greg reg offset] - -gpop :: Reg -> RegNo -> SDoc -gpop reg offset - = hcat [text "fstp ", greg reg offset] - -greg :: Reg -> RegNo -> SDoc -greg reg offset = text "%st(" <> int (gregno reg - firstfake+offset) <> char ')' - -gsemi :: SDoc -gsemi = text " ; " gtab :: SDoc gtab = char '\t' @@ -1109,45 +834,15 @@ gtab = char '\t' gsp :: SDoc gsp = char ' ' -gregno :: Reg -> RegNo -gregno (RegReal (RealRegSingle i)) = i -gregno _ = --pprPanic "gregno" (ppr other) - 999 -- bogus; only needed for debug printing - -pprG :: Instr -> SDoc -> SDoc -pprG fake actual - = (char '#' <> pprGInstr fake) $$ actual - - -pprGInstr :: Instr -> SDoc -pprGInstr (GMOV src dst) = pprFormatRegReg (sLit "gmov") FF64 src dst -pprGInstr (GLD fmt src dst) = pprFormatAddrReg (sLit "gld") fmt src dst -pprGInstr (GST fmt src dst) = pprFormatRegAddr (sLit "gst") fmt src dst - -pprGInstr (GLDZ dst) = pprFormatReg (sLit "gldz") FF64 dst -pprGInstr (GLD1 dst) = pprFormatReg (sLit "gld1") FF64 dst -pprGInstr (GFTOI src dst) = pprFormatFormatRegReg (sLit "gftoi") FF32 II32 src dst -pprGInstr (GDTOI src dst) = pprFormatFormatRegReg (sLit "gdtoi") FF64 II32 src dst -pprGInstr (GITOF src dst) = pprFormatFormatRegReg (sLit "gitof") II32 FF32 src dst -pprGInstr (GITOD src dst) = pprFormatFormatRegReg (sLit "gitod") II32 FF64 src dst -pprGInstr (GDTOF src dst) = pprFormatFormatRegReg (sLit "gdtof") FF64 FF32 src dst +pprX87 :: Instr -> SDoc -> SDoc +pprX87 fake actual + = (char '#' <> pprX87Instr fake) $$ actual -pprGInstr (GCMP co src dst) = pprCondRegReg (sLit "gcmp_") FF64 co src dst -pprGInstr (GABS fmt src dst) = pprFormatRegReg (sLit "gabs") fmt src dst -pprGInstr (GNEG fmt src dst) = pprFormatRegReg (sLit "gneg") fmt src dst -pprGInstr (GSQRT fmt src dst) = pprFormatRegReg (sLit "gsqrt") fmt src dst -pprGInstr (GSIN fmt _ _ src dst) = pprFormatRegReg (sLit "gsin") fmt src dst -pprGInstr (GCOS fmt _ _ src dst) = pprFormatRegReg (sLit "gcos") fmt src dst -pprGInstr (GTAN fmt _ _ src dst) = pprFormatRegReg (sLit "gtan") fmt src dst - -pprGInstr (GADD fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gadd") fmt src1 src2 dst -pprGInstr (GSUB fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gsub") fmt src1 src2 dst -pprGInstr (GMUL fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gmul") fmt src1 src2 dst -pprGInstr (GDIV fmt src1 src2 dst) = pprFormatRegRegReg (sLit "gdiv") fmt src1 src2 dst - -pprGInstr _ = panic "X86.Ppr.pprGInstr: no match" +pprX87Instr :: Instr -> SDoc +pprX87Instr (X87Store fmt dst) = pprFormatAddr (sLit "gst") fmt dst +pprX87Instr _ = panic "X86.Ppr.pprX87Instr: no match" pprDollImm :: Imm -> SDoc pprDollImm i = text "$" <> pprImm i @@ -1215,23 +910,6 @@ pprOpOp name format op1 op2 ] -pprFormatReg :: PtrString -> Format -> Reg -> SDoc -pprFormatReg name format reg1 - = hcat [ - pprMnemonic name format, - pprReg format reg1 - ] - - -pprFormatRegReg :: PtrString -> Format -> Reg -> Reg -> SDoc -pprFormatRegReg name format reg1 reg2 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2 - ] - pprRegReg :: PtrString -> Reg -> Reg -> SDoc pprRegReg name reg1 reg2 @@ -1266,31 +944,6 @@ pprCondOpReg name format cond op1 reg2 pprReg format reg2 ] -pprCondRegReg :: PtrString -> Format -> Cond -> Reg -> Reg -> SDoc -pprCondRegReg name format cond reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprCond cond, - space, - pprReg format reg1, - comma, - pprReg format reg2 - ] - -pprFormatFormatRegReg :: PtrString -> Format -> Format -> Reg -> Reg -> SDoc -pprFormatFormatRegReg name format1 format2 reg1 reg2 - = hcat [ - char '\t', - ptext name, - pprFormat format1, - pprFormat format2, - space, - pprReg format1 reg1, - comma, - pprReg format2 reg2 - ] - pprFormatFormatOpReg :: PtrString -> Format -> Format -> Operand -> Reg -> SDoc pprFormatFormatOpReg name format1 format2 op1 reg2 = hcat [ @@ -1300,17 +953,6 @@ pprFormatFormatOpReg name format1 format2 op1 reg2 pprReg format2 reg2 ] -pprFormatRegRegReg :: PtrString -> Format -> Reg -> Reg -> Reg -> SDoc -pprFormatRegRegReg name format reg1 reg2 reg3 - = hcat [ - pprMnemonic name format, - pprReg format reg1, - comma, - pprReg format reg2, - comma, - pprReg format reg3 - ] - pprFormatOpOpReg :: PtrString -> Format -> Operand -> Operand -> Reg -> SDoc pprFormatOpOpReg name format op1 op2 reg3 = hcat [ @@ -1322,26 +964,16 @@ pprFormatOpOpReg name format op1 op2 reg3 pprReg format reg3 ] -pprFormatAddrReg :: PtrString -> Format -> AddrMode -> Reg -> SDoc -pprFormatAddrReg name format op dst - = hcat [ - pprMnemonic name format, - pprAddr op, - comma, - pprReg format dst - ] -pprFormatRegAddr :: PtrString -> Format -> Reg -> AddrMode -> SDoc -pprFormatRegAddr name format src op +pprFormatAddr :: PtrString -> Format -> AddrMode -> SDoc +pprFormatAddr name format op = hcat [ pprMnemonic name format, - pprReg format src, comma, pprAddr op ] - pprShift :: PtrString -> Format -> Operand -> Operand -> SDoc pprShift name format src dest = hcat [ ===================================== compiler/nativeGen/X86/RegInfo.hs ===================================== @@ -25,10 +25,13 @@ import X86.Regs mkVirtualReg :: Unique -> Format -> VirtualReg mkVirtualReg u format = case format of - FF32 -> VirtualRegSSE u - FF64 -> VirtualRegSSE u - FF80 -> VirtualRegD u - _other -> VirtualRegI u + FF32 -> VirtualRegD u + -- for scalar F32, we use the same xmm as F64! + -- this is a hack that needs some improvement. + -- For now we map both to being allocated as "Double" Registers + -- on X86/X86_64 + FF64 -> VirtualRegD u + _other -> VirtualRegI u regDotColor :: Platform -> RealReg -> SDoc regDotColor platform reg @@ -37,11 +40,12 @@ regDotColor platform reg _ -> panic "Register not assigned a color" regColors :: Platform -> UniqFM [Char] -regColors platform = listToUFM (normalRegColors platform ++ fpRegColors platform) +regColors platform = listToUFM (normalRegColors platform) normalRegColors :: Platform -> [(Reg,String)] normalRegColors platform = zip (map regSingle [0..lastint platform]) colors + ++ zip (map regSingle [firstxmm..lastxmm platform]) greys where -- 16 colors - enough for amd64 gp regs colors = ["#800000","#ff0000","#808000","#ffff00","#008000" @@ -49,17 +53,6 @@ normalRegColors platform = ,"#800080","#ff00ff","#87005f","#875f00","#87af00" ,"#ff00af"] -fpRegColors :: Platform -> [(Reg,String)] -fpRegColors platform = - [ (fake0, "red") - , (fake1, "red") - , (fake2, "red") - , (fake3, "red") - , (fake4, "red") - , (fake5, "red") ] - - ++ zip (map regSingle [firstxmm..lastxmm platform]) greys - where -- 16 shades of grey, enough for the currently supported -- SSE extensions. greys = ["#0e0e0e","#1c1c1c","#2a2a2a","#383838","#464646" ===================================== compiler/nativeGen/X86/Regs.hs ===================================== @@ -29,7 +29,7 @@ module X86.Regs ( EABase(..), EAIndex(..), addrModeRegs, eax, ebx, ecx, edx, esi, edi, ebp, esp, - fake0, fake1, fake2, fake3, fake4, fake5, firstfake, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r8, r9, r10, r11, r12, r13, r14, r15, @@ -86,10 +86,6 @@ virtualRegSqueeze cls vr VirtualRegF{} -> 0 _other -> 0 - RcDoubleSSE - -> case vr of - VirtualRegSSE{} -> 1 - _other -> 0 _other -> 0 @@ -100,7 +96,7 @@ realRegSqueeze cls rr RcInteger -> case rr of RealRegSingle regNo - | regNo < firstfake -> 1 + | regNo < firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 @@ -108,15 +104,11 @@ realRegSqueeze cls rr RcDouble -> case rr of RealRegSingle regNo - | regNo >= firstfake && regNo <= lastfake -> 1 + | regNo >= firstxmm -> 1 | otherwise -> 0 RealRegPair{} -> 0 - RcDoubleSSE - -> case rr of - RealRegSingle regNo | regNo >= firstxmm -> 1 - _otherwise -> 0 _other -> 0 @@ -210,17 +202,16 @@ spRel dflags n -- use a Word32 to represent the set of free registers in the register -- allocator. -firstfake, lastfake :: RegNo -firstfake = 16 -lastfake = 21 + firstxmm :: RegNo -firstxmm = 24 +firstxmm = 16 +-- on 32bit platformOSs, only the first 8 XMM/YMM/ZMM registers are available lastxmm :: Platform -> RegNo lastxmm platform - | target32Bit platform = 31 - | otherwise = 39 + | target32Bit platform = firstxmm + 7 -- xmm0 - xmmm7 + | otherwise = firstxmm + 15 -- xmm0 -xmm15 lastint :: Platform -> RegNo lastint platform @@ -230,14 +221,13 @@ lastint platform intregnos :: Platform -> [RegNo] intregnos platform = [0 .. lastint platform] -fakeregnos :: [RegNo] -fakeregnos = [firstfake .. lastfake] + xmmregnos :: Platform -> [RegNo] xmmregnos platform = [firstxmm .. lastxmm platform] floatregnos :: Platform -> [RegNo] -floatregnos platform = fakeregnos ++ xmmregnos platform +floatregnos platform = xmmregnos platform -- argRegs is the set of regs which are read for an n-argument call to C. -- For archs which pass all args on the stack (x86), is empty. @@ -257,20 +247,19 @@ classOfRealReg :: Platform -> RealReg -> RegClass -- However, we can get away without this at the moment because the -- only allocatable integer regs are also 8-bit compatible (1, 3, 4). classOfRealReg platform reg - = case reg of + = case reg of RealRegSingle i - | i <= lastint platform -> RcInteger - | i <= lastfake -> RcDouble - | otherwise -> RcDoubleSSE - - RealRegPair{} -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" + | i <= lastint platform -> RcInteger + | i <= lastxmm platform -> RcDouble + | otherwise -> panic "X86.Reg.classOfRealReg registerSingle too high" + _ -> panic "X86.Regs.classOfRealReg: RegPairs on this arch" -- | Get the name of the register with this number. +-- NOTE: fixme, we dont track which "way" the XMM registers are used showReg :: Platform -> RegNo -> String showReg platform n - | n >= firstxmm = "%xmm" ++ show (n-firstxmm) - | n >= firstfake = "%fake" ++ show (n-firstfake) - | n >= 8 = "%r" ++ show n + | n >= firstxmm && n <= lastxmm platform = "%xmm" ++ show (n-firstxmm) + | n >= 8 && n < firstxmm = "%r" ++ show n | otherwise = regNames platform A.! n regNames :: Platform -> A.Array Int String @@ -290,17 +279,16 @@ Intel x86 architecture: - Only ebx, esi, edi and esp are available across a C call (they are callee-saves). - Registers 0-7 have 16-bit counterparts (ax, bx etc.) - Registers 0-3 have 8 bit counterparts (ah, bh etc.) -- Registers fake0..fake5 are fakes; we pretend x86 has 6 conventionally-addressable - fp registers, and 3-operand insns for them, and we translate this into - real stack-based x86 fp code after register allocation. The fp registers are all Double registers; we don't have any RcFloat class regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should never generate them. + +TODO: cleanup modelling float vs double registers and how they are the same class. -} -fake0, fake1, fake2, fake3, fake4, fake5, - eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg + +eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg eax = regSingle 0 ebx = regSingle 1 @@ -310,12 +298,7 @@ esi = regSingle 4 edi = regSingle 5 ebp = regSingle 6 esp = regSingle 7 -fake0 = regSingle 16 -fake1 = regSingle 17 -fake2 = regSingle 18 -fake3 = regSingle 19 -fake4 = regSingle 20 -fake5 = regSingle 21 + @@ -362,22 +345,22 @@ r12 = regSingle 12 r13 = regSingle 13 r14 = regSingle 14 r15 = regSingle 15 -xmm0 = regSingle 24 -xmm1 = regSingle 25 -xmm2 = regSingle 26 -xmm3 = regSingle 27 -xmm4 = regSingle 28 -xmm5 = regSingle 29 -xmm6 = regSingle 30 -xmm7 = regSingle 31 -xmm8 = regSingle 32 -xmm9 = regSingle 33 -xmm10 = regSingle 34 -xmm11 = regSingle 35 -xmm12 = regSingle 36 -xmm13 = regSingle 37 -xmm14 = regSingle 38 -xmm15 = regSingle 39 +xmm0 = regSingle 16 +xmm1 = regSingle 17 +xmm2 = regSingle 18 +xmm3 = regSingle 19 +xmm4 = regSingle 20 +xmm5 = regSingle 21 +xmm6 = regSingle 22 +xmm7 = regSingle 23 +xmm8 = regSingle 24 +xmm9 = regSingle 25 +xmm10 = regSingle 26 +xmm11 = regSingle 27 +xmm12 = regSingle 28 +xmm13 = regSingle 29 +xmm14 = regSingle 30 +xmm15 = regSingle 31 ripRel :: Displacement -> AddrMode ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm @@ -411,7 +394,7 @@ callClobberedRegs platform -- Only xmm0-5 are caller-saves registers on 64bit windows. -- ( https://docs.microsoft.com/en-us/cpp/build/register-usage ) -- For details check the Win64 ABI. - ++ map regSingle fakeregnos ++ map xmm [0 .. 5] + ++ map xmm [0 .. 5] | otherwise -- all xmm regs are caller-saves -- caller-saves registers @@ -430,11 +413,15 @@ allIntArgRegs platform = panic "X86.Regs.allIntArgRegs: not defined for this platform" | otherwise = [rdi,rsi,rdx,rcx,r8,r9] + +-- | on 64bit platforms we pass the first 8 float/double arguments +-- in the xmm registers. allFPArgRegs :: Platform -> [Reg] allFPArgRegs platform | platformOS platform == OSMinGW32 = panic "X86.Regs.allFPArgRegs: not defined for this platform" - | otherwise = map regSingle [firstxmm .. firstxmm+7] + | otherwise = map regSingle [firstxmm .. firstxmm + 7 ] + -- Machine registers which might be clobbered by instructions that -- generate results into fixed registers, or need arguments in a fixed ===================================== compiler/types/TyCon.hs ===================================== @@ -1328,12 +1328,12 @@ Roughly in order of "includes more information": number of bits. It may represent a signed or unsigned integer, a floating-point value, or an address. - data Width = W8 | W16 | W32 | W64 | W80 | W128 + data Width = W8 | W16 | W32 | W64 | W128 - Size, which is used in the native code generator, is Width + floating point information. - data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80 + data Size = II8 | II16 | II32 | II64 | FF32 | FF64 it is necessary because e.g. the instruction to move a 64-bit float on x86 (movsd) is different from the instruction to move a 64-bit ===================================== includes/CodeGen.Platform.hs ===================================== @@ -41,65 +41,59 @@ import Reg # define r15 15 # endif -# define fake0 16 -# define fake1 17 -# define fake2 18 -# define fake3 19 -# define fake4 20 -# define fake5 21 -- N.B. XMM, YMM, and ZMM are all aliased to the same hardware registers hence -- being assigned the same RegNos. -# define xmm0 24 -# define xmm1 25 -# define xmm2 26 -# define xmm3 27 -# define xmm4 28 -# define xmm5 29 -# define xmm6 30 -# define xmm7 31 -# define xmm8 32 -# define xmm9 33 -# define xmm10 34 -# define xmm11 35 -# define xmm12 36 -# define xmm13 37 -# define xmm14 38 -# define xmm15 39 +# define xmm0 16 +# define xmm1 17 +# define xmm2 18 +# define xmm3 19 +# define xmm4 20 +# define xmm5 21 +# define xmm6 22 +# define xmm7 23 +# define xmm8 24 +# define xmm9 25 +# define xmm10 26 +# define xmm11 27 +# define xmm12 28 +# define xmm13 29 +# define xmm14 30 +# define xmm15 31 -# define ymm0 24 -# define ymm1 25 -# define ymm2 26 -# define ymm3 27 -# define ymm4 28 -# define ymm5 29 -# define ymm6 30 -# define ymm7 31 -# define ymm8 32 -# define ymm9 33 -# define ymm10 34 -# define ymm11 35 -# define ymm12 36 -# define ymm13 37 -# define ymm14 38 -# define ymm15 39 +# define ymm0 16 +# define ymm1 17 +# define ymm2 18 +# define ymm3 19 +# define ymm4 20 +# define ymm5 21 +# define ymm6 22 +# define ymm7 23 +# define ymm8 24 +# define ymm9 25 +# define ymm10 26 +# define ymm11 27 +# define ymm12 28 +# define ymm13 29 +# define ymm14 30 +# define ymm15 31 -# define zmm0 24 -# define zmm1 25 -# define zmm2 26 -# define zmm3 27 -# define zmm4 28 -# define zmm5 29 -# define zmm6 30 -# define zmm7 31 -# define zmm8 32 -# define zmm9 33 -# define zmm10 34 -# define zmm11 35 -# define zmm12 36 -# define zmm13 37 -# define zmm14 38 -# define zmm15 39 +# define zmm0 16 +# define zmm1 17 +# define zmm2 18 +# define zmm3 19 +# define zmm4 20 +# define zmm5 21 +# define zmm6 22 +# define zmm7 23 +# define zmm8 24 +# define zmm9 25 +# define zmm10 26 +# define zmm11 27 +# define zmm12 28 +# define zmm13 29 +# define zmm14 30 +# define zmm15 31 -- Note: these are only needed for ARM/ARM64 because globalRegMaybe is now used in CmmSink.hs. -- Since it's only used to check 'isJust', the actual values don't matter, thus ===================================== libraries/base/tests/Numeric/all.T ===================================== @@ -7,21 +7,10 @@ test('num006', normal, compile_and_run, ['']) test('num007', normal, compile_and_run, ['']) test('num008', normal, compile_and_run, ['']) -# On i386, we need -msse2 to get reliable floating point results -if config.arch == 'i386': - opts = '-msse2' -else: - opts = '' + test('num009', [ when(fast(), skip) - , when(wordsize(32), expect_broken(15062)) - , when(platform('i386-apple-darwin'), expect_broken(2370)) - , when(platform('powerpc64le-unknown-linux'), expect_broken(13634)) - , when(opsys('mingw32'), omit_ways(['ghci'])) ], - # We get different results at 1e20 on x86/Windows, so there is - # a special output file for that. I (SDM) don't think these are - # serious, since the results for lower numbers are all fine. - # We also get another set of results for 1e02 with GHCi, so - # I'm skipping that way altogether. + # , when(wordsize(32), expect_broken(15062)) + , when(platform('powerpc64le-unknown-linux'), expect_broken(13634))], compile_and_run, [opts]) test('num010', when(platform('i386-apple-darwin'), expect_broken_for(7043, 'ghci')), ===================================== libraries/base/tests/Numeric/num009.hs ===================================== @@ -1,8 +1,6 @@ -- trac #2059 -- --- Note that this test fails miserably when compiled to use X87 floating point. --- For instance, in the case of (sin 1e20) the X86 FSIN instruction doesn't even --- get the sign right on my machine. + module Main(main) where View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/42504f4a575395a35eec5c3fd7c9ef6e2b54e68e -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/42504f4a575395a35eec5c3fd7c9ef6e2b54e68e You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 15:39:35 2019 From: gitlab at gitlab.haskell.org (Vladislav Zavialov) Date: Thu, 11 Apr 2019 11:39:35 -0400 Subject: [Git][ghc/ghc][wip/top-level-kind-signatures] 21 commits: Generate straightline code for inline array allocation Message-ID: <5caf5fb786668_62b33fa2bd46ed182911910@gitlab.haskell.org.mail> Vladislav Zavialov pushed to branch wip/top-level-kind-signatures at Glasgow Haskell Compiler / GHC Commits: 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 2cb57eae by Vladislav Zavialov at 2019-04-11T15:38:50Z WIP: Top-level kind signatures - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/hieFile/HieAst.hs - compiler/hsSyn/HsBinds.hs - compiler/hsSyn/HsDecls.hs - compiler/hsSyn/HsExtension.hs - compiler/hsSyn/HsInstances.hs - compiler/hsSyn/HsTypes.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/da847da7a40908f93ed8a449abc1d8596f1971a5...2cb57eaeee39241f9ab857a4c2bf4af9722ba93c -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/da847da7a40908f93ed8a449abc1d8596f1971a5...2cb57eaeee39241f9ab857a4c2bf4af9722ba93c You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 17:12:36 2019 From: gitlab at gitlab.haskell.org (Ara Adkins) Date: Thu, 11 Apr 2019 13:12:36 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/ara/ci-badge Message-ID: <5caf758472520_62b33fa2eb44be9829226e3@gitlab.haskell.org.mail> Ara Adkins pushed new branch wip/ara/ci-badge at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/ara/ci-badge You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 18:44:12 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 14:44:12 -0400 Subject: [Git][ghc/ghc][wip/T16546] 22 commits: Fix #16282. Message-ID: <5caf8afc53c70_62b33fa2cd4030bc2949056@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/T16546 at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - b6150886 by Ben Gamari at 2019-04-11T18:44:10Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9b94af4caf0d5c4db0232f4fb31ece76522e2d33...b6150886e213ba8180086c9760aaab8b20e47ed0 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/9b94af4caf0d5c4db0232f4fb31ece76522e2d33...b6150886e213ba8180086c9760aaab8b20e47ed0 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 18:45:09 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 14:45:09 -0400 Subject: [Git][ghc/ghc][wip/T16546] base: Better document implementation implications of Data.Timeout Message-ID: <5caf8b353a307_62b33fa2e6baf9dc2950440@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/T16546 at Glasgow Haskell Compiler / GHC Commits: 466fb846 by Ben Gamari at 2019-04-11T18:44:55Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 1 changed file: - libraries/base/System/Timeout.hs Changes: ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/466fb8462f0131a5acc0c36dbdb8689c89334d9b -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/466fb8462f0131a5acc0c36dbdb8689c89334d9b You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 22:15:59 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 18:15:59 -0400 Subject: [Git][ghc/ghc][wip/lint-check-version-number] 10 commits: users-guide: Document how to disable package environments Message-ID: <5cafbc9fb9b09_62b33fa2e2f7735c296625e@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/lint-check-version-number at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 1e545b0a by Ben Gamari at 2019-04-11T22:15:57Z gitlab-ci: Ensure that version number has three components - - - - - 30 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - compiler/utils/Util.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/e089db10d2aefb3c5be448370a8c87289ad89edc...1e545b0ad69da8c4f4f1a8a6b80bfea6f8408753 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/e089db10d2aefb3c5be448370a8c87289ad89edc...1e545b0ad69da8c4f4f1a8a6b80bfea6f8408753 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 22:16:05 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 18:16:05 -0400 Subject: [Git][ghc/ghc][wip/fix-marge] 10 commits: users-guide: Document how to disable package environments Message-ID: <5cafbca5153a7_62b33fa2cd13e3042967044@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/fix-marge at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 43274e79 by Ben Gamari at 2019-04-11T22:16:03Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 30 changed files: - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - compiler/utils/Util.hs - docs/users_guide/8.10.1-notes.rst The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/91cf8fc1a701831cbd2aa6018b5220f02ecad118...43274e796fd909adfca42e32e18f62c8e7854af9 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/91cf8fc1a701831cbd2aa6018b5220f02ecad118...43274e796fd909adfca42e32e18f62c8e7854af9 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 22:16:50 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 18:16:50 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 39 commits: Generate straightline code for inline array allocation Message-ID: <5cafbcd245840_62b33fa2ef3a4cd429676b9@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 71283c8d by Ben Gamari at 2019-04-11T22:16:46Z gitlab-ci: Test using slowtest in deb9-debug job - - - - - e5562623 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - 2d09c033 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Make closureSize less sensitive to optimisation - - - - - 59a8f3ca by Ben Gamari at 2019-04-11T22:16:46Z process: Bump submodule * Skip process005 in ghci way * Mark process002 as fragile in threaded2 - - - - - a19d89c8 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark T13167 as fragile in threaded2 As noted in #16536. - - - - - e366bbbb by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - 8f7222db by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark T14272 as broken in optasm - - - - - 6ddaf9f5 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - 28e0001d by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark T16180 as broken in ghci and ext-interp ways As noted in #16541. - - - - - 3b89a1a2 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - f4a8c108 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - eeef47d2 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - f836e11d by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - a99b2a3c by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Skip T7919 in ghci way It times out pretty reliably. It's not clear that much is gained by running this test in the ghci way anyways. - - - - - 978d9378 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Fix fragile_for test modifier - - - - - 837e1c3b by Ben Gamari at 2019-04-11T22:16:46Z users-guide: Add pretty to package list - - - - - 1bf4d24d by Ben Gamari at 2019-04-11T22:16:46Z Bump unix submodule Marks posix002 as fragile in threaded2 way due to #16550. - - - - - 2698084f by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Fix omit_ways usage omit_ways expects a list but this was broken in several cases. - - - - - f1024129 by Ben Gamari at 2019-04-11T22:16:46Z testsuite: Mark threadstatus-T9333 as fragile in ghci way As noted in #16555. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/a1927cb0269d1231bd8086d17f812cc38d8330e5...f1024129e7087bf29e03af8e6f686b921ecc7cd0 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/a1927cb0269d1231bd8086d17f812cc38d8330e5...f1024129e7087bf29e03af8e6f686b921ecc7cd0 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 22:23:03 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 18:23:03 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/centos7 Message-ID: <5cafbe47d69b4_62b33fa2ec586d5c296913@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/centos7 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/centos7 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 22:23:43 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 18:23:43 -0400 Subject: [Git][ghc/ghc] Pushed new tag wip/test-centos7 Message-ID: <5cafbe6f9312f_62b33fa2cd13e3042969385@gitlab.haskell.org.mail> Ben Gamari pushed new tag wip/test-centos7 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/test-centos7 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 22:24:58 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 18:24:58 -0400 Subject: [Git][ghc/ghc][wip/dmd-arity] 25 commits: testsuite: Add testcase for #16111 Message-ID: <5cafbeba7db7e_62b38ab2964297353a@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/dmd-arity at Glasgow Haskell Compiler / GHC Commits: 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - 6319494a by Sebastian Graf at 2019-04-11T22:24:56Z Compute demand signatures assuming idArity This does four things: 1. Look at `idArity` instead of manifest lambdas to decide whether to use LetUp 2. Compute the strictness signature in LetDown assuming at least `idArity` incoming arguments 3. Remove the special case for trivial RHSs, which is subsumed by 2 4. Don't perform the W/W split when doing so would eta expand a binding. Otherwise we would eta expand PAPs, causing unnecessary churn in the Simplifier. NoFib Results -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- fannkuch-redux +0.3% 0.0% gg -0.0% -0.1% maillist +0.2% +0.2% minimax 0.0% +0.8% pretty 0.0% -0.1% reptile -0.0% -1.2% -------------------------------------------------------------------------------- Min -0.0% -1.2% Max +0.3% +0.8% Geometric Mean +0.0% -0.0% - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/basicTypes/Demand.hs - compiler/basicTypes/Id.hs - compiler/basicTypes/IdInfo.hs - compiler/basicTypes/Var.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/coreSyn/CoreArity.hs - compiler/coreSyn/CoreLint.hs - compiler/coreSyn/CoreUnfold.hs - compiler/deSugar/Check.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/e00de39929e12cbbc0bb2875d8f454359c967410...6319494a84f7650eeeb1e683bf9ae0e232f77792 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/e00de39929e12cbbc0bb2875d8f454359c967410...6319494a84f7650eeeb1e683bf9ae0e232f77792 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 23:13:28 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Thu, 11 Apr 2019 19:13:28 -0400 Subject: [Git][ghc/ghc][ghc-8.8] Use funPrec, not topPrec, to parenthesize GADT argument types Message-ID: <5cafca18d63b3_62b33fa276f064e829894ca@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.8 at Glasgow Haskell Compiler / GHC Commits: 20f8f2f9 by Ryan Scott at 2019-04-09T20:21:35Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 5 changed files: - compiler/iface/IfaceSyn.hs - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -295,3 +295,4 @@ test('T15941', normal, ghci_script, ['T15941.script']) test('T16030', normal, ghci_script, ['T16030.script']) test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/20f8f2f98bd686b297e7290b2348ac7e937c1a5a -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/20f8f2f98bd686b297e7290b2348ac7e937c1a5a You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 23:41:22 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Thu, 11 Apr 2019 19:41:22 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 6 commits: GHC no longer ever defines TABLES_NEXT_TO_CODE on its own Message-ID: <5cafd0a2249ca_62b38ab29642995849@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - bc191f74 by Sylvain Henry at 2019-04-11T23:41:13Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - b8a5a655 by Ömer Sinan Ağacan at 2019-04-11T23:41:18Z Remove unused remilestoning script - - - - - a74a282a by Ömer Sinan Ağacan at 2019-04-11T23:41:18Z Update a panic message Point users to the right URL - - - - - 30 changed files: - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/ghci/ByteCodeLink.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs - − distrib/remilestoning.pl - ghc/ghc-bin.cabal.in - hadrian/src/Rules/BinaryDist.hs - includes/CodeGen.Platform.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/0f53232902b383977541ad6ae0622934329219a7...a74a282a440c385c63395f32034c7b110422ce9e -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/0f53232902b383977541ad6ae0622934329219a7...a74a282a440c385c63395f32034c7b110422ce9e You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Thu Apr 11 23:57:35 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Thu, 11 Apr 2019 19:57:35 -0400 Subject: [Git][ghc/ghc][master] Hadrian: fix binary-dir with --docs=none Message-ID: <5cafd46f28b2e_62b33fa2ef7aa1bc30046ec@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 1 changed file: - hadrian/src/Rules/BinaryDist.hs Changes: ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -121,7 +121,13 @@ bindistRules = do copyDirectory (ghcBuildDir -/- "lib") bindistFilesDir copyDirectory (rtsIncludeDir) bindistFilesDir need ["docs"] - copyDirectory (root -/- "docs") bindistFilesDir + -- TODO: we should only embed the docs that have been generated + -- depending on the current settings (flavours' "ghcDocs" field and + -- "--docs=.." command-line flag) + -- Currently we embed the "docs" directory if it exists but it may + -- contain outdated or even invalid data. + whenM (doesDirectoryExist (root -/- "docs")) $ do + copyDirectory (root -/- "docs") bindistFilesDir when windows $ do copyDirectory (root -/- "mingw") bindistFilesDir -- we use that opportunity to delete the .stamp file that we use View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/c401f8a487ce6762476b113ad9f0d32960a3e152 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/c401f8a487ce6762476b113ad9f0d32960a3e152 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 00:03:45 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Thu, 11 Apr 2019 20:03:45 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Remove unused remilestoning script Message-ID: <5cafd5e19b890_62b33fa2ef7aa1bc30083c2@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - 8 changed files: - compiler/ghci/ByteCodeLink.hs - − distrib/remilestoning.pl - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 Changes: ===================================== compiler/ghci/ByteCodeLink.hs ===================================== @@ -154,8 +154,8 @@ linkFail who what , "the missing library using the -L/path/to/object/dir and -lmissinglibname" , "flags, or simply by naming the relevant files on the GHCi command line." , "Alternatively, this link failure might indicate a bug in GHCi." - , "If you suspect the latter, please send a bug report to:" - , " glasgow-haskell-bugs at haskell.org" + , "If you suspect the latter, please report this as a GHC bug:" + , " https://www.haskell.org/ghc/reportabug" ]) ===================================== distrib/remilestoning.pl deleted ===================================== @@ -1,119 +0,0 @@ -#!/usr/bin/env perl - -use warnings; -use strict; - -use DBI; - -# ===== Config: - -my $dbfile = "trac.db"; -my $milestone = "7.4.1"; -my $test = 0; - -# ===== Code: - -my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {}); - -my %emailof; -my %ticketsfor; - -sub getUserAddress { - my $sth = $dbh->prepare("SELECT sid, value FROM session_attribute WHERE name = 'email'"); - $sth->execute(); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $username = $result->{sid}; - my $email = $result->{value}; - if (defined($emailof{$username})) { - die "Two e-mail addresses found for $username"; - } - if ($email =~ /@/) { - $emailof{$username} = $email; - } - else { - # warn "The e-mail address $email for $username contains no @"; - } - } - $sth->finish; -} - -sub doTickets { - my $sth = $dbh->prepare("SELECT id, summary, reporter, cc FROM ticket WHERE milestone = ? AND status = 'new'"); - $sth->execute($milestone); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $ticket = $result->{id}; - my $title = $result->{summary}; - my $reporter = $result->{reporter}; - my $cc = $result->{cc}; - my %addresses; - my $address_added; - for my $who ($reporter, split /[ ,]+/, $cc) { - $address_added = 0; - if ($who =~ /@/) { - $addresses{$who} = 1; - $address_added = 1; - } - if (defined($emailof{$who})) { - $addresses{$emailof{$who}} = 1; - $address_added = 1; - } - if ($who ne "nobody" && $address_added eq 0) { - # warn "No address found for $who"; - } - } - for my $address (keys(%addresses)) { - $ticketsfor{$address}{$ticket}{"title"} = $title; - } - } - $sth->finish; -} - -sub doEmails { - for my $email (sort (keys %ticketsfor)) { - if ($test ne 0) { - open FH, ">&STDOUT"; - } - else { - open(FH, '|-', 'mail', '-s', 'GHC bugs', '-a', 'From: glasgow-haskell-bugs at haskell.org', $email) or die "Running mail failed: $!"; - } - print FH <<'EOF'; - -Hello, - -You are receiving this mail because you are the reporter, or on the CC -list, for one or more GHC tickets that are automatically having their -priority reduced due to our post-release ticket handling policy: - https://gitlab.haskell.org/ghc/ghc/wikis/working-conventions/bug-tracker#re-milestoning-tickets-after-a-release - -The list of tickets for which you are the reporter or on the CC list is -given below. If any of these are causing problems for you, please let us -know on glasgow-haskell-bugs at haskell.org and we'll look at raising the -priority. - -Better still, if you are able to make any progress on any of the tickets -yourself (whether that be actually fixing the bug, or just making it -easier for someone else to - for example, by making a small, -self-contained test-case), then that would be a great help. We at GHC HQ -have limited resources, so if anything is waiting for us to make -progress then it can be waiting a long time! -EOF - for my $ticket (sort {$a <=> $b} (keys %{$ticketsfor{$email}})) { - my $title = $ticketsfor{$email}{$ticket}{"title"}; - print FH "\n"; - print FH "#$ticket $title:\n"; - print FH " https://gitlab.haskell.org/ghc/ghc/issues/$ticket\n"; - } - print FH <<'EOF'; - --- -The GHC Team -http://www.haskell.org/ghc/ -EOF - close FH or die "Close failed: $!"; - } -} - -&getUserAddress(); -&doTickets(); -&doEmails(); - ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/c401f8a487ce6762476b113ad9f0d32960a3e152...fa0ccbb8731e3a44dba130e835ce2a5da994c66c -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/c401f8a487ce6762476b113ad9f0d32960a3e152...fa0ccbb8731e3a44dba130e835ce2a5da994c66c You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 15:06:34 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 11:06:34 -0400 Subject: [Git][ghc/ghc][wip/centos7] 25 commits: Fix #16282. Message-ID: <5cb0a97a2c881_62b33fa2e6a9f3e430773e1@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/centos7 at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - ed529a65 by Ben Gamari at 2019-04-12T15:06:32Z gitlab-ci: Add centos7 release job - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/ghci/ByteCodeLink.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f26b843e3dd0e30cc1349dfab7a9a4ca2c0f6b97...ed529a654dd04959975bf1df684f06b822321211 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f26b843e3dd0e30cc1349dfab7a9a4ca2c0f6b97...ed529a654dd04959975bf1df684f06b822321211 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 15:11:08 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 11:11:08 -0400 Subject: [Git][ghc/ghc][wip/centos7] 2 commits: gitlab-ci: Add centos7 release job Message-ID: <5cb0aa8ca887c_62b33fa2bf0f79d03079818@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/centos7 at Glasgow Haskell Compiler / GHC Commits: d5a7f8e9 by Ben Gamari at 2019-04-12T15:08:20Z gitlab-ci: Add centos7 release job - - - - - 0ad0c936 by Ben Gamari at 2019-04-12T15:10:48Z gitlab-ci: Only run release notes lint on release tags - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -116,7 +116,8 @@ lint-changelogs: lint-release-changelogs: extends: .lint-changelogs only: - - tags + refs: + - /ghc-[0-9]+\.[0-9]+\.[0-9]+-.*/ ############################################################ @@ -448,6 +449,24 @@ release-x86_64-linux-deb8: when: always expire_in: 2 week +################################# +# x86_64-linux-centos7 +################################# + +release-x86_64-linux-centos7: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-centos7:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-centos7" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-centos7-linux.tar.xz" + only: + - tags + cache: + key: linux-x86_64-centos7 + artifacts: + when: always + expire_in: 2 week ################################# # x86_64-linux-fedora27 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/ed529a654dd04959975bf1df684f06b822321211...0ad0c936a91d0117e03bb89e2f62ecc9bbe5a244 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/ed529a654dd04959975bf1df684f06b822321211...0ad0c936a91d0117e03bb89e2f62ecc9bbe5a244 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 15:15:37 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 11:15:37 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] 9 commits: Use funPrec, not topPrec, to parenthesize GADT argument types Message-ID: <5cb0ab996771b_62b33fa2ec39a4bc308262b@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: 20f8f2f9 by Ryan Scott at 2019-04-09T20:21:35Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 996a8732 by klebinger.andreas at gmx.at at 2019-04-12T15:15:28Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. (cherry picked from commit 9b131500371a07626e33edc56700c12322364560) - - - - - d3b1a318 by Ben Gamari at 2019-04-12T15:15:28Z configure: Always use AC_LINK_ELSEIF when testing against assembler This fixes #16440, where the build system incorrectly concluded that the `.subsections_via_symbols` assembler directive was supported on a Linux system. This was caused by the fact that gcc was invoked with `-flto`; when so-configured gcc does not call the assembler but rather simply serialises its AST for compilation during the final link. This is described in Note [autoconf assembler checks and -flto]. (cherry picked from commit 7b090b53fea065d2cfd967ea919426af9ba8d737) - - - - - 05db4473 by Matthew Pickering at 2019-04-12T15:15:28Z Add werror function to Flavour.hs This function makes it easy to turn on `-Werror` in the correct manner to mimic how CI turns on -Werror. (cherry picked from commit 8dcd00cef7782c64b5484b106f4fd77c8c87e40a) - - - - - c3e5b5e8 by Alp Mestanogullari at 2019-04-12T15:15:28Z Hadrian: introduce ways to skip some documentation targets The initial motivation for this is to have a chance to run the binary distribution rules in our Windows CI without having to install sphinx-build and xelatex there, while retaining the ability to generate haddocks. I just ended up extending this idea a little bit so as to have control over whether we build haddocks, (sphinx) HTML manuals, (sphinx) PDF manuals and (sphinx) manpages. (cherry picked from commit 8442103aa575dc1cd25cb3231e729c6365dc1b5c) - - - - - 785f3583 by Ben Gamari at 2019-04-12T15:15:28Z gitlab-ci: Backport from master - - - - - 7b9c7c65 by Ben Gamari at 2019-04-12T15:15:28Z users-guide: Add pretty to package list - - - - - bdf4c632 by Ben Gamari at 2019-04-12T15:15:28Z gitlab-ci: Allow failing build jobs to fail - - - - - 7ad5ea62 by Ben Gamari at 2019-04-12T15:15:28Z Allow unregisterised build to fail - - - - - 16 changed files: - .gitlab-ci.yml - aclocal.m4 - compiler/iface/IfaceSyn.hs - docs/users_guide/8.8.1-notes.rst - hadrian/doc/make.md - hadrian/doc/user-settings.md - hadrian/src/CommandLine.hs - hadrian/src/Flavour.hs - hadrian/src/Rules/Documentation.hs - hadrian/src/Settings/Default.hs - hadrian/src/UserSettings.hs - rts/StgCRun.c - + testsuite/tests/ghci/scripts/T16527.hs - + testsuite/tests/ghci/scripts/T16527.script - + testsuite/tests/ghci/scripts/T16527.stdout - testsuite/tests/ghci/scripts/all.T Changes: ===================================== .gitlab-ci.yml ===================================== @@ -1,17 +1,31 @@ variables: GIT_SSL_NO_VERIFY: "1" + # Commit of ghc/ci-images repository from which to pull Docker images + DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + before_script: - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" stages: - - lint - - build - - full-build - - cleanup # See Note [Cleanup on Windows] + - lint # Source linting + - build # A quick smoke-test to weed out broken commits + - full-build # Build all the things + - cleanup # See Note [Cleanup on Windows] + - packaging # Source distribution, etc. + - hackage # head.hackage testing + - deploy # push documentation + +.only-default: &only-default + only: + - master + - /ghc-[0-9]+\.[0-9]+/ + - merge_requests + - tags ############################################################ # Runner Tags @@ -30,44 +44,106 @@ stages: ############################################################ ghc-linters: + allow_failure: true stage: lint - image: ghcci/linters:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" script: - - | - if [ -n "$CI_MERGE_REQUEST_ID" ]; then - base="$(git merge-base $CI_MERGE_REQUEST_BRANCH_NAME HEAD)" - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA - fi + - git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME + - base="$(git merge-base FETCH_HEAD $CI_COMMIT_SHA)" + - "echo Merge base $base" + # - validate-commit-msg .git $(git rev-list $base..$CI_COMMIT_SHA) + - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) + - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + dependencies: [] + tags: + - lint + only: + refs: + - merge_requests + +# We allow the submodule checker to fail when run on merge requests (to +# accomodate, e.g., haddock changes not yet upstream) but not on `master` or +# Marge jobs. +.lint-submods: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" + script: + - submodchecker .git $(git rev-list $base..$CI_COMMIT_SHA) + dependencies: [] + tags: + - lint + +lint-submods: + extends: .lint-submods + only: + refs: + - master + - /ghc-[0-9]+\.[0-9]+/ + - wip/marge_bot_batch_merge_job + +lint-submods-mr: + extends: .lint-submods + allow_failure: true + only: + refs: + - merge_requests + +.lint-changelogs: + stage: lint + image: "registry.gitlab.haskell.org/ghc/ci-images/linters:$DOCKER_REV" dependencies: [] tags: - lint + script: + - | + grep TBA libraries/*/changelog.md && ( + echo "Error: Found \"TBA\"s in changelogs." + exit 1 + ) + +lint-changelogs: + extends: .lint-changelogs + allow_failure: true + only: + refs: + - /ghc-[0-9]+\.[0-9]+/ + +lint-release-changelogs: + extends: .lint-changelogs + only: + - tags + ############################################################ # Validation via Pipelines (hadrian) ############################################################ .validate-hadrian: + <<: *only-default allow_failure: true script: + - cabal update - git clean -xdf && git submodule foreach git clean -xdf - bash .circleci/prepare-system.sh - if [[ -d ./cabal-cache ]]; then cp -R ./.cabal-cache ~/.cabal-cache; fi - ./boot - ./configure $CONFIGURE_ARGS - - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` + - hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --docs=no-sphinx binary-dist + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz cache: key: hadrian paths: - cabal-cache + artifacts: + when: always + paths: + - ghc.tar.xz validate-x86_64-linux-deb8-hadrian: extends: .validate-hadrian stage: build - image: ghcci/x86_64-linux-deb8:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" before_script: # workaround for docker permissions - sudo chown ghc:ghc -R . @@ -75,14 +151,17 @@ validate-x86_64-linux-deb8-hadrian: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" tags: - x86_64-linux + ############################################################ # Validation via Pipelines (make) ############################################################ .validate: + <<: *only-default variables: TEST_TYPE: test before_script: @@ -92,22 +171,25 @@ validate-x86_64-linux-deb8-hadrian: - ./configure $CONFIGURE_ARGS - | THREADS=`mk/detect-cpu-count.sh` - make V=0 -j$THREADS + make V=0 -j$THREADS WERROR=-Werror - | make binary-dist TAR_COMP_OPTS="-1" - mv ghc-*.tar.xz ghc.tar.xz - | THREADS=`mk/detect-cpu-count.sh` - make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml + make $TEST_TYPE THREADS=$THREADS JUNIT_FILE=../../junit.xml METRICS_FILE=$METRICS_FILE dependencies: [] artifacts: reports: junit: junit.xml expire_in: 2 week paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +################################# +# x86_64-darwin +################################# + validate-x86_64-darwin: extends: .validate stage: full-build @@ -115,17 +197,20 @@ validate-x86_64-darwin: - x86_64-darwin variables: GHC_VERSION: 8.6.3 + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-apple-darwin.tar.xz" MACOSX_DEPLOYMENT_TARGET: "10.7" # Only Sierra and onwards supports clock_gettime. See #12858 ac_cv_func_clock_gettime: "no" LANG: "en_US.UTF-8" CONFIGURE_ARGS: --with-intree-gmp + TEST_ENV: "x86_64-darwin" before_script: - git clean -xdf && git submodule foreach git clean -xdf - python3 .gitlab/fix-submodules.py - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/darwin-init.sh - PATH="`pwd`/toolchain/bin:$PATH" @@ -150,6 +235,12 @@ validate-x86_64-darwin: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" + # Build hyperlinked sources for documentation when building releases + - | + if [[ -n "$CI_COMMIT_TAG" ]]; then + echo "EXTRA_HADDOCK_OPTS += --hyperlinked-source --quickjump" >> mk/build.mk + fi - bash .circleci/prepare-system.sh # workaround for docker permissions @@ -162,22 +253,31 @@ validate-x86_64-darwin: - cabal-cache - toolchain -validate-aarch64-linux-deb9: +################################# +# aarch64-linux-deb9 +################################# + +.build-aarch64-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/aarch64-linux-deb9:0.1 + image: "registry.gitlab.haskell.org/ghc/ci-images/aarch64-linux-deb9:$DOCKER_REV" allow_failure: true - artifacts: - when: always - expire_in: 2 week + variables: + TEST_ENV: "aarch64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-aarch64-linux-deb9.tar.xz" cache: key: linux-aarch64-deb9 tags: - aarch64-linux +validate-aarch64-linux-deb9: + extends: .build-aarch64-linux-deb9 + artifacts: + when: always + expire_in: 2 week + nightly-aarch64-linux-deb9: - extends: validate-aarch64-linux-deb9 - stage: full-build + extends: .build-aarch64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -186,22 +286,28 @@ nightly-aarch64-linux-deb9: variables: - $NIGHTLY -validate-i386-linux-deb9: +################################# +# i386-linux-deb9 +################################# + +.build-i386-linux-deb9: extends: .validate-linux stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + image: "registry.gitlab.haskell.org/ghc/ci-images/i386-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "i386-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-i386-deb9-linux.tar.xz" + cache: + key: linux-i386-deb9 + +validate-i386-linux-deb9: + extends: .build-i386-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-i386-deb9 nightly-i386-linux-deb9: - extends: .validate-linux - stage: full-build - image: ghcci/i386-linux-deb9:0.1 - allow_failure: true + extends: .build-i386-linux-deb9 variables: TEST_TYPE: slowtest artifacts: @@ -210,22 +316,29 @@ nightly-i386-linux-deb9: only: variables: - $NIGHTLY + +################################# +# x86_64-linux-deb9 +################################# + +.build-x86_64-linux-deb9: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb9:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb9" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux.tar.xz" cache: - key: linux-i386-deb9 + key: linux-x86_64-deb9 validate-x86_64-linux-deb9: - extends: .validate-linux - stage: build - image: ghcci/x86_64-linux-deb9:0.2 + extends: .build-x86_64-linux-deb9 artifacts: when: always expire_in: 2 week - cache: - key: linux-x86_64-deb9 nightly-x86_64-linux-deb9: - extends: validate-x86_64-linux-deb9 - stage: build + extends: .build-x86_64-linux-deb9 artifacts: expire_in: 2 year variables: @@ -234,70 +347,96 @@ nightly-x86_64-linux-deb9: variables: - $NIGHTLY +# N.B. Has DEBUG assertions enabled in stage2 +validate-x86_64-linux-deb9-debug: + extends: .build-x86_64-linux-deb9 + stage: build + variables: + BUILD_FLAVOUR: validate + TEST_ENV: "x86_64-linux-deb9-debug" + validate-x86_64-linux-deb9-llvm: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build allow_failure: true - image: ghcci/x86_64-linux-deb9:0.2 variables: BUILD_FLAVOUR: perf-llvm - cache: - key: linux-x86_64-deb9 - -validate-x86_64-linux-deb8: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-deb8:0.1 - cache: - key: linux-x86_64-deb8 - artifacts: - when: always - expire_in: 2 week - -validate-x86_64-linux-fedora27: - extends: .validate-linux - stage: full-build - image: ghcci/x86_64-linux-fedora27:0.1 - cache: - key: linux-x86_64-fedora27 - artifacts: - when: always - expire_in: 2 week + TEST_ENV: "x86_64-linux-deb9-llvm" validate-x86_64-linux-deb9-integer-simple: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build + allow_failure: true variables: INTEGER_LIBRARY: integer-simple - image: ghcci/x86_64-linux-deb9:0.2 - cache: - key: linux-x86_64-deb9 + TEST_ENV: "x86_64-linux-deb9-integer-simple" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb9-linux-integer-simple.tar.xz" nightly-x86_64-linux-deb9-integer-simple: - extends: validate-x86_64-linux-deb9-integer-simple + extends: .build-x86_64-linux-deb9 stage: full-build - artifacts: - expire_in: 2 year variables: + INTEGER_LIBRARY: integer-simple + TEST_ENV: "x86_64-linux-deb9-integer-simple" TEST_TYPE: slowtest + artifacts: + expire_in: 2 year only: variables: - $NIGHTLY validate-x86_64-linux-deb9-unreg: - extends: .validate-linux + extends: .build-x86_64-linux-deb9 stage: full-build + allow_failure: true variables: CONFIGURE_ARGS: --enable-unregisterised - image: ghcci/x86_64-linux-deb9:0.2 + TEST_ENV: "x86_64-linux-deb9-unreg" + + +################################# +# x86_64-linux-deb8 +################################# + +release-x86_64-linux-deb8: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb8:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-deb8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-deb8-linux.tar.xz" + only: + - tags cache: - key: linux-x86_64-deb9 + key: linux-x86_64-deb8 + artifacts: + when: always + expire_in: 2 week + + +################################# +# x86_64-linux-fedora27 +################################# + +validate-x86_64-linux-fedora27: + extends: .validate-linux + stage: full-build + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-fedora27:$DOCKER_REV" + variables: + TEST_ENV: "x86_64-linux-fedora27" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-fedora27-linux.tar.xz" + cache: + key: linux-x86_64-fedora27 + artifacts: + when: always + expire_in: 2 week ############################################################ # Validation via Pipelines (Windows) ############################################################ -.validate-windows: +.build-windows: + <<: *only-default before_script: - git clean -xdf - git submodule foreach git clean -xdf @@ -314,72 +453,132 @@ validate-x86_64-linux-deb9-unreg: - git submodule sync --recursive - git submodule update --init --recursive - git checkout .gitmodules + - "git fetch https://gitlab.haskell.org/ghc/ghc-performance-notes.git refs/notes/perf:refs/notes/perf || true" - bash .gitlab/win32-init.sh after_script: - rd /s /q tmp - robocopy /np /nfl /ndl /e "%APPDATA%\cabal" cabal-cache - bash -c 'make clean || true' dependencies: [] + variables: + FORCE_SYMLINKS: 1 + LANG: "en_US.UTF-8" cache: paths: - cabal-cache - ghc-8.6.2 - ghc-tarballs -validate-x86_64-windows-hadrian: - extends: .validate-windows +.build-windows-hadrian: + extends: .build-windows stage: full-build + allow_failure: true variables: GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" script: - | - set MSYSTEM=MINGW64 python boot bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - mkdir -p _build - cp -R inplace/mingw _build/mingw - - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick" - # FIXME: Bindist disabled due to #16073 - #- bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh binary-dist" + - bash -c "PATH=`pwd`/toolchain/bin:$PATH hadrian/build.cabal.sh -j`mk/detect-cpu-count.sh` --flavour=Quick --docs=no-sphinx binary-dist" + - mv _build/bindist/ghc*.tar.xz ghc.tar.xz # FIXME: Testsuite disabled due to #16156. - #- bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows + # - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' tags: - x86_64-windows + artifacts: + when: always + paths: + - ghc.tar.xz -validate-x86_64-windows: - extends: .validate-windows +validate-x86_64-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW64 + cache: + key: x86_64-windows-hadrian + +nightly-i386-windows-hadrian: + extends: .build-windows-hadrian + variables: + MSYSTEM: MINGW32 + only: + variables: + - $NIGHTLY + cache: + key: i386-windows-hadrian + +.build-windows-make: + extends: .build-windows stage: full-build # due to #16084 allow_failure: true variables: + BUILD_FLAVOUR: "quick" GHC_VERSION: "8.6.2" - LANG: "en_US.UTF-8" + BIN_DIST_PREP_TAR_COMP: "bindistprep/ghc-x86_64-mingw32.tar.xz" script: - | - set MSYSTEM=MINGW64 python boot - bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex' - - bash -c "echo include mk/flavours/quick.mk > mk/build.mk" + bash -c './configure --enable-tarballs-autodownload GHC=`pwd`/toolchain/bin/ghc HAPPY=`pwd`/toolchain/bin/happy ALEX=`pwd`/toolchain/bin/alex $CONFIGURE_ARGS' + - bash -c "echo include mk/flavours/${BUILD_FLAVOUR}.mk > mk/build.mk" - bash -c "PATH=`pwd`/toolchain/bin:$PATH make -j`mk/detect-cpu-count.sh`" - - | - bash -c "make binary-dist TAR_COMP_OPTS=-1" - mv ghc-*.tar.xz ghc.tar.xz + - bash -c "PATH=`pwd`/toolchain/bin:$PATH make binary-dist TAR_COMP_OPTS=-1" - bash -c 'make V=0 test THREADS=`mk/detect-cpu-count.sh` JUNIT_FILE=../../junit.xml' - cache: - key: x86_64-windows tags: - x86_64-windows artifacts: when: always + expire_in: 2 week reports: junit: junit.xml paths: - - ghc.tar.xz + - ghc-*.tar.xz - junit.xml +validate-x86_64-windows: + extends: .build-windows-make + variables: + MSYSTEM: MINGW64 + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + cache: + key: x86_64-windows + +# Normal Windows validate builds are profiled; that won't do for releases. +release-x86_64-windows: + extends: validate-x86_64-windows + variables: + MSYSTEM: MINGW64 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=x86_64-unknown-mingw32" + only: + - tags + +release-i386-windows: + extends: .build-windows-make + only: + - tags + variables: + MSYSTEM: MINGW32 + BUILD_FLAVOUR: "perf" + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +nightly-i386-windows: + extends: .build-windows-make + only: + variables: + - $NIGHTLY + variables: + MSYSTEM: MINGW32 + CONFIGURE_ARGS: "--target=i386-unknown-mingw32" + cache: + key: i386-windows + +############################################################ +# Cleanup +############################################################ + # Note [Cleaning up after shell executor] # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # @@ -393,6 +592,7 @@ validate-x86_64-windows: # See Note [Cleanup after shell executor] cleanup-windows: + <<: *only-default stage: cleanup tags: - x86_64-windows @@ -415,10 +615,12 @@ cleanup-windows: # See Note [Cleanup after shell executor] cleanup-darwin: + <<: *only-default stage: cleanup tags: - x86_64-darwin when: always + dependencies: [] before_script: - echo "Time to clean up" script: @@ -430,3 +632,106 @@ cleanup-darwin: - rm -Rf $BUILD_DIR/* - exit 0 +############################################################ +# Packaging +############################################################ + +doc-tarball: + <<: *only-default + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + dependencies: + - validate-x86_64-linux-deb9 + - validate-x86_64-windows + artifacts: + paths: + - haddock.html.tar.xz + - libraries.html.tar.xz + - users_guide.html.tar.xz + - index.html + - "*.pdf" + script: + - rm -Rf docs + - bash -ex distrib/mkDocs/mkDocs ghc-x86_64-deb9-linux.tar.xz ghc-x86_64-mingw32.tar.xz + - ls -lh + - mv docs/*.tar.xz docs/index.html . + +source-tarball: + stage: packaging + tags: + - x86_64-linux + image: ghcci/x86_64-linux-deb9:0.2 + only: + - tags + artifacts: + paths: + - ghc-*.tar.xz + - version + script: + - mk/get-win32-tarballs.sh download all + - ./boot + - ./configure + - make sdist + - mv sdistprep/*.xz . + - make show VALUE=version > version + + +############################################################ +# Testing via head.hackage +############################################################ + +# Triggering jobs in the ghc/head.hackage project requires that we have a job +# token for that repository. Furthermore the head.hackage CI job must have +# access to an unprivileged access token with the ability to query the ghc/ghc +# project such that it can find the job ID of the fedora27 job for the current +# pipeline. + +.hackage: + <<: *only-default + stage: hackage + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + dependencies: [] + variables: + HEAD_HACKAGE_PROJECT_ID: "78" + script: + - bash .gitlab/start-head.hackage.sh + +hackage: + extends: .hackage + when: manual + +hackage-label: + extends: .hackage + only: + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*user-facing.*/ + +nightly-hackage: + extends: .hackage + only: + variables: + - $NIGHTLY + +pages: + stage: deploy + dependencies: + - doc-tarball + image: ghcci/x86_64-linux-deb9:0.2 + tags: + - x86_64-linux + script: + - mkdir -p public/doc + - tar -xf haddock.html.tar.xz -C public/doc + - tar -xf libraries.html.tar.xz -C public/doc + - tar -xf users_guide.html.tar.xz -C public/doc + - cp -f index.html public/doc + only: + - master + artifacts: + paths: + - public + ===================================== aclocal.m4 ===================================== @@ -288,11 +288,31 @@ AC_DEFUN([FPTOOLS_SET_HASKELL_PLATFORM_VARS], esac } + dnl Note [autoconf assembler checks and -flto] + dnl ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + dnl + dnl Autoconf's AC_COMPILE_IFELSE macro is fragile in the case of checks + dnl which require that the assembler is run. Specifically, GCC does not run + dnl the assembler if invoked with `-c -flto`; it merely dumps its internal + dnl AST to the object file, to be compiled and assembled during the final + dnl link. + dnl + dnl This can cause configure checks like that for the + dnl .subsections_via_symbols directive to pass unexpected (see #16440), + dnl leading the build system to incorrectly conclude that the directive is + dnl supported. + dnl + dnl For this reason, it is important that configure checks that rely on the + dnl assembler failing use AC_LINK_IFELSE rather than AC_COMPILE_IFELSE, + dnl ensuring that the assembler sees the check. + dnl + dnl ** check for Apple-style dead-stripping support dnl (.subsections-via-symbols assembler directive) AC_MSG_CHECKING(for .subsections_via_symbols) - AC_COMPILE_IFELSE( + dnl See Note [autoconf assembler checks and -flto] + AC_LINK_IFELSE( [AC_LANG_PROGRAM([], [__asm__ (".subsections_via_symbols");])], [AC_MSG_RESULT(yes) HaskellHaveSubsectionsViaSymbols=True @@ -305,8 +325,9 @@ AC_DEFUN([FPTOOLS_SET_HASKELL_PLATFORM_VARS], dnl ** check for .ident assembler directive AC_MSG_CHECKING(whether your assembler supports .ident directive) - AC_COMPILE_IFELSE( - [AC_LANG_SOURCE([__asm__ (".ident \"GHC x.y.z\"");])], + dnl See Note [autoconf assembler checks and -flto] + AC_LINK_IFELSE( + [AC_LANG_PROGRAM([__asm__ (".ident \"GHC x.y.z\"");], [])], [AC_MSG_RESULT(yes) HaskellHaveIdentDirective=True], [AC_MSG_RESULT(no) @@ -330,8 +351,15 @@ AC_DEFUN([FPTOOLS_SET_HASKELL_PLATFORM_VARS], ;; esac AC_MSG_CHECKING(for GNU non-executable stack support) - AC_COMPILE_IFELSE( - [AC_LANG_PROGRAM([__asm__ (".section .note.GNU-stack,\"\",$progbits");], [0])], + dnl See Note [autoconf assembler checks and -flto] + AC_LINK_IFELSE( + dnl the `main` function is placed after the .note.GNU-stack directive + dnl so we need to ensure that the active segment is correctly set, + dnl otherwise `main` will be placed in the wrong segment. + [AC_LANG_PROGRAM([ + __asm__ (".section .note.GNU-stack,\"\",$progbits"); + __asm__ (".section .text"); + ], [0])], [AC_MSG_RESULT(yes) HaskellHaveGnuNonexecStack=True], [AC_MSG_RESULT(no) ===================================== compiler/iface/IfaceSyn.hs ===================================== @@ -1034,7 +1034,7 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- a compound field type is if it's preceded by a bang pattern. pprFieldArgTy (bang, ty) = ppr_arg_ty (bang_prec bang) bang ty -- If not using record syntax, a compound field type might need to be - -- parenthesize if one of the following holds: + -- parenthesized if one of the following holds: -- -- 1. We're using Haskell98 syntax. -- 2. The field type is preceded with a bang pattern. @@ -1046,18 +1046,23 @@ pprIfaceConDecl ss gadt_style tycon tc_binders parent -- If we're displaying the fields GADT-style, e.g., -- -- data Foo a where - -- MkFoo :: Maybe a -> Foo + -- MkFoo :: (Int -> Int) -> Maybe a -> Foo -- - -- Then there is no inherent need to parenthesize compound fields like - -- `Maybe a` (bang patterns notwithstanding). If we're displaying the - -- fields Haskell98-style, e.g., + -- Then we use `funPrec`, since that will ensure `Int -> Int` gets the + -- parentheses that it requires, but simple compound types like `Maybe a` + -- (which don't require parentheses in a function argument position) won't + -- get them, assuming that there are no bang patterns (see bang_prec). -- - -- data Foo a = MkFoo (Maybe a) + -- If we're displaying the fields Haskell98-style, e.g., -- - -- Then we *must* parenthesize compound fields like (Maybe a). + -- data Foo a = MkFoo (Int -> Int) (Maybe a) + -- + -- Then not only must we parenthesize `Int -> Int`, we must also + -- parenthesize compound fields like (Maybe a). Therefore, we pick + -- `appPrec`, which has higher precedence than `funPrec`. gadt_prec :: PprPrec gadt_prec - | gadt_style = topPrec + | gadt_style = funPrec | otherwise = appPrec -- The presence of bang patterns or UNPACK annotations requires ===================================== docs/users_guide/8.8.1-notes.rst ===================================== @@ -206,6 +206,7 @@ for further change information. libraries/mtl/mtl.cabal: Dependency of ``Cabal`` library libraries/parsec/parsec.cabal: Dependency of ``Cabal`` library libraries/process/process.cabal: Dependency of ``ghc`` library + libraries/pretty/pretty.cabal: Dependency of ``ghc`` library libraries/stm/stm.cabal: Dependency of ``haskeline`` library libraries/template-haskell/template-haskell.cabal: Core library libraries/terminfo/terminfo.cabal: Dependency of ``haskeline`` library ===================================== hadrian/doc/make.md ===================================== @@ -174,6 +174,22 @@ time you fire up a build. This is not possible with the Make build system. build docs ``` +- Build documentation, but without haddocks (resp. without HTML or PDF manuals) + + ``` sh + # Make + echo 'HADDOCKS_DOCS = NO' > mk/build.mk + # For HTML manuals: BUILD_SPHINX_HTML = NO + # For PDF manuals: BUILD_SPHINX_PDF = NO + make + + # Hadrian + build docs --docs=no-haddocks + # Append --docs=no-sphinx-pdfs, --docs=no-sphinx-html or + # --docs=no-sphinx-man (or --docs=no-sphinx to encompass them all) + # to further reduce or even completely disable documentation targets. + ``` + - Running nofib ``` sh ===================================== hadrian/doc/user-settings.md ===================================== @@ -32,7 +32,10 @@ data Flavour = Flavour { -- | Build profiled GHC. ghcProfiled :: Bool, -- | Build GHC with debug information. - ghcDebugged :: Bool } + ghcDebugged :: Bool + -- | Whether to build docs and which ones + -- (haddocks, user manual, haddock manual) + ghcDocs :: Action DocTargets } ``` Hadrian provides several built-in flavours (`default`, `quick`, and a few others; see `hadrian/doc/flavours.md`), which can be activated from the command line, @@ -102,6 +105,17 @@ patterns such as `"//Prelude.*"` can be used when matching input and output file where `//` matches an arbitrary number of path components and `*` matches an entire path component, excluding any separators. +### Enabling -Werror + +It is useful to enable `-Werror` when building GHC as this setting is +used in the CI to ensure a warning free build. The `werror` function can be +used to easily modify a flavour to turn this setting on. + +``` +devel2WerrorFlavour :: Flavour +devel2WerrorFlavour = werror (developmentFlavour Stage2) +``` + ## Packages Users can add and remove packages from particular build stages. As an example, @@ -216,6 +230,45 @@ verboseCommand = output "//rts/sm/*" &&^ way threaded verboseCommand = return True ``` +## Documentation + +`Flavour`'s `ghcDocs :: Action DocTargets` field lets you +customize the "groups" of documentation targets that should +run when running `build docs` (or, transitively, +`build binary-dist`). + +```haskell +type DocTargets = Set DocTarget +data DocTarget = Haddocks | SphinxHTML | SphinxPDFs | SphinxMan +``` + +By default, `ghcDocs` contains all of them and `build docs` would +therefore attempt to build all the haddocks, manuals and manpages. +If, for some reason (e.g no easy way to install `sphinx-build` or +`xelatex` on your system), you're just interested in building the +haddocks, you could define a custom flavour as follows: + +```haskell +justHaddocksFlavour :: Flavour +justHaddocksFlavour = defaultFlavour + { name = "default-haddocks" + , ghcDocs = Set.singleton Haddocks } +``` + +and then run `build --flavour=default-haddocks`. Alternatively, +you can use the `--docs` CLI flag to selectively disable some or +all of the documentation targets: + +- `--docs=none`: don't build any docs +- `--docs=no-haddocks`: don't build haddocks +- `--docs=no-sphinx`: don't build any user manual or manpage +- `--docs=no-sphinx-html`: don't build HTML versions of manuals +- `--docs=no-sphinx-pdfs`: don't build PDF versions of manuals +- `--docs=no-sphinx-man`: don't build the manpage + +You can pass several `--docs=...` flags, Hadrian will combine +their effects. + ## Miscellaneous To change the default behaviour of Hadrian with respect to building split ===================================== hadrian/src/CommandLine.hs ===================================== @@ -1,17 +1,20 @@ module CommandLine ( optDescrs, cmdLineArgsMap, cmdFlavour, lookupFreeze1, cmdIntegerSimple, cmdProgressColour, cmdProgressInfo, cmdConfigure, cmdSplitObjects, - lookupBuildRoot, TestArgs(..), TestSpeed(..), defaultTestArgs + cmdDocsArgs, lookupBuildRoot, TestArgs(..), TestSpeed(..), defaultTestArgs ) where import Data.Either import qualified Data.HashMap.Strict as Map import Data.List.Extra import Development.Shake hiding (Normal) +import Flavour (DocTargets, DocTarget(..)) import Hadrian.Utilities hiding (buildRoot) import System.Console.GetOpt import System.Environment +import qualified Data.Set as Set + data TestSpeed = Slow | Average | Fast deriving (Show, Eq) -- | All arguments that can be passed to Hadrian via the command line. @@ -24,7 +27,8 @@ data CommandLineArgs = CommandLineArgs , progressInfo :: ProgressInfo , splitObjects :: Bool , buildRoot :: BuildRoot - , testArgs :: TestArgs } + , testArgs :: TestArgs + , docTargets :: DocTargets } deriving (Eq, Show) -- | Default values for 'CommandLineArgs'. @@ -38,7 +42,8 @@ defaultCommandLineArgs = CommandLineArgs , progressInfo = Brief , splitObjects = False , buildRoot = BuildRoot "_build" - , testArgs = defaultTestArgs } + , testArgs = defaultTestArgs + , docTargets = Set.fromList [minBound..maxBound] } -- | These arguments are used by the `test` target. data TestArgs = TestArgs @@ -179,6 +184,25 @@ readTestWay way = let newWays = way : testWays (testArgs flags) in flags { testArgs = (testArgs flags) {testWays = newWays} } +readDocsArg :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) +readDocsArg ms = maybe (Left "Cannot parse docs argument") (Right . set) (go =<< ms) + + where + go :: String -> Maybe (DocTargets -> DocTargets) + go "none" = Just (const Set.empty) + go "no-haddocks" = Just (Set.delete Haddocks) + go "no-sphinx-html" = Just (Set.delete SphinxHTML) + go "no-sphinx-pdfs" = Just (Set.delete SphinxPDFs) + go "no-sphinx-man" = Just (Set.delete SphinxMan) + go "no-sphinx" = Just (Set.delete SphinxHTML + . Set.delete SphinxPDFs + . Set.delete SphinxMan) + go _ = Nothing + + set :: (DocTargets -> DocTargets) -> CommandLineArgs -> CommandLineArgs + set tweakTargets flags = flags + { docTargets = tweakTargets (docTargets flags) } + -- | Standard 'OptDescr' descriptions of Hadrian's command line arguments. optDescrs :: [OptDescr (Either String (CommandLineArgs -> CommandLineArgs))] optDescrs = @@ -198,6 +222,8 @@ optDescrs = "Progress info style (None, Brief, Normal or Unicorn)." , Option [] ["split-objects"] (NoArg readSplitObjects) "Generate split objects (requires a full clean rebuild)." + , Option [] ["docs"] (OptArg readDocsArg "TARGET") + "Strip down docs targets (none, no-haddocks, no-sphinx[-{html, pdfs, man}]." , Option [] ["test-compiler"] (OptArg readTestCompiler "TEST_COMPILER") "Use given compiler [Default=stage2]." , Option [] ["test-config-file"] (OptArg readTestConfigFile "CONFIG_FILE") @@ -259,3 +285,6 @@ cmdProgressInfo = progressInfo <$> cmdLineArgs cmdSplitObjects :: Action Bool cmdSplitObjects = splitObjects <$> cmdLineArgs + +cmdDocsArgs :: Action DocTargets +cmdDocsArgs = docTargets <$> cmdLineArgs ===================================== hadrian/src/Flavour.hs ===================================== @@ -1,6 +1,10 @@ -module Flavour (Flavour (..)) where +module Flavour + ( Flavour (..), werror + , DocTargets, DocTarget(..) + ) where import Expression +import Data.Set (Set) -- Please update doc/{flavours.md, user-settings.md} when changing this file. -- | 'Flavour' is a collection of build settings that fully define a GHC build. @@ -31,4 +35,33 @@ data Flavour = Flavour { -- | Build profiled GHC. ghcProfiled :: Bool, -- | Build GHC with debug information. - ghcDebugged :: Bool } + ghcDebugged :: Bool, + -- | Whether to build docs and which ones + -- (haddocks, user manual, haddock manual) + ghcDocs :: Action DocTargets } + +-- | A set of documentation targets +type DocTargets = Set DocTarget + +-- | Documentation targets +-- +-- While we can't reasonably expose settings or CLI options +-- to selectively disable, say, base's haddocks, we can offer +-- a less fine-grained choice: +-- +-- - haddocks for libraries +-- - non-haddock html pages (e.g GHC's user manual) +-- - PDF documents (e.g haddock's manual) +-- - man pages (GHC's) +-- +-- The main goal being to have easy ways to do away with the need +-- for e.g @sphinx-build@ or @xelatex@ and associated packages +-- while still being able to build a(n almost) complete binary +-- distribution. +data DocTarget = Haddocks | SphinxHTML | SphinxPDFs | SphinxMan + deriving (Eq, Ord, Show, Bounded, Enum) + +-- | Turn on -Werror for packages built with the stage1 compiler. +-- It mimics the CI settings so is useful to turn on when developing. +werror :: Flavour -> Flavour +werror fl = fl { args = args fl <> (builder Ghc ? notStage0 ? arg "-Werror") } ===================================== hadrian/src/Rules/Documentation.hs ===================================== @@ -22,6 +22,7 @@ import Target import Utilities import Data.List (union) +import qualified Data.Set as Set import qualified Text.Parsec as Parsec docRoot :: FilePath @@ -79,10 +80,35 @@ documentationRules = do -- Haddock's manual, and builds man pages "docs" ~> do root <- buildRoot + doctargets <- ghcDocs =<< flavour let html = htmlRoot -/- "index.html" -- also implies "docs-haddock" archives = map pathArchive docPaths pdfs = map pathPdf $ docPaths \\ ["libraries"] - need $ map (root -/-) $ [html] ++ archives ++ pdfs ++ [manPageBuildPath] + + targets = -- include PDFs unless --docs=no-sphinx[-pdf] is + -- passed. + concat [ pdfs | SphinxPDFs `Set.member` doctargets ] + + -- include manpage unless --docs=no-sphinx[-man] is given. + ++ [ manPageBuildPath | SphinxMan `Set.member` doctargets ] + + -- include toplevel html target uness we neither want + -- haddocks nor html pages produced by sphinx. + ++ [ html | Set.size (doctargets `Set.intersection` + Set.fromList [Haddocks, SphinxHTML] + ) > 0 ] + + -- include archives for whatever targets remain from + -- the --docs arguments we got. + ++ [ ar + | (ar, doc) <- zip archives docPaths + , archiveTarget doc `Set.member` doctargets ] + + need $ map (root -/-) targets + + where archiveTarget "libraries" = Haddocks + archiveTarget _ = SphinxHTML + ------------------------------------- HTML ------------------------------------- @@ -94,7 +120,16 @@ buildHtmlDocumentation = do root <- buildRootRules root -/- htmlRoot -/- "index.html" %> \file -> do - need $ map ((root -/-) . pathIndex) docPaths + doctargets <- ghcDocs =<< flavour + + -- We include the HTML output of haddock for libraries unless + -- told not to (e.g with --docs=no-haddocks). Likewise for + -- the HTML version of the users guide or the Haddock manual. + let targets = [ "libraries" | Haddocks `Set.member` doctargets ] + ++ concat [ ["users_guide", "Haddock"] + | SphinxHTML `Set.member` doctargets ] + need $ map ((root -/-) . pathIndex) targets + copyFileUntracked "docs/index.html" file -- | Compile a Sphinx ReStructured Text package to HTML. ===================================== hadrian/src/Settings/Default.hs ===================================== @@ -222,7 +222,8 @@ defaultFlavour = Flavour , dynamicGhcPrograms = defaultDynamicGhcPrograms , ghciWithDebugger = False , ghcProfiled = False - , ghcDebugged = False } + , ghcDebugged = False + , ghcDocs = cmdDocsArgs } -- | Default logic for determining whether to build -- dynamic GHC programs. ===================================== hadrian/src/UserSettings.hs ===================================== @@ -2,6 +2,10 @@ -- hadrian/src/UserSettings.hs to hadrian/UserSettings.hs and edit your copy. -- If you don't copy the file your changes will be tracked by git and you can -- accidentally commit them. +-- +-- See doc/user-settings.md for instructions, and src/Flavour.hs for auxiliary +-- functions for manipulating flavours. +-- Please update doc/user-settings.md when committing changes to this file. module UserSettings ( userFlavours, userPackages, userDefaultFlavour, verboseCommand, buildProgressColour, successColour, finalStage ===================================== rts/StgCRun.c ===================================== @@ -494,15 +494,15 @@ StgRunIsImplementedInAssembler(void) "movq 48(%%rsp),%%rdi\n\t" "movq 56(%%rsp),%%rsi\n\t" "movq 64(%%rsp),%%xmm6\n\t" - "movq 72(%%rax),%%xmm7\n\t" - "movq 80(%%rax),%%xmm8\n\t" - "movq 88(%%rax),%%xmm9\n\t" - "movq 96(%%rax),%%xmm10\n\t" - "movq 104(%%rax),%%xmm11\n\t" - "movq 112(%%rax),%%xmm12\n\t" - "movq 120(%%rax),%%xmm13\n\t" - "movq 128(%%rax),%%xmm14\n\t" - "movq 136(%%rax),%%xmm15\n\t" + "movq 72(%%rsp),%%xmm7\n\t" + "movq 80(%%rsp),%%xmm8\n\t" + "movq 88(%%rsp),%%xmm9\n\t" + "movq 96(%%rsp),%%xmm10\n\t" + "movq 104(%%rsp),%%xmm11\n\t" + "movq 112(%%rsp),%%xmm12\n\t" + "movq 120(%%rsp),%%xmm13\n\t" + "movq 128(%%rsp),%%xmm14\n\t" + "movq 136(%%rsp),%%xmm15\n\t" #endif "addq %1, %%rsp\n\t" "retq" ===================================== testsuite/tests/ghci/scripts/T16527.hs ===================================== @@ -0,0 +1,7 @@ +{-# LANGUAGE GADTs #-} +{-# LANGUAGE RankNTypes #-} +module T16527 where + +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T ===================================== testsuite/tests/ghci/scripts/T16527.script ===================================== @@ -0,0 +1,2 @@ +:load T16527 +:info T ===================================== testsuite/tests/ghci/scripts/T16527.stdout ===================================== @@ -0,0 +1,4 @@ +data T where + MkT1 :: (Int -> Int) -> T + MkT2 :: (forall a. Maybe a) -> T + -- Defined at T16527.hs:5:1 ===================================== testsuite/tests/ghci/scripts/all.T ===================================== @@ -295,3 +295,4 @@ test('T15941', normal, ghci_script, ['T15941.script']) test('T16030', normal, ghci_script, ['T16030.script']) test('T11606', normal, ghci_script, ['T11606.script']) test('T16089', normal, ghci_script, ['T16089.script']) +test('T16527', normal, ghci_script, ['T16527.script']) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8f5f791b79f33e0aebffadd0d824c7ae9ff5bc6e...7ad5ea62db63921ee3e68a7cda89052a7dc612ee -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8f5f791b79f33e0aebffadd0d824c7ae9ff5bc6e...7ad5ea62db63921ee3e68a7cda89052a7dc612ee You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 15:36:56 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 11:36:56 -0400 Subject: [Git][ghc/ghc] Pushed new branch cherry-pick-36d38047 Message-ID: <5cb0b09892034_62b33fa2e2a1e0b831051c5@gitlab.haskell.org.mail> Ben Gamari pushed new branch cherry-pick-36d38047 at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/cherry-pick-36d38047 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 15:47:51 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 11:47:51 -0400 Subject: [Git][ghc/ghc][cherry-pick-908b4b86] 19 commits: Correct duplicate 4.12.0.0 entry in base's changelog Message-ID: <5cb0b32724a75_62b33fa2e2a1e0b831320bd@gitlab.haskell.org.mail> Ben Gamari pushed to branch cherry-pick-908b4b86 at Glasgow Haskell Compiler / GHC Commits: db5a43a9 by Ryan Scott at 2019-04-02T18:22:28Z Correct duplicate 4.12.0.0 entry in base's changelog See #16490. [ci skip] - - - - - c0661417 by Ben Gamari at 2019-04-02T18:22:48Z Bump transformers to 0.5.6.2 See #16199. - - - - - d90dcd4a by Ryan Scott at 2019-04-02T18:24:17Z Bump array submodule This bumps `array` to version 0.5.4.0 so that we can distinguish it with `MIN_VERSION_array` (as it introduces some changes to the `Show` instance for `UArray`). - - - - - ac12033a by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update binary submodule to latest master branch tip - - - - - 6ac90706 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update text submodule - - - - - 29e38980 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update unix submodule - - - - - 48989b49 by Herbert Valerio Riedel at 2019-04-05T17:10:06Z Update deepseq submodule - - - - - 9d228b75 by Herbert Valerio Riedel at 2019-04-05T17:10:45Z Update haskeline submodule - - - - - ab97500f by Herbert Valerio Riedel at 2019-04-05T17:11:06Z Update parsec submodule - - - - - 950d45b9 by Herbert Valerio Riedel at 2019-04-05T17:11:35Z Update process submodule - - - - - 47b1a718 by Herbert Valerio Riedel at 2019-04-05T17:12:01Z Update stm submodule - - - - - 8cb38504 by Herbert Valerio Riedel at 2019-04-05T17:12:28Z Update terminfo submodule - - - - - 94a576f3 by Herbert Valerio Riedel at 2019-04-05T17:13:52Z Update hpc submodule - - - - - dd26d493 by Herbert Valerio Riedel at 2019-04-05T17:14:19Z Update filepath submodule - - - - - 460eec60 by Herbert Valerio Riedel at 2019-04-05T17:14:40Z Update directory submodule - - - - - 3ab1b786 by Herbert Valerio Riedel at 2019-04-05T17:16:27Z Update parallel submodule - - - - - 86ce5718 by Herbert Valerio Riedel at 2019-04-05T17:20:02Z base: Remove `Monad(fail)` method and reexport `MonadFail(fail)` instead As per https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail Coauthored-by: Ben Gamari <ben at well-typed.com> - - - - - 20f8f2f9 by Ryan Scott at 2019-04-09T20:21:35Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 36d64990 by Ömer Sinan Ağacan at 2019-04-12T15:47:46Z Fix two bugs in stg_ap_0_fast in profiling runtime This includes two bug fixes in profiling version of stg_ap_0_fast: - PAPs allocated by stg_ap_0_fast are now correctly tagged. This invariant is checked in Sanity.c:checkPAP. (This was originally implemented in 2693eb11f5, later reverted with ab55b4ddb7 because it revealed the bug below, but it wasn't clear at the time whether the bug was the one below or something in the commit) - The local variable `untaggedfun` is now marked as a pointer so it survives GC. With this we finally fix all known bugs caught in #15508. `concprog001` now works reliably with prof+threaded and prof runtimes (with and without -debug). (cherry picked from commit 908b4b8659713f0b7a1704ce33c7fa30e3e0ffc3) - - - - - 30 changed files: - .gitmodules - compiler/ghc.cabal.in - compiler/iface/IfaceSyn.hs - compiler/main/DynFlags.hs - compiler/prelude/PrelNames.hs - compiler/rename/RnExpr.hs - compiler/rename/RnSource.hs - compiler/simplCore/SimplCore.hs - compiler/typecheck/TcMatches.hs - ghc/ghc-bin.cabal.in - libraries/array - libraries/base/Control/Monad.hs - libraries/base/Control/Monad/ST/Lazy/Imp.hs - libraries/base/GHC/Base.hs - libraries/base/GHC/Conc/Sync.hs - libraries/base/GHC/TopHandler.hs - libraries/base/Prelude.hs - libraries/base/System/IO.hs - libraries/base/Text/ParserCombinators/ReadP.hs - libraries/base/Text/ParserCombinators/ReadPrec.hs - libraries/base/base.cabal - libraries/base/changelog.md - libraries/binary - libraries/deepseq - libraries/directory - libraries/filepath - libraries/haskeline - libraries/hpc - libraries/parallel - libraries/parsec The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/c79384716949ae8c01b6f2708a12dac1de93b43a...36d64990bcecba78ddc078b66d5a29c676c67743 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/c79384716949ae8c01b6f2708a12dac1de93b43a...36d64990bcecba78ddc078b66d5a29c676c67743 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 15:50:24 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 11:50:24 -0400 Subject: [Git][ghc/ghc][wip/codeowners] 13 commits: users-guide: Document how to disable package environments Message-ID: <5cb0b3c0ef489_62b33fa2e2a1e0b831344e7@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/codeowners at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - 8decb964 by Ben Gamari at 2019-04-12T15:50:09Z CODEOWNERS: Add simonmar as owner of rts/linker I suspect this is why @simonmar wasn't notified of !706. [skip ci] - - - - - 30 changed files: - .gitlab-ci.yml - .gitlab/win32-init.sh - CODEOWNERS - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/ghci/ByteCodeLink.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/main/SysTools.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs - compiler/nativeGen/X86/RegInfo.hs - compiler/nativeGen/X86/Regs.hs - compiler/types/TyCon.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/0469fc2057b3844574ed0233c602e603184feb0f...8decb964c73499c296b3f7ca36ba8364ce19e150 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/0469fc2057b3844574ed0233c602e603184feb0f...8decb964c73499c296b3f7ca36ba8364ce19e150 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 16:03:39 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 12:03:39 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/debug-windows-hadrian Message-ID: <5cb0b6db3610_62b39c764343142063@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/debug-windows-hadrian at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/debug-windows-hadrian You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 17:07:20 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 13:07:20 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 5 commits: Hadrian: fix binary-dir with --docs=none Message-ID: <5cb0c5c89e61f_62b33fa2e572ecec3163396@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - 4c07fa61 by Sylvain Henry at 2019-04-12T17:07:14Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - 93c42be4 by Ben Gamari at 2019-04-12T17:07:15Z gitlab-ci: Ensure that version number has three components - - - - - 12 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/ghci/ByteCodeLink.hs - configure.ac - − distrib/remilestoning.pl - hadrian/src/Rules/BinaryDist.hs - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== compiler/ghci/ByteCodeLink.hs ===================================== @@ -154,8 +154,8 @@ linkFail who what , "the missing library using the -L/path/to/object/dir and -lmissinglibname" , "flags, or simply by naming the relevant files on the GHCi command line." , "Alternatively, this link failure might indicate a bug in GHCi." - , "If you suspect the latter, please send a bug report to:" - , " glasgow-haskell-bugs at haskell.org" + , "If you suspect the latter, please report this as a GHC bug:" + , " https://www.haskell.org/ghc/reportabug" ]) ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== distrib/remilestoning.pl deleted ===================================== @@ -1,119 +0,0 @@ -#!/usr/bin/env perl - -use warnings; -use strict; - -use DBI; - -# ===== Config: - -my $dbfile = "trac.db"; -my $milestone = "7.4.1"; -my $test = 0; - -# ===== Code: - -my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {}); - -my %emailof; -my %ticketsfor; - -sub getUserAddress { - my $sth = $dbh->prepare("SELECT sid, value FROM session_attribute WHERE name = 'email'"); - $sth->execute(); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $username = $result->{sid}; - my $email = $result->{value}; - if (defined($emailof{$username})) { - die "Two e-mail addresses found for $username"; - } - if ($email =~ /@/) { - $emailof{$username} = $email; - } - else { - # warn "The e-mail address $email for $username contains no @"; - } - } - $sth->finish; -} - -sub doTickets { - my $sth = $dbh->prepare("SELECT id, summary, reporter, cc FROM ticket WHERE milestone = ? AND status = 'new'"); - $sth->execute($milestone); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $ticket = $result->{id}; - my $title = $result->{summary}; - my $reporter = $result->{reporter}; - my $cc = $result->{cc}; - my %addresses; - my $address_added; - for my $who ($reporter, split /[ ,]+/, $cc) { - $address_added = 0; - if ($who =~ /@/) { - $addresses{$who} = 1; - $address_added = 1; - } - if (defined($emailof{$who})) { - $addresses{$emailof{$who}} = 1; - $address_added = 1; - } - if ($who ne "nobody" && $address_added eq 0) { - # warn "No address found for $who"; - } - } - for my $address (keys(%addresses)) { - $ticketsfor{$address}{$ticket}{"title"} = $title; - } - } - $sth->finish; -} - -sub doEmails { - for my $email (sort (keys %ticketsfor)) { - if ($test ne 0) { - open FH, ">&STDOUT"; - } - else { - open(FH, '|-', 'mail', '-s', 'GHC bugs', '-a', 'From: glasgow-haskell-bugs at haskell.org', $email) or die "Running mail failed: $!"; - } - print FH <<'EOF'; - -Hello, - -You are receiving this mail because you are the reporter, or on the CC -list, for one or more GHC tickets that are automatically having their -priority reduced due to our post-release ticket handling policy: - https://gitlab.haskell.org/ghc/ghc/wikis/working-conventions/bug-tracker#re-milestoning-tickets-after-a-release - -The list of tickets for which you are the reporter or on the CC list is -given below. If any of these are causing problems for you, please let us -know on glasgow-haskell-bugs at haskell.org and we'll look at raising the -priority. - -Better still, if you are able to make any progress on any of the tickets -yourself (whether that be actually fixing the bug, or just making it -easier for someone else to - for example, by making a small, -self-contained test-case), then that would be a great help. We at GHC HQ -have limited resources, so if anything is waiting for us to make -progress then it can be waiting a long time! -EOF - for my $ticket (sort {$a <=> $b} (keys %{$ticketsfor{$email}})) { - my $title = $ticketsfor{$email}{$ticket}{"title"}; - print FH "\n"; - print FH "#$ticket $title:\n"; - print FH " https://gitlab.haskell.org/ghc/ghc/issues/$ticket\n"; - } - print FH <<'EOF'; - --- -The GHC Team -http://www.haskell.org/ghc/ -EOF - close FH or die "Close failed: $!"; - } -} - -&getUserAddress(); -&doTickets(); -&doEmails(); - ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -121,7 +121,13 @@ bindistRules = do copyDirectory (ghcBuildDir -/- "lib") bindistFilesDir copyDirectory (rtsIncludeDir) bindistFilesDir need ["docs"] - copyDirectory (root -/- "docs") bindistFilesDir + -- TODO: we should only embed the docs that have been generated + -- depending on the current settings (flavours' "ghcDocs" field and + -- "--docs=.." command-line flag) + -- Currently we embed the "docs" directory if it exists but it may + -- contain outdated or even invalid data. + whenM (doesDirectoryExist (root -/- "docs")) $ do + copyDirectory (root -/- "docs") bindistFilesDir when windows $ do copyDirectory (root -/- "mingw") bindistFilesDir -- we use that opportunity to delete the .stamp file that we use @@ -141,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -287,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -325,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -350,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -367,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/a74a282a440c385c63395f32034c7b110422ce9e...93c42be48de8dddfbea15e8b0c682c6f567556dc -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/a74a282a440c385c63395f32034c7b110422ce9e...93c42be48de8dddfbea15e8b0c682c6f567556dc You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 17:23:28 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 13:23:28 -0400 Subject: [Git][ghc/ghc][master] Hadrian: fix ghci wrapper script generation (#16508) Message-ID: <5cb0c99065d87_62b33fa2bec411fc317467b@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - 1 changed file: - hadrian/src/Rules/BinaryDist.hs Changes: ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -147,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -293,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -331,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -356,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -373,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/beaa07d204a779cc509985765a87da95faefb359 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/beaa07d204a779cc509985765a87da95faefb359 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 17:23:33 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 13:23:33 -0400 Subject: [Git][ghc/ghc][wip/lint-check-version-number] 5 commits: Hadrian: fix binary-dir with --docs=none Message-ID: <5cb0c995d6e77_62b33fa2e3e167643176154@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/lint-check-version-number at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 12 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/ghci/ByteCodeLink.hs - configure.ac - − distrib/remilestoning.pl - hadrian/src/Rules/BinaryDist.hs - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== compiler/ghci/ByteCodeLink.hs ===================================== @@ -154,8 +154,8 @@ linkFail who what , "the missing library using the -L/path/to/object/dir and -lmissinglibname" , "flags, or simply by naming the relevant files on the GHCi command line." , "Alternatively, this link failure might indicate a bug in GHCi." - , "If you suspect the latter, please send a bug report to:" - , " glasgow-haskell-bugs at haskell.org" + , "If you suspect the latter, please report this as a GHC bug:" + , " https://www.haskell.org/ghc/reportabug" ]) ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== distrib/remilestoning.pl deleted ===================================== @@ -1,119 +0,0 @@ -#!/usr/bin/env perl - -use warnings; -use strict; - -use DBI; - -# ===== Config: - -my $dbfile = "trac.db"; -my $milestone = "7.4.1"; -my $test = 0; - -# ===== Code: - -my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {}); - -my %emailof; -my %ticketsfor; - -sub getUserAddress { - my $sth = $dbh->prepare("SELECT sid, value FROM session_attribute WHERE name = 'email'"); - $sth->execute(); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $username = $result->{sid}; - my $email = $result->{value}; - if (defined($emailof{$username})) { - die "Two e-mail addresses found for $username"; - } - if ($email =~ /@/) { - $emailof{$username} = $email; - } - else { - # warn "The e-mail address $email for $username contains no @"; - } - } - $sth->finish; -} - -sub doTickets { - my $sth = $dbh->prepare("SELECT id, summary, reporter, cc FROM ticket WHERE milestone = ? AND status = 'new'"); - $sth->execute($milestone); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $ticket = $result->{id}; - my $title = $result->{summary}; - my $reporter = $result->{reporter}; - my $cc = $result->{cc}; - my %addresses; - my $address_added; - for my $who ($reporter, split /[ ,]+/, $cc) { - $address_added = 0; - if ($who =~ /@/) { - $addresses{$who} = 1; - $address_added = 1; - } - if (defined($emailof{$who})) { - $addresses{$emailof{$who}} = 1; - $address_added = 1; - } - if ($who ne "nobody" && $address_added eq 0) { - # warn "No address found for $who"; - } - } - for my $address (keys(%addresses)) { - $ticketsfor{$address}{$ticket}{"title"} = $title; - } - } - $sth->finish; -} - -sub doEmails { - for my $email (sort (keys %ticketsfor)) { - if ($test ne 0) { - open FH, ">&STDOUT"; - } - else { - open(FH, '|-', 'mail', '-s', 'GHC bugs', '-a', 'From: glasgow-haskell-bugs at haskell.org', $email) or die "Running mail failed: $!"; - } - print FH <<'EOF'; - -Hello, - -You are receiving this mail because you are the reporter, or on the CC -list, for one or more GHC tickets that are automatically having their -priority reduced due to our post-release ticket handling policy: - https://gitlab.haskell.org/ghc/ghc/wikis/working-conventions/bug-tracker#re-milestoning-tickets-after-a-release - -The list of tickets for which you are the reporter or on the CC list is -given below. If any of these are causing problems for you, please let us -know on glasgow-haskell-bugs at haskell.org and we'll look at raising the -priority. - -Better still, if you are able to make any progress on any of the tickets -yourself (whether that be actually fixing the bug, or just making it -easier for someone else to - for example, by making a small, -self-contained test-case), then that would be a great help. We at GHC HQ -have limited resources, so if anything is waiting for us to make -progress then it can be waiting a long time! -EOF - for my $ticket (sort {$a <=> $b} (keys %{$ticketsfor{$email}})) { - my $title = $ticketsfor{$email}{$ticket}{"title"}; - print FH "\n"; - print FH "#$ticket $title:\n"; - print FH " https://gitlab.haskell.org/ghc/ghc/issues/$ticket\n"; - } - print FH <<'EOF'; - --- -The GHC Team -http://www.haskell.org/ghc/ -EOF - close FH or die "Close failed: $!"; - } -} - -&getUserAddress(); -&doTickets(); -&doEmails(); - ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -121,7 +121,13 @@ bindistRules = do copyDirectory (ghcBuildDir -/- "lib") bindistFilesDir copyDirectory (rtsIncludeDir) bindistFilesDir need ["docs"] - copyDirectory (root -/- "docs") bindistFilesDir + -- TODO: we should only embed the docs that have been generated + -- depending on the current settings (flavours' "ghcDocs" field and + -- "--docs=.." command-line flag) + -- Currently we embed the "docs" directory if it exists but it may + -- contain outdated or even invalid data. + whenM (doesDirectoryExist (root -/- "docs")) $ do + copyDirectory (root -/- "docs") bindistFilesDir when windows $ do copyDirectory (root -/- "mingw") bindistFilesDir -- we use that opportunity to delete the .stamp file that we use @@ -141,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -287,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -325,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -350,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -367,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1e545b0ad69da8c4f4f1a8a6b80bfea6f8408753...e05df3e1380989ca00ecd88b6d7d0f4aec5502fb -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1e545b0ad69da8c4f4f1a8a6b80bfea6f8408753...e05df3e1380989ca00ecd88b6d7d0f4aec5502fb You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 17:29:34 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 13:29:34 -0400 Subject: [Git][ghc/ghc][master] gitlab-ci: Ensure that version number has three components Message-ID: <5cb0cafe8c846_62b33fa2bec411fc3179077@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 3 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - configure.ac Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e05df3e1380989ca00ecd88b6d7d0f4aec5502fb -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e05df3e1380989ca00ecd88b6d7d0f4aec5502fb You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 18:30:05 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 14:30:05 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 5 commits: Hadrian: fix ghci wrapper script generation (#16508) Message-ID: <5cb0d92d1a0db_62b3de530003192737@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 5e0f0872 by klebinger.andreas at gmx.at at 2019-04-12T18:29:56Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 48ed9ac8 by Alp Mestanogullari at 2019-04-12T18:29:58Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - 3d19895f by Alp Mestanogullari at 2019-04-12T18:29:58Z Hadrian: document -a/--test-accept - - - - - 11 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - configure.ac - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Builders/RunTest.hs Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -147,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -293,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -331,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -356,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -373,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/93c42be48de8dddfbea15e8b0c682c6f567556dc...3d19895f2e5739f7a96c69793b469bf521410f40 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/93c42be48de8dddfbea15e8b0c682c6f567556dc...3d19895f2e5739f7a96c69793b469bf521410f40 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 18:46:48 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 14:46:48 -0400 Subject: [Git][ghc/ghc][master] Add -ddump-stg-final to dump stg as it is used for codegen. Message-ID: <5cb0dd18e0fef_62b39eb720c31984f5@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3 changed files: - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - docs/users_guide/debugging.rst Changes: ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/885d2e04854f038fbb899ab545df2b57d9b8bba4 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/885d2e04854f038fbb899ab545df2b57d9b8bba4 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 18:54:04 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Fri, 12 Apr 2019 14:54:04 -0400 Subject: [Git][ghc/ghc][master] 2 commits: Hadrian: add a --test-accept/-a flag, to mimic 'make accept' Message-ID: <5cb0decc3a996_62b33fa2bec411fc32008a5@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 4 changed files: - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Settings/Builders/RunTest.hs Changes: ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/885d2e04854f038fbb899ab545df2b57d9b8bba4...f4b5a6c040abb492367fdfe18c4f2ebf03c0d084 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/885d2e04854f038fbb899ab545df2b57d9b8bba4...f4b5a6c040abb492367fdfe18c4f2ebf03c0d084 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 19:32:28 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 15:32:28 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/disable-windows-hadrian Message-ID: <5cb0e7cc8cc4e_62b39950a2032027c9@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/disable-windows-hadrian at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/disable-windows-hadrian You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 19:33:26 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 15:33:26 -0400 Subject: [Git][ghc/ghc][wip/disable-windows-hadrian] 30 commits: Fix #16282. Message-ID: <5cb0e806f226b_62b39950a203202957@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/disable-windows-hadrian at Glasgow Haskell Compiler / GHC Commits: 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - f854ebab by Ben Gamari at 2019-04-12T19:33:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - .gitlab/win32-init.sh - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/ghci/ByteCodeLink.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscMain.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/c1d85d6ec9cdb42bff0a915539eaa55f4d351329...f854ebabaf3eb61733ad6d91ebd16558a884b7f4 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/c1d85d6ec9cdb42bff0a915539eaa55f4d351329...f854ebabaf3eb61733ad6d91ebd16558a884b7f4 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 19:35:09 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 15:35:09 -0400 Subject: [Git][ghc/ghc][wip/disable-windows-hadrian] gitlab: Disable windows-hadrian job Message-ID: <5cb0e86d3fd8e_62b33fa2c8aa8a5c32044b5@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/disable-windows-hadrian at Glasgow Haskell Compiler / GHC Commits: d82cd786 by Ben Gamari at 2019-04-12T19:34:56Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. [skip ci] - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -525,7 +525,7 @@ validate-x86_64-linux-fedora27: paths: - ghc.tar.xz -validate-x86_64-windows-hadrian: +.validate-x86_64-windows-hadrian: extends: .build-windows-hadrian variables: MSYSTEM: MINGW64 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d82cd786789767cef271da083a7381533f10c67c -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d82cd786789767cef271da083a7381533f10c67c You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 19:40:52 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 15:40:52 -0400 Subject: [Git][ghc/ghc][wip/fix-marge] 9 commits: Hadrian: fix binary-dir with --docs=none Message-ID: <5cb0e9c48c1ef_62b33fa2c648a9243204980@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/fix-marge at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 7a3c27a7 by Ben Gamari at 2019-04-12T19:40:41Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 19 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/ghci/ByteCodeLink.hs - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - configure.ac - − distrib/remilestoning.pl - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Builders/RunTest.hs - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint @@ -84,7 +85,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== compiler/ghci/ByteCodeLink.hs ===================================== @@ -154,8 +154,8 @@ linkFail who what , "the missing library using the -L/path/to/object/dir and -lmissinglibname" , "flags, or simply by naming the relevant files on the GHCi command line." , "Alternatively, this link failure might indicate a bug in GHCi." - , "If you suspect the latter, please send a bug report to:" - , " glasgow-haskell-bugs at haskell.org" + , "If you suspect the latter, please report this as a GHC bug:" + , " https://www.haskell.org/ghc/reportabug" ]) ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== distrib/remilestoning.pl deleted ===================================== @@ -1,119 +0,0 @@ -#!/usr/bin/env perl - -use warnings; -use strict; - -use DBI; - -# ===== Config: - -my $dbfile = "trac.db"; -my $milestone = "7.4.1"; -my $test = 0; - -# ===== Code: - -my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {}); - -my %emailof; -my %ticketsfor; - -sub getUserAddress { - my $sth = $dbh->prepare("SELECT sid, value FROM session_attribute WHERE name = 'email'"); - $sth->execute(); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $username = $result->{sid}; - my $email = $result->{value}; - if (defined($emailof{$username})) { - die "Two e-mail addresses found for $username"; - } - if ($email =~ /@/) { - $emailof{$username} = $email; - } - else { - # warn "The e-mail address $email for $username contains no @"; - } - } - $sth->finish; -} - -sub doTickets { - my $sth = $dbh->prepare("SELECT id, summary, reporter, cc FROM ticket WHERE milestone = ? AND status = 'new'"); - $sth->execute($milestone); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $ticket = $result->{id}; - my $title = $result->{summary}; - my $reporter = $result->{reporter}; - my $cc = $result->{cc}; - my %addresses; - my $address_added; - for my $who ($reporter, split /[ ,]+/, $cc) { - $address_added = 0; - if ($who =~ /@/) { - $addresses{$who} = 1; - $address_added = 1; - } - if (defined($emailof{$who})) { - $addresses{$emailof{$who}} = 1; - $address_added = 1; - } - if ($who ne "nobody" && $address_added eq 0) { - # warn "No address found for $who"; - } - } - for my $address (keys(%addresses)) { - $ticketsfor{$address}{$ticket}{"title"} = $title; - } - } - $sth->finish; -} - -sub doEmails { - for my $email (sort (keys %ticketsfor)) { - if ($test ne 0) { - open FH, ">&STDOUT"; - } - else { - open(FH, '|-', 'mail', '-s', 'GHC bugs', '-a', 'From: glasgow-haskell-bugs at haskell.org', $email) or die "Running mail failed: $!"; - } - print FH <<'EOF'; - -Hello, - -You are receiving this mail because you are the reporter, or on the CC -list, for one or more GHC tickets that are automatically having their -priority reduced due to our post-release ticket handling policy: - https://gitlab.haskell.org/ghc/ghc/wikis/working-conventions/bug-tracker#re-milestoning-tickets-after-a-release - -The list of tickets for which you are the reporter or on the CC list is -given below. If any of these are causing problems for you, please let us -know on glasgow-haskell-bugs at haskell.org and we'll look at raising the -priority. - -Better still, if you are able to make any progress on any of the tickets -yourself (whether that be actually fixing the bug, or just making it -easier for someone else to - for example, by making a small, -self-contained test-case), then that would be a great help. We at GHC HQ -have limited resources, so if anything is waiting for us to make -progress then it can be waiting a long time! -EOF - for my $ticket (sort {$a <=> $b} (keys %{$ticketsfor{$email}})) { - my $title = $ticketsfor{$email}{$ticket}{"title"}; - print FH "\n"; - print FH "#$ticket $title:\n"; - print FH " https://gitlab.haskell.org/ghc/ghc/issues/$ticket\n"; - } - print FH <<'EOF'; - --- -The GHC Team -http://www.haskell.org/ghc/ -EOF - close FH or die "Close failed: $!"; - } -} - -&getUserAddress(); -&doTickets(); -&doEmails(); - ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -121,7 +121,13 @@ bindistRules = do copyDirectory (ghcBuildDir -/- "lib") bindistFilesDir copyDirectory (rtsIncludeDir) bindistFilesDir need ["docs"] - copyDirectory (root -/- "docs") bindistFilesDir + -- TODO: we should only embed the docs that have been generated + -- depending on the current settings (flavours' "ghcDocs" field and + -- "--docs=.." command-line flag) + -- Currently we embed the "docs" directory if it exists but it may + -- contain outdated or even invalid data. + whenM (doesDirectoryExist (root -/- "docs")) $ do + copyDirectory (root -/- "docs") bindistFilesDir when windows $ do copyDirectory (root -/- "mingw") bindistFilesDir -- we use that opportunity to delete the .stamp file that we use @@ -141,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -287,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -325,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -350,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -367,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/43274e796fd909adfca42e32e18f62c8e7854af9...7a3c27a7bc8f6530fc16f2556c9292fe67ca0da1 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/43274e796fd909adfca42e32e18f62c8e7854af9...7a3c27a7bc8f6530fc16f2556c9292fe67ca0da1 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 19:41:57 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 15:41:57 -0400 Subject: [Git][ghc/ghc][master] 2 commits: gitlab: Disable windows-hadrian job Message-ID: <5cb0ea05e7682_62b3cee7cb032055ba@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -85,7 +85,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods @@ -525,7 +532,7 @@ validate-x86_64-linux-fedora27: paths: - ghc.tar.xz -validate-x86_64-windows-hadrian: +.validate-x86_64-windows-hadrian: extends: .build-windows-hadrian variables: MSYSTEM: MINGW64 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f4b5a6c040abb492367fdfe18c4f2ebf03c0d084...8870a51bd62f2d4ee7bd21b96240d5806c2817af -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f4b5a6c040abb492367fdfe18c4f2ebf03c0d084...8870a51bd62f2d4ee7bd21b96240d5806c2817af You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Fri Apr 12 21:01:16 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Fri, 12 Apr 2019 17:01:16 -0400 Subject: [Git][ghc/ghc] Pushed new branch wip/system-libffi Message-ID: <5cb0fc9cbbef0_62b37aa87083215199@gitlab.haskell.org.mail> Ben Gamari pushed new branch wip/system-libffi at Glasgow Haskell Compiler / GHC -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/tree/wip/system-libffi You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:31:42 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:31:42 -0400 Subject: [Git][ghc/ghc][wip/ghc-8.8-merges] gitlab: Remove doc-tarball job Message-ID: <5cb1e4be5363b_62b33fa2d1333e5832709a8@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ghc-8.8-merges at Glasgow Haskell Compiler / GHC Commits: eb41af20 by Ben Gamari at 2019-04-13T13:31:41Z gitlab: Remove doc-tarball job - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -636,28 +636,6 @@ cleanup-darwin: # Packaging ############################################################ -doc-tarball: - <<: *only-default - stage: packaging - tags: - - x86_64-linux - image: ghcci/x86_64-linux-deb9:0.2 - dependencies: - - validate-x86_64-linux-deb9 - - validate-x86_64-windows - artifacts: - paths: - - haddock.html.tar.xz - - libraries.html.tar.xz - - users_guide.html.tar.xz - - index.html - - "*.pdf" - script: - - rm -Rf docs - - bash -ex distrib/mkDocs/mkDocs ghc-x86_64-deb9-linux.tar.xz ghc-x86_64-mingw32.tar.xz - - ls -lh - - mv docs/*.tar.xz docs/index.html . - source-tarball: stage: packaging tags: View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/eb41af204046fbe75ffe11ec762fe04de0c564d2 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/eb41af204046fbe75ffe11ec762fe04de0c564d2 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:35:27 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:35:27 -0400 Subject: [Git][ghc/ghc][ghc-8.8] Bump hpc submodule Message-ID: <5cb1e59f6b5f1_62b33fa2d1333e58327296b@gitlab.haskell.org.mail> Ben Gamari pushed to branch ghc-8.8 at Glasgow Haskell Compiler / GHC Commits: 6e254ee3 by Ryan Scott at 2019-04-12T15:23:33Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 1 changed file: - libraries/hpc Changes: ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 7673420558e2a54affe530911d555cc78577ad87 +Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/6e254ee34e2e259639ffea7accc0e651c4319791 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/6e254ee34e2e259639ffea7accc0e651c4319791 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:36:24 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:36:24 -0400 Subject: [Git][ghc/ghc][wip/codeowners] 8 commits: Hadrian: fix ghci wrapper script generation (#16508) Message-ID: <5cb1e5d83df84_62b33fa2e3c6700832737c6@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/codeowners at Glasgow Haskell Compiler / GHC Commits: beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - b7343e19 by Ben Gamari at 2019-04-13T13:36:19Z CODEOWNERS: Add simonmar as owner of rts/linker I suspect this is why @simonmar wasn't notified of !706. [skip ci] - - - - - 12 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - CODEOWNERS - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - configure.ac - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Builders/RunTest.hs Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint @@ -84,7 +85,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods @@ -524,7 +532,7 @@ validate-x86_64-linux-fedora27: paths: - ghc.tar.xz -validate-x86_64-windows-hadrian: +.validate-x86_64-windows-hadrian: extends: .build-windows-hadrian variables: MSYSTEM: MINGW64 ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== CODEOWNERS ===================================== @@ -12,7 +12,7 @@ # RTS-like things /rts/ @bgamari @simonmar @osa1 @Phyx @angerman -/rts/linker/ @angerman @Phyx +/rts/linker/ @angerman @Phyx @simonmar /includes/ @bgamari @simonmar @osa1 # The compiler ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -147,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -293,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -331,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -356,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -373,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8decb964c73499c296b3f7ca36ba8364ce19e150...b7343e192c6425b84e02fcf3849fc61e1de586db -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/8decb964c73499c296b3f7ca36ba8364ce19e150...b7343e192c6425b84e02fcf3849fc61e1de586db You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:36:36 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:36:36 -0400 Subject: [Git][ghc/ghc][cherry-pick-908b4b86] 2 commits: Bump hpc submodule Message-ID: <5cb1e5e4b124_62b33fa2d0f08c7032747f0@gitlab.haskell.org.mail> Ben Gamari pushed to branch cherry-pick-908b4b86 at Glasgow Haskell Compiler / GHC Commits: 6e254ee3 by Ryan Scott at 2019-04-12T15:23:33Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 2ea424ea by Ömer Sinan Ağacan at 2019-04-13T13:36:32Z Fix two bugs in stg_ap_0_fast in profiling runtime This includes two bug fixes in profiling version of stg_ap_0_fast: - PAPs allocated by stg_ap_0_fast are now correctly tagged. This invariant is checked in Sanity.c:checkPAP. (This was originally implemented in 2693eb11f5, later reverted with ab55b4ddb7 because it revealed the bug below, but it wasn't clear at the time whether the bug was the one below or something in the commit) - The local variable `untaggedfun` is now marked as a pointer so it survives GC. With this we finally fix all known bugs caught in #15508. `concprog001` now works reliably with prof+threaded and prof runtimes (with and without -debug). (cherry picked from commit 908b4b8659713f0b7a1704ce33c7fa30e3e0ffc3) - - - - - 2 changed files: - libraries/hpc - rts/Apply.cmm Changes: ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 7673420558e2a54affe530911d555cc78577ad87 +Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 ===================================== rts/Apply.cmm ===================================== @@ -60,7 +60,7 @@ stg_ap_0_fast ( P_ fun ) again: W_ info; - W_ untaggedfun; + P_ untaggedfun; W_ arity; untaggedfun = UNTAG(fun); info = %INFO_PTR(untaggedfun); @@ -106,6 +106,11 @@ again: pap = Hp - SIZEOF_StgPAP + WDS(1); SET_HDR(pap, stg_PAP_info, CCCS); StgPAP_arity(pap) = arity; + if (arity <= TAG_MASK) { + // TODO: Shouldn't this already be tagged? If not why did we + // untag it at the beginning of this function? + fun = untaggedfun + arity; + } StgPAP_fun(pap) = fun; StgPAP_n_args(pap) = 0; return (pap); @@ -117,9 +122,8 @@ again: return (fun); } else { // We're going to copy this PAP, and put the new CCS in it - fun = untaggedfun; W_ size; - size = SIZEOF_StgPAP + WDS(TO_W_(StgPAP_n_args(fun))); + size = SIZEOF_StgPAP + WDS(TO_W_(StgPAP_n_args(untaggedfun))); HP_CHK_GEN(size); TICK_ALLOC_PAP(size, 0); // attribute this allocation to the "overhead of profiling" @@ -127,13 +131,13 @@ again: P_ pap; pap = Hp - size + WDS(1); // We'll lose the original PAP, so we should enter its CCS - ccall enterFunCCS(BaseReg "ptr", StgHeader_ccs(fun) "ptr"); + ccall enterFunCCS(BaseReg "ptr", StgHeader_ccs(untaggedfun) "ptr"); SET_HDR(pap, stg_PAP_info, CCCS); - StgPAP_arity(pap) = StgPAP_arity(fun); - StgPAP_n_args(pap) = StgPAP_n_args(fun); + StgPAP_arity(pap) = StgPAP_arity(untaggedfun); + StgPAP_n_args(pap) = StgPAP_n_args(untaggedfun); StgPAP_fun(pap) = StgPAP_fun(fun); W_ i; - i = TO_W_(StgPAP_n_args(fun)); + i = TO_W_(StgPAP_n_args(untaggedfun)); loop: if (i == 0) { return (pap); View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/36d64990bcecba78ddc078b66d5a29c676c67743...2ea424ea10e30105f9cb4f7d4368d055e67c54c2 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/36d64990bcecba78ddc078b66d5a29c676c67743...2ea424ea10e30105f9cb4f7d4368d055e67c54c2 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:36:56 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:36:56 -0400 Subject: [Git][ghc/ghc][cherry-pick-36d38047] 2 commits: Bump hpc submodule Message-ID: <5cb1e5f876af3_62b33fa2cc9e1db83274942@gitlab.haskell.org.mail> Ben Gamari pushed to branch cherry-pick-36d38047 at Glasgow Haskell Compiler / GHC Commits: 6e254ee3 by Ryan Scott at 2019-04-12T15:23:33Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 6c1e1beb by Ben Gamari at 2019-04-13T13:36:53Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. (cherry picked from commit 36d380475d9056fdf93305985be3def00aaf6cf7) - - - - - 2 changed files: - docs/users_guide/packages.rst - libraries/hpc Changes: ===================================== docs/users_guide/packages.rst ===================================== @@ -579,6 +579,12 @@ must be relative to the location of the package environment file. Use the package environment in ⟨file⟩, or in ``$HOME/.ghc/arch-os-version/environments/⟨name⟩`` + If set to ``-`` no package environment is read. + +.. envvar:: GHC_ENVIRONMENT + + Specifies the path to the package environment file to be used by GHC. + Overridden by the :ghc-flag:`-package-env ⟨file⟩|⟨name⟩` flag if set. In order, ``ghc`` will look for the package environment in the following locations: @@ -588,11 +594,11 @@ locations: - File ``$HOME/.ghc/arch-os-version/environments/name`` if you pass the option ``-package-env ⟨name⟩``. -- File ⟨file⟩ if the environment variable ``GHC_ENVIRONMENT`` is set to +- File ⟨file⟩ if the environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨file⟩. - File ``$HOME/.ghc/arch-os-version/environments/name`` if the - environment variable ``GHC_ENVIRONMENT`` is set to ⟨name⟩. + environment variable :envvar:`GHC_ENVIRONMENT` is set to ⟨name⟩. Additionally, unless ``-hide-all-packages`` is specified ``ghc`` will also look for the package environment in the following locations: ===================================== libraries/hpc ===================================== @@ -1 +1 @@ -Subproject commit 7673420558e2a54affe530911d555cc78577ad87 +Subproject commit 42bb0059dc535948ce87b846611968c1c01ae049 View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/29d530a1a97d0887d4dd0939f5da27bae4382c19...6c1e1beb78c559bf88b658b3b118ead46c85579b -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/29d530a1a97d0887d4dd0939f5da27bae4382c19...6c1e1beb78c559bf88b658b3b118ead46c85579b You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:52:02 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:52:02 -0400 Subject: [Git][ghc/ghc][master] linters: Fix check-version-number Message-ID: <5cb1e98263a90_62b33fa2d13ad30c32788bb@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 1 changed file: - .gitlab/linters/check-version-number.sh Changes: ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -2,5 +2,5 @@ set -e -grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || +grep -E -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/7876d08881b66c3e47d8dab3420c9c14616325e4 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/7876d08881b66c3e47d8dab3420c9c14616325e4 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:59:05 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:59:05 -0400 Subject: [Git][ghc/ghc][wip/slowtest] 30 commits: Hadrian: fix binary-dir with --docs=none Message-ID: <5cb1eb29d5ea6_62b33fa2ebceddbc32891ed@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/slowtest at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - f48a5d98 by Ben Gamari at 2019-04-13T13:59:03Z gitlab-ci: Test using slowtest in deb9-debug job - - - - - d4a5aa9f by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark hWaitForInput-accurate-stdin as broken in threaded ways As noted in #16535. - - - - - a93d8ed6 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Make closureSize less sensitive to optimisation - - - - - fc800302 by Ben Gamari at 2019-04-13T13:59:03Z process: Bump submodule * Skip process005 in ghci way * Mark process002 as fragile in threaded2 - - - - - 6de8eb70 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark T13167 as fragile in threaded2 As noted in #16536. - - - - - 83999888 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark T13910 as broken in optasm Due to #16537. - - - - - 450a962f by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark T14272 as broken in optasm - - - - - bc469eaf by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark T14761c as broken in hpc and optasm ways As noted in #16540. - - - - - a0102062 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark T16180 as broken in ghci and ext-interp ways As noted in #16541. - - - - - 5fc43fca by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Omit tcrun022 in hpc way As noted in #16542, the expected rule doesn't fire. However, this doesn't seem terribly surpring given the circumstances. - - - - - 3ccce2e9 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark Overflow as broken in hpc way As noted in #16543. - - - - - 9f585733 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark closure_size as broken in threaded2, optasm, dyn ways As noted in #16531. - - - - - 80e75d0f by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark T2783 as fragile in threaded1 It was previously marked as broken but it passes non-deterministically. See #2783. - - - - - c88c1d08 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Skip T7919 in ghci way It times out pretty reliably. It's not clear that much is gained by running this test in the ghci way anyways. - - - - - f8e486f0 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Fix fragile_for test modifier - - - - - 078affea by Ben Gamari at 2019-04-13T13:59:03Z users-guide: Add pretty to package list - - - - - 0a853ea1 by Ben Gamari at 2019-04-13T13:59:03Z Bump unix submodule Marks posix002 as fragile in threaded2 way due to #16550. - - - - - e21d2ca9 by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Fix omit_ways usage omit_ways expects a list but this was broken in several cases. - - - - - 3eda117e by Ben Gamari at 2019-04-13T13:59:03Z testsuite: Mark threadstatus-T9333 as fragile in ghci way As noted in #16555. - - - - - 30 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/ghci/ByteCodeLink.hs - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - configure.ac - − distrib/remilestoning.pl - docs/users_guide/8.8.1-notes.rst - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Builders/RunTest.hs - libraries/base/tests/all.T - libraries/ghc-heap/tests/all.T - libraries/ghc-heap/tests/closure_size.hs - libraries/process - libraries/unix - testsuite/driver/testlib.py - testsuite/tests/concurrent/should_run/all.T - testsuite/tests/dependent/should_compile/all.T - testsuite/tests/lib/integer/all.T - testsuite/tests/programs/barton-mangler-bug/test.T - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f1024129e7087bf29e03af8e6f686b921ecc7cd0...3eda117ef13183fd4eca3a8e65a12255bfe69d05 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/f1024129e7087bf29e03af8e6f686b921ecc7cd0...3eda117ef13183fd4eca3a8e65a12255bfe69d05 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 13:59:28 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 09:59:28 -0400 Subject: [Git][ghc/ghc][wip/T16546] 12 commits: Hadrian: fix binary-dir with --docs=none Message-ID: <5cb1eb40bbf03_62b33fa2cc9e1db83289850@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/T16546 at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - cf14e5e0 by Ben Gamari at 2019-04-13T13:59:24Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 20 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - compiler/ghci/ByteCodeLink.hs - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - configure.ac - − distrib/remilestoning.pl - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Builders/RunTest.hs - libraries/base/System/Timeout.hs - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint @@ -84,7 +85,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods @@ -524,7 +532,7 @@ validate-x86_64-linux-fedora27: paths: - ghc.tar.xz -validate-x86_64-windows-hadrian: +.validate-x86_64-windows-hadrian: extends: .build-windows-hadrian variables: MSYSTEM: MINGW64 ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -E -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== compiler/ghci/ByteCodeLink.hs ===================================== @@ -154,8 +154,8 @@ linkFail who what , "the missing library using the -L/path/to/object/dir and -lmissinglibname" , "flags, or simply by naming the relevant files on the GHCi command line." , "Alternatively, this link failure might indicate a bug in GHCi." - , "If you suspect the latter, please send a bug report to:" - , " glasgow-haskell-bugs at haskell.org" + , "If you suspect the latter, please report this as a GHC bug:" + , " https://www.haskell.org/ghc/reportabug" ]) ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== distrib/remilestoning.pl deleted ===================================== @@ -1,119 +0,0 @@ -#!/usr/bin/env perl - -use warnings; -use strict; - -use DBI; - -# ===== Config: - -my $dbfile = "trac.db"; -my $milestone = "7.4.1"; -my $test = 0; - -# ===== Code: - -my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {}); - -my %emailof; -my %ticketsfor; - -sub getUserAddress { - my $sth = $dbh->prepare("SELECT sid, value FROM session_attribute WHERE name = 'email'"); - $sth->execute(); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $username = $result->{sid}; - my $email = $result->{value}; - if (defined($emailof{$username})) { - die "Two e-mail addresses found for $username"; - } - if ($email =~ /@/) { - $emailof{$username} = $email; - } - else { - # warn "The e-mail address $email for $username contains no @"; - } - } - $sth->finish; -} - -sub doTickets { - my $sth = $dbh->prepare("SELECT id, summary, reporter, cc FROM ticket WHERE milestone = ? AND status = 'new'"); - $sth->execute($milestone); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $ticket = $result->{id}; - my $title = $result->{summary}; - my $reporter = $result->{reporter}; - my $cc = $result->{cc}; - my %addresses; - my $address_added; - for my $who ($reporter, split /[ ,]+/, $cc) { - $address_added = 0; - if ($who =~ /@/) { - $addresses{$who} = 1; - $address_added = 1; - } - if (defined($emailof{$who})) { - $addresses{$emailof{$who}} = 1; - $address_added = 1; - } - if ($who ne "nobody" && $address_added eq 0) { - # warn "No address found for $who"; - } - } - for my $address (keys(%addresses)) { - $ticketsfor{$address}{$ticket}{"title"} = $title; - } - } - $sth->finish; -} - -sub doEmails { - for my $email (sort (keys %ticketsfor)) { - if ($test ne 0) { - open FH, ">&STDOUT"; - } - else { - open(FH, '|-', 'mail', '-s', 'GHC bugs', '-a', 'From: glasgow-haskell-bugs at haskell.org', $email) or die "Running mail failed: $!"; - } - print FH <<'EOF'; - -Hello, - -You are receiving this mail because you are the reporter, or on the CC -list, for one or more GHC tickets that are automatically having their -priority reduced due to our post-release ticket handling policy: - https://gitlab.haskell.org/ghc/ghc/wikis/working-conventions/bug-tracker#re-milestoning-tickets-after-a-release - -The list of tickets for which you are the reporter or on the CC list is -given below. If any of these are causing problems for you, please let us -know on glasgow-haskell-bugs at haskell.org and we'll look at raising the -priority. - -Better still, if you are able to make any progress on any of the tickets -yourself (whether that be actually fixing the bug, or just making it -easier for someone else to - for example, by making a small, -self-contained test-case), then that would be a great help. We at GHC HQ -have limited resources, so if anything is waiting for us to make -progress then it can be waiting a long time! -EOF - for my $ticket (sort {$a <=> $b} (keys %{$ticketsfor{$email}})) { - my $title = $ticketsfor{$email}{$ticket}{"title"}; - print FH "\n"; - print FH "#$ticket $title:\n"; - print FH " https://gitlab.haskell.org/ghc/ghc/issues/$ticket\n"; - } - print FH <<'EOF'; - --- -The GHC Team -http://www.haskell.org/ghc/ -EOF - close FH or die "Close failed: $!"; - } -} - -&getUserAddress(); -&doTickets(); -&doEmails(); - ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -121,7 +121,13 @@ bindistRules = do copyDirectory (ghcBuildDir -/- "lib") bindistFilesDir copyDirectory (rtsIncludeDir) bindistFilesDir need ["docs"] - copyDirectory (root -/- "docs") bindistFilesDir + -- TODO: we should only embed the docs that have been generated + -- depending on the current settings (flavours' "ghcDocs" field and + -- "--docs=.." command-line flag) + -- Currently we embed the "docs" directory if it exists but it may + -- contain outdated or even invalid data. + whenM (doesDirectoryExist (root -/- "docs")) $ do + copyDirectory (root -/- "docs") bindistFilesDir when windows $ do copyDirectory (root -/- "mingw") bindistFilesDir -- we use that opportunity to delete the .stamp file that we use @@ -141,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -287,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -325,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -350,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -367,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/466fb8462f0131a5acc0c36dbdb8689c89334d9b...cf14e5e0ad910e8d0f8a0ea78bafd7800106f2a9 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/466fb8462f0131a5acc0c36dbdb8689c89334d9b...cf14e5e0ad910e8d0f8a0ea78bafd7800106f2a9 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 14:00:04 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 10:00:04 -0400 Subject: [Git][ghc/ghc][wip/ara/ci-badge] 12 commits: Hadrian: fix binary-dir with --docs=none Message-ID: <5cb1eb64d3c02_62b33fa2d0f08c7032906ac@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/ara/ci-badge at Glasgow Haskell Compiler / GHC Commits: c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 20 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - README.md - compiler/ghci/ByteCodeLink.hs - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - configure.ac - − distrib/remilestoning.pl - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/src/CommandLine.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Settings/Builders/RunTest.hs - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 - testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 Changes: ===================================== .gitlab-ci.yml ===================================== @@ -59,6 +59,7 @@ ghc-linters: - validate-whitespace .git $(git rev-list $base..$CI_COMMIT_SHA) - .gitlab/linters/check-makefiles.py $base $CI_COMMIT_SHA - .gitlab/linters/check-cpp.py $base $CI_COMMIT_SHA + - .gitlab/linters/check-version-number.sh dependencies: [] tags: - lint @@ -84,7 +85,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods @@ -524,7 +532,7 @@ validate-x86_64-linux-fedora27: paths: - ghc.tar.xz -validate-x86_64-windows-hadrian: +.validate-x86_64-windows-hadrian: extends: .build-windows-hadrian variables: MSYSTEM: MINGW64 ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -0,0 +1,6 @@ +#!/usr/bin/env bash + +set -e + +grep -E -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || + ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== README.md ===================================== @@ -1,7 +1,7 @@ The Glasgow Haskell Compiler ============================ -[![Build Status](https://api.travis-ci.org/ghc/ghc.svg?branch=master)](http://travis-ci.org/ghc/ghc) +[![pipeline status](https://gitlab.haskell.org/ghc/ghc/badges/master/pipeline.svg?style=flat)](https://gitlab.haskell.org/ghc/ghc/commits/master) This is the source tree for [GHC][1], a compiler and interactive environment for the Haskell functional programming language. ===================================== compiler/ghci/ByteCodeLink.hs ===================================== @@ -154,8 +154,8 @@ linkFail who what , "the missing library using the -L/path/to/object/dir and -lmissinglibname" , "flags, or simply by naming the relevant files on the GHCi command line." , "Alternatively, this link failure might indicate a bug in GHCi." - , "If you suspect the latter, please send a bug report to:" - , " glasgow-haskell-bugs at haskell.org" + , "If you suspect the latter, please report this as a GHC bug:" + , " https://www.haskell.org/ghc/reportabug" ]) ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== configure.ac ===================================== @@ -13,7 +13,7 @@ dnl # see what flags are available. (Better yet, read the documentation!) # -AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) +AC_INIT([The Glorious Glasgow Haskell Compilation System], [8.9.0], [glasgow-haskell-bugs at haskell.org], [ghc-AC_PACKAGE_VERSION]) # Set this to YES for a released version, otherwise NO : ${RELEASE=NO} ===================================== distrib/remilestoning.pl deleted ===================================== @@ -1,119 +0,0 @@ -#!/usr/bin/env perl - -use warnings; -use strict; - -use DBI; - -# ===== Config: - -my $dbfile = "trac.db"; -my $milestone = "7.4.1"; -my $test = 0; - -# ===== Code: - -my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","", {}); - -my %emailof; -my %ticketsfor; - -sub getUserAddress { - my $sth = $dbh->prepare("SELECT sid, value FROM session_attribute WHERE name = 'email'"); - $sth->execute(); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $username = $result->{sid}; - my $email = $result->{value}; - if (defined($emailof{$username})) { - die "Two e-mail addresses found for $username"; - } - if ($email =~ /@/) { - $emailof{$username} = $email; - } - else { - # warn "The e-mail address $email for $username contains no @"; - } - } - $sth->finish; -} - -sub doTickets { - my $sth = $dbh->prepare("SELECT id, summary, reporter, cc FROM ticket WHERE milestone = ? AND status = 'new'"); - $sth->execute($milestone); - while (my $result = $sth->fetchrow_hashref("NAME_lc")) { - my $ticket = $result->{id}; - my $title = $result->{summary}; - my $reporter = $result->{reporter}; - my $cc = $result->{cc}; - my %addresses; - my $address_added; - for my $who ($reporter, split /[ ,]+/, $cc) { - $address_added = 0; - if ($who =~ /@/) { - $addresses{$who} = 1; - $address_added = 1; - } - if (defined($emailof{$who})) { - $addresses{$emailof{$who}} = 1; - $address_added = 1; - } - if ($who ne "nobody" && $address_added eq 0) { - # warn "No address found for $who"; - } - } - for my $address (keys(%addresses)) { - $ticketsfor{$address}{$ticket}{"title"} = $title; - } - } - $sth->finish; -} - -sub doEmails { - for my $email (sort (keys %ticketsfor)) { - if ($test ne 0) { - open FH, ">&STDOUT"; - } - else { - open(FH, '|-', 'mail', '-s', 'GHC bugs', '-a', 'From: glasgow-haskell-bugs at haskell.org', $email) or die "Running mail failed: $!"; - } - print FH <<'EOF'; - -Hello, - -You are receiving this mail because you are the reporter, or on the CC -list, for one or more GHC tickets that are automatically having their -priority reduced due to our post-release ticket handling policy: - https://gitlab.haskell.org/ghc/ghc/wikis/working-conventions/bug-tracker#re-milestoning-tickets-after-a-release - -The list of tickets for which you are the reporter or on the CC list is -given below. If any of these are causing problems for you, please let us -know on glasgow-haskell-bugs at haskell.org and we'll look at raising the -priority. - -Better still, if you are able to make any progress on any of the tickets -yourself (whether that be actually fixing the bug, or just making it -easier for someone else to - for example, by making a small, -self-contained test-case), then that would be a great help. We at GHC HQ -have limited resources, so if anything is waiting for us to make -progress then it can be waiting a long time! -EOF - for my $ticket (sort {$a <=> $b} (keys %{$ticketsfor{$email}})) { - my $title = $ticketsfor{$email}{$ticket}{"title"}; - print FH "\n"; - print FH "#$ticket $title:\n"; - print FH " https://gitlab.haskell.org/ghc/ghc/issues/$ticket\n"; - } - print FH <<'EOF'; - --- -The GHC Team -http://www.haskell.org/ghc/ -EOF - close FH or die "Close failed: $!"; - } -} - -&getUserAddress(); -&doTickets(); -&doEmails(); - ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -121,7 +121,13 @@ bindistRules = do copyDirectory (ghcBuildDir -/- "lib") bindistFilesDir copyDirectory (rtsIncludeDir) bindistFilesDir need ["docs"] - copyDirectory (root -/- "docs") bindistFilesDir + -- TODO: we should only embed the docs that have been generated + -- depending on the current settings (flavours' "ghcDocs" field and + -- "--docs=.." command-line flag) + -- Currently we embed the "docs" directory if it exists but it may + -- contain outdated or even invalid data. + whenM (doesDirectoryExist (root -/- "docs")) $ do + copyDirectory (root -/- "docs") bindistFilesDir when windows $ do copyDirectory (root -/- "mingw") bindistFilesDir -- we use that opportunity to delete the .stamp file that we use @@ -141,7 +147,7 @@ bindistRules = do (["configure", "Makefile"] ++ bindistInstallFiles) need $ map ((bindistFilesDir -/- "wrappers") -/-) ["check-api-annotations" , "check-ppr", "ghc", "ghc-iserv", "ghc-pkg" - , "ghci-script", "ghci", "haddock", "hpc", "hp2ps", "hsc2hs" + , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz @@ -287,9 +293,8 @@ bindistMakefile = unlines , "\tdone" , "" , "install_ghci:" - , "\t at echo \"Copying and installing ghci\"" - , "\t$(CREATE_SCRIPT) '$(WrapperBinsDir)/ghci'" - , "\t at echo \"#!$(SHELL)\" >> '$(WrapperBinsDir)/ghci'" + , "\t at echo \"Installing ghci wrapper\"" + , "\t at echo \"#!$(SHELL)\" > '$(WrapperBinsDir)/ghci'" , "\tcat wrappers/ghci-script >> '$(WrapperBinsDir)/ghci'" , "\t$(EXECUTABLE_FILE) '$(WrapperBinsDir)/ghci'" , "" @@ -325,6 +330,7 @@ bindistMakefile = unlines , "\t\t$(call installscript,$p,$(WrapperBinsDir)/$p," ++ "$(WrapperBinsDir),$(ActualBinsDir),$(ActualBinsDir)/$p," ++ "$(ActualLibsDir),$(docdir),$(includedir)))" + , "\trm -f '$(WrapperBinsDir)/ghci-script'" -- FIXME: we shouldn't generate it in the first place , "" , "PKG_CONFS = $(wildcard $(ActualLibsDir)/package.conf.d/*)" , "update_package_db:" @@ -350,7 +356,6 @@ bindistMakefile = unlines wrapper :: FilePath -> String wrapper "ghc" = ghcWrapper wrapper "ghc-pkg" = ghcPkgWrapper -wrapper "ghci" = ghciWrapper wrapper "ghci-script" = ghciScriptWrapper wrapper "haddock" = haddockWrapper wrapper "hsc2hs" = hsc2hsWrapper @@ -367,9 +372,6 @@ ghcPkgWrapper = unlines [ "PKGCONF=\"$libdir/package.conf.d\"" , "exec \"$executablename\" --global-package-db \"$PKGCONF\" ${1+\"$@\"}" ] -ghciWrapper :: String -ghciWrapper = "exec \"$executablename\" --interactive \"$@\"\n" - haddockWrapper :: String haddockWrapper = "exec \"$executablename\" -B\"$libdir\" -l\"$libdir\" ${1+\"$@\"}\n" ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_link_order_a_b_2_fail.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-32-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug ===================================== testsuite/tests/rts/T11223/T11223_simple_duplicate_lib.stderr-ws-64-mingw32 ===================================== @@ -20,6 +20,6 @@ archives or DLLs needed by your current session. Restart GHCi, specifying the missing library using the -L/path/to/object/dir and -lmissinglibname flags, or simply by naming the relevant files on the GHCi command line. Alternatively, this link failure might indicate a bug in GHCi. -If you suspect the latter, please send a bug report to: - glasgow-haskell-bugs at haskell.org +If you suspect the latter, please report this as a GHC bug: + https://www.haskell.org/ghc/reportabug View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/cc91ae2c6f3763a39a509ad79d1bfa656a641063...2e7b2e55de503d3b5086c0cec5f320667503f699 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/cc91ae2c6f3763a39a509ad79d1bfa656a641063...2e7b2e55de503d3b5086c0cec5f320667503f699 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 14:01:58 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 10:01:58 -0400 Subject: [Git][ghc/ghc][master] [skip ci] Update CI badge in readme Message-ID: <5cb1ebd6e3bed_62b33fa2ef5894b432913ba@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 1 changed file: - README.md Changes: ===================================== README.md ===================================== @@ -1,7 +1,7 @@ The Glasgow Haskell Compiler ============================ -[![Build Status](https://api.travis-ci.org/ghc/ghc.svg?branch=master)](http://travis-ci.org/ghc/ghc) +[![pipeline status](https://gitlab.haskell.org/ghc/ghc/badges/master/pipeline.svg?style=flat)](https://gitlab.haskell.org/ghc/ghc/commits/master) This is the source tree for [GHC][1], a compiler and interactive environment for the Haskell functional programming language. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/2e7b2e55de503d3b5086c0cec5f320667503f699 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/2e7b2e55de503d3b5086c0cec5f320667503f699 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 14:02:40 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 10:02:40 -0400 Subject: [Git][ghc/ghc][wip/T16546] 2 commits: [skip ci] Update CI badge in readme Message-ID: <5cb1ec00256ce_62b33fa2ebceddbc32930e5@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/T16546 at Glasgow Haskell Compiler / GHC Commits: 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 2 changed files: - README.md - libraries/base/System/Timeout.hs Changes: ===================================== README.md ===================================== @@ -1,7 +1,7 @@ The Glasgow Haskell Compiler ============================ -[![Build Status](https://api.travis-ci.org/ghc/ghc.svg?branch=master)](http://travis-ci.org/ghc/ghc) +[![pipeline status](https://gitlab.haskell.org/ghc/ghc/badges/master/pipeline.svg?style=flat)](https://gitlab.haskell.org/ghc/ghc/commits/master) This is the source tree for [GHC][1], a compiler and interactive environment for the Haskell functional programming language. ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/cf14e5e0ad910e8d0f8a0ea78bafd7800106f2a9...40848a43072768d5a0a41a1df05f7a8ffd85f345 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/cf14e5e0ad910e8d0f8a0ea78bafd7800106f2a9...40848a43072768d5a0a41a1df05f7a8ffd85f345 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 14:03:28 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 10:03:28 -0400 Subject: [Git][ghc/ghc][master] base: Better document implementation implications of Data.Timeout Message-ID: <5cb1ec307b41c_62b33fa2ef5353f0329384@gitlab.haskell.org.mail> Ben Gamari pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 1 changed file: - libraries/base/System/Timeout.hs Changes: ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/40848a43072768d5a0a41a1df05f7a8ffd85f345 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/40848a43072768d5a0a41a1df05f7a8ffd85f345 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 14:19:40 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sat, 13 Apr 2019 10:19:40 -0400 Subject: [Git][ghc/ghc][wip/run-nofib] 42 commits: Replace git.haskell.org with gitlab.haskell.org (#16196) Message-ID: <5cb1effcbe109_62b33fa2ef5353f03307531@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/run-nofib at Glasgow Haskell Compiler / GHC Commits: 75abaaea by Yuriy Syrovetskiy at 2019-04-04T08:23:19Z Replace git.haskell.org with gitlab.haskell.org (#16196) - - - - - 25c02ea1 by Ryan Scott at 2019-04-04T08:29:29Z Fix #16518 with some more kind-splitting smarts This patch corrects two simple oversights that led to #16518: 1. `HsUtils.typeToLHsType` was taking visibility into account in the `TyConApp` case, but not the `AppTy` case. I've factored out the visibility-related logic into its own `go_app` function and now invoke `go_app` from both the `TyConApp` and `AppTy` cases. 2. `Type.fun_kind_arg_flags` did not properly split kinds with nested `forall`s, such as `(forall k. k -> Type) -> (forall k. k -> Type)`. This was simply because `fun_kind_arg_flags`'s `FunTy` case always bailed out and assumed all subsequent arguments were `Required`, which clearly isn't the case for nested `forall`s. I tweaked the `FunTy` case to recur on the result kind. - - - - - 51fd3571 by Ryan Scott at 2019-04-04T08:35:39Z Use funPrec, not topPrec, to parenthesize GADT argument types A simple oversight. Fixes #16527. - - - - - 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - cc333b33 by Ben Gamari at 2019-04-13T14:18:31Z gitlab-ci: Fix image names - - - - - dcd46c88 by Ben Gamari at 2019-04-13T14:19:13Z gitlab-ci: Run nofib on binary distributions Updates docker images to ensure that the `time` utility is available. - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - .gitlab/win32-init.sh - HACKING.md - README.md - boot - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/ghci/ByteCodeLink.hs - compiler/hsSyn/HsUtils.hs - compiler/iface/IfaceSyn.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscMain.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/dce66331e88630c697e2df388976da34fc372795...dcd46c886092d80b5f8dd45f7143631d0aaac85b -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/dce66331e88630c697e2df388976da34fc372795...dcd46c886092d80b5f8dd45f7143631d0aaac85b You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sat Apr 13 14:27:59 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sat, 13 Apr 2019 10:27:59 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 13 commits: Add -ddump-stg-final to dump stg as it is used for codegen. Message-ID: <5cb1f1efd81e7_62b33fa2ebceddbc3327767@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - c632bd9f by David Eichmann at 2019-04-13T14:27:44Z Hadrian: add rts shared library symlinks for backwards compatability Fixes test T3807 when building with Hadrian. Trac #16370 - - - - - 0f477c92 by Sylvain Henry at 2019-04-13T14:27:47Z Hadrian: add binary-dist-dir target This patch adds an Hadrian target "binary-dist-dir". Compared to "binary-dist", it only builds a binary distribution directory without creating the Tar archive. It makes the use/test of the bindist installation script easier. - - - - - 88deb623 by Krzysztof Gogolewski at 2019-04-13T14:27:47Z Fix assertion failures reported in #16533 - - - - - b89f2687 by Artem Pyanykh at 2019-04-13T14:27:49Z codegen: unroll memcpy calls for small bytearrays - - - - - 851ee22f by Artem Pyanykh at 2019-04-13T14:27:49Z docs: mention memcpy optimization for ByteArrays in 8.10.1-notes - - - - - 29 changed files: - .gitlab-ci.yml - .gitlab/linters/check-version-number.sh - README.md - compiler/cmm/CmmExpr.hs - compiler/codeGen/StgCmmPrim.hs - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/typecheck/TcCanonical.hs - compiler/typecheck/TcSigs.hs - compiler/types/OptCoercion.hs - compiler/types/Type.hs - docs/users_guide/8.10.1-notes.rst - docs/users_guide/debugging.rst - hadrian/doc/make.md - hadrian/doc/testsuite.md - hadrian/hadrian.cabal - hadrian/src/CommandLine.hs - hadrian/src/Hadrian/Utilities.hs - hadrian/src/Rules.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Rules/Register.hs - + hadrian/src/Rules/Rts.hs - hadrian/src/Settings/Builders/RunTest.hs - libraries/base/System/Timeout.hs - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs - testsuite/tests/dynlibs/Makefile Changes: ===================================== .gitlab-ci.yml ===================================== @@ -85,7 +85,14 @@ lint-submods: refs: - master - /ghc-[0-9]+\.[0-9]+/ - - wip/marge_bot_batch_merge_job + +lint-submods-marge: + extends: .lint-submods + only: + refs: + - merge_requests + variables: + - $CI_MERGE_REQUEST_LABELS =~ /.*wip/marge_bot_batch_merge_job.*/ lint-submods-mr: extends: .lint-submods @@ -525,7 +532,7 @@ validate-x86_64-linux-fedora27: paths: - ghc.tar.xz -validate-x86_64-windows-hadrian: +.validate-x86_64-windows-hadrian: extends: .build-windows-hadrian variables: MSYSTEM: MINGW64 ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -2,5 +2,5 @@ set -e -grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || +grep -E -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== README.md ===================================== @@ -1,7 +1,7 @@ The Glasgow Haskell Compiler ============================ -[![Build Status](https://api.travis-ci.org/ghc/ghc.svg?branch=master)](http://travis-ci.org/ghc/ghc) +[![pipeline status](https://gitlab.haskell.org/ghc/ghc/badges/master/pipeline.svg?style=flat)](https://gitlab.haskell.org/ghc/ghc/commits/master) This is the source tree for [GHC][1], a compiler and interactive environment for the Haskell functional programming language. ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -5,7 +5,7 @@ {-# LANGUAGE UndecidableInstances #-} module CmmExpr - ( CmmExpr(..), cmmExprType, cmmExprWidth, maybeInvertCmmExpr + ( CmmExpr(..), cmmExprType, cmmExprWidth, cmmExprAlignment, maybeInvertCmmExpr , CmmReg(..), cmmRegType, cmmRegWidth , CmmLit(..), cmmLitType , LocalReg(..), localRegType @@ -43,6 +43,8 @@ import Unique import Data.Set (Set) import qualified Data.Set as Set +import BasicTypes (Alignment, mkAlignment, alignmentOf) + ----------------------------------------------------------------------------- -- CmmExpr -- An expression. Expressions have no side effects. @@ -239,6 +241,13 @@ cmmLabelType dflags lbl cmmExprWidth :: DynFlags -> CmmExpr -> Width cmmExprWidth dflags e = typeWidth (cmmExprType dflags e) +-- | Returns an alignment in bytes of a CmmExpr when it's a statically +-- known integer constant, otherwise returns an alignment of 1 byte. +-- The caller is responsible for using with a sensible CmmExpr +-- argument. +cmmExprAlignment :: CmmExpr -> Alignment +cmmExprAlignment (CmmLit (CmmInt intOff _)) = alignmentOf (fromInteger intOff) +cmmExprAlignment _ = mkAlignment 1 -------- --- Negation for conditional branches ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2035,8 +2035,8 @@ doCopyByteArrayOp = emitCopyByteArray copy where -- Copy data (we assume the arrays aren't overlapping since -- they're of different types) - copy _src _dst dst_p src_p bytes = - emitMemcpyCall dst_p src_p bytes 1 + copy _src _dst dst_p src_p bytes align = + emitMemcpyCall dst_p src_p bytes align -- | Takes a source 'MutableByteArray#', an offset in the source -- array, a destination 'MutableByteArray#', an offset into the @@ -2050,22 +2050,26 @@ doCopyMutableByteArrayOp = emitCopyByteArray copy -- The only time the memory might overlap is when the two arrays -- we were provided are the same array! -- TODO: Optimize branch for common case of no aliasing. - copy src dst dst_p src_p bytes = do + copy src dst dst_p src_p bytes align = do dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair - (getCode $ emitMemmoveCall dst_p src_p bytes 1) - (getCode $ emitMemcpyCall dst_p src_p bytes 1) + (getCode $ emitMemmoveCall dst_p src_p bytes align) + (getCode $ emitMemcpyCall dst_p src_p bytes align) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyByteArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr - -> FCode ()) + -> Alignment -> FCode ()) -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () emitCopyByteArray copy src src_off dst dst_off n = do dflags <- getDynFlags + let byteArrayAlignment = wordAlignment dflags + srcOffAlignment = cmmExprAlignment src_off + dstOffAlignment = cmmExprAlignment dst_off + align = minimum [byteArrayAlignment, srcOffAlignment, dstOffAlignment] dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - copy src dst dst_p src_p n + copy src dst dst_p src_p n align -- | Takes a source 'ByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2075,7 +2079,7 @@ doCopyByteArrayToAddrOp src src_off dst_p bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- | Takes a source 'MutableByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2092,7 +2096,7 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- ---------------------------------------------------------------------------- @@ -2107,9 +2111,7 @@ doSetByteArrayOp ba off len c = do dflags <- getDynFlags let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap - offsetAlignment = case off of - CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) - _ -> mkAlignment 1 + offsetAlignment = cmmExprAlignment off align = min byteArrayAlignment offsetAlignment p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off @@ -2180,7 +2182,7 @@ doCopyArrayOp = emitCopyArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) -- | Takes a source 'MutableArray#', an offset in the source array, a @@ -2198,9 +2200,9 @@ doCopyMutableArrayOp = emitCopyArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2247,7 +2249,7 @@ doCopySmallArrayOp = emitCopySmallArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) doCopySmallMutableArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> WordOff @@ -2261,9 +2263,9 @@ doCopySmallMutableArrayOp = emitCopySmallArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopySmallArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2328,7 +2330,7 @@ emitCloneArray info_p res_r src src_off n = do (mkIntExpr dflags (arrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2365,7 +2367,7 @@ emitCloneSmallArray info_p res_r src src_off n = do (mkIntExpr dflags (smallArrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2493,19 +2495,19 @@ doCasByteArray res mba idx idx_ty old new = do -- Helpers for emitting function calls -- | Emit a call to @memcpy at . -emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemcpyCall dst src n align = do emitPrimCall [ {-no results-} ] - (MO_Memcpy align) + (MO_Memcpy (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memmove at . -emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemmoveCall dst src n align = do emitPrimCall [ {- no results -} ] - (MO_Memmove align) + (MO_Memmove (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memset at . The second argument must fit inside an ===================================== compiler/main/DynFlags.hs ===================================== @@ -384,6 +384,7 @@ data DumpFlag | Opt_D_dump_spec | Opt_D_dump_prep | Opt_D_dump_stg + | Opt_D_dump_stg_final | Opt_D_dump_call_arity | Opt_D_dump_exitify | Opt_D_dump_stranal @@ -3339,6 +3340,8 @@ dynamic_flags_deps = [ (setDumpFlag Opt_D_dump_prep) , make_ord_flag defGhcFlag "ddump-stg" (setDumpFlag Opt_D_dump_stg) + , make_ord_flag defGhcFlag "ddump-stg-final" + (setDumpFlag Opt_D_dump_stg_final) , make_ord_flag defGhcFlag "ddump-call-arity" (setDumpFlag Opt_D_dump_call_arity) , make_ord_flag defGhcFlag "ddump-exitify" ===================================== compiler/main/HscMain.hs ===================================== @@ -1470,6 +1470,8 @@ doCodeGen hsc_env this_mod data_tycons let dflags = hsc_dflags hsc_env let stg_binds_w_fvs = annTopBindingsFreeVars stg_binds + dumpIfSet_dyn dflags Opt_D_dump_stg_final + "STG for code gen:" (pprGenStgTopBindings stg_binds_w_fvs) let cmm_stream :: Stream IO CmmGroup () cmm_stream = {-# SCC "StgCmm" #-} StgCmm.codeGen dflags this_mod data_tycons ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -1767,12 +1767,11 @@ genCCall -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --- Unroll memcpy calls if the source and destination pointers are at --- least DWORD aligned and the number of bytes to copy isn't too +-- Unroll memcpy calls if the number of bytes to copy isn't too -- large. Otherwise, call C's memcpy. -genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ +genCCall dflags _ (PrimTarget (MO_Memcpy align)) _ [dst, src, CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemcpyInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemcpyInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format code_src <- getAnyReg src @@ -1785,7 +1784,9 @@ genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ -- instructions per move. insns = 2 * ((n + sizeBytes - 1) `div` sizeBytes) - format = if align .&. 4 /= 0 then II32 else (archWordFormat is32Bit) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment -- The size of each move, in bytes. sizeBytes :: Integer ===================================== compiler/typecheck/TcCanonical.hs ===================================== @@ -1015,7 +1015,7 @@ can_eq_nc_forall ev eq_rel s1 s2 -- Done: unify phi1 ~ phi2 go [] subst bndrs2 = ASSERT( null bndrs2 ) - unify loc (eqRelRole eq_rel) phi1' (substTy subst phi2) + unify loc (eqRelRole eq_rel) phi1' (substTyUnchecked subst phi2) go _ _ _ = panic "cna_eq_nc_forall" -- case (s:ss) [] ===================================== compiler/typecheck/TcSigs.hs ===================================== @@ -515,7 +515,7 @@ tcInstSig hs_sig@(PartialSig { psig_hs_ty = hs_ty , sig_inst_skols = tv_prs , sig_inst_wcs = wcs , sig_inst_wcx = wcx - , sig_inst_theta = substTys subst theta + , sig_inst_theta = substTysUnchecked subst theta , sig_inst_tau = substTyUnchecked subst tau } ; traceTc "End partial sig }" (ppr inst_sig) ; return inst_sig } ===================================== compiler/types/OptCoercion.hs ===================================== @@ -118,8 +118,8 @@ optCoercion' env co (Pair in_ty1 in_ty2, in_role) = coercionKindRole co (Pair out_ty1 out_ty2, out_role) = coercionKindRole out_co in - ASSERT2( substTy env in_ty1 `eqType` out_ty1 && - substTy env in_ty2 `eqType` out_ty2 && + ASSERT2( substTyUnchecked env in_ty1 `eqType` out_ty1 && + substTyUnchecked env in_ty2 `eqType` out_ty2 && in_role == out_role , text "optCoercion changed types!" $$ hang (text "in_co:") 2 (ppr co) ===================================== compiler/types/Type.hs ===================================== @@ -1044,7 +1044,7 @@ piResultTys ty orig_args@(arg:args) init_subst = mkEmptyTCvSubst $ mkInScopeSet (tyCoVarsOfTypes (ty:orig_args)) go :: TCvSubst -> Type -> [Type] -> Type - go subst ty [] = substTy subst ty + go subst ty [] = substTyUnchecked subst ty go subst ty all_args@(arg:args) | Just ty' <- coreView ty ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,10 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. -- Calls to `memset` are now unrolled more aggressively and the - produced code is more efficient on `x86_64` with added support for - 64-bit `MOV`s. In particular, `setByteArray#` calls that were not - optimized before, now will be. See :ghc-ticket:`16052`. +- Calls to `memset` and `memcpy` are now unrolled more aggressively + and the produced code is more efficient on `x86_64` with added + support for 64-bit `MOV`s. In particular, `setByteArray#` and + `copyByteArray#` calls that were not optimized before, now will + be. See :ghc-ticket:`16052`. Runtime system ~~~~~~~~~~~~~~ ===================================== docs/users_guide/debugging.rst ===================================== @@ -380,6 +380,11 @@ These flags dump various phases of GHC's STG pipeline. Show the output of the intermediate STG-to-STG pass. (*lots* of output!) +.. ghc-flag:: -ddump-stg-final + :shortdesc: Show output of last STG pass. + :type: dynamic + + Show the output of the last STG pass before we generate Cmm. C-\\- representation ~~~~~~~~~~~~~~~~~~~~ ===================================== hadrian/doc/make.md ===================================== @@ -123,6 +123,8 @@ time you fire up a build. This is not possible with the Make build system. make test # (1) make test TEST=plugins01 # (2) make test TEST="plugins01 plugins02" # (3) + make accept # (4) + PLATFORM=YES OS=YES make accept # (5) # Hadrian @@ -134,6 +136,12 @@ time you fire up a build. This is not possible with the Make build system. build test --only="plugins01 plugins02" # equivalent to (3) TEST="plugins01 plugins02" build test # equivalent to (3) TEST=plugins01 build test --only=plugins02 # equivalent to (3) + + build test -a # equivalent to (4) + build test --test-accept # equivalent to (4) + + PLATFORM=YES OS=YES build test -a # equivalent to (5) + PLATFORM=YES OS=YES build test --test-accept # equivalent to (5) ``` As illustrated in the examples above, you can use the `TEST` environment ===================================== hadrian/doc/testsuite.md ===================================== @@ -40,6 +40,31 @@ TEST="test1 test2" build test TEST="test1 test2" build test --only="test3 test4" ``` +## Accepting new output + +You can use the `-a` or `--test-accept` flag to "accept" the new +output of your tests. This has the effect of updating the expected +output of all the tests that fail due to mismatching output, so as to +consider the new output the correct one. + +When the `PLATFORM` environment variable is set to `YES`, passing this flag has +the effect of accepting the new output for the current platform. + +When the `OS` environment variable is set to `YES`, passing this flag has the +effect of accepting the new output for all word sizes on the current OS. + +``` sh +# accept new output for all tests +build test -a + +# just run and accept new output for 'test123' and 'test456' +build test -a --only="test123 test456" + +# accept new output for current platform and all word sizes for +# the current OS, for all tests +PLATFORM=YES OS=YES build test -a +``` + ## Performance tests You can use the `--only-perf` and `--skip-perf` flags to ===================================== hadrian/hadrian.cabal ===================================== @@ -66,6 +66,7 @@ executable hadrian , Rules.Nofib , Rules.Program , Rules.Register + , Rules.Rts , Rules.Selftest , Rules.SimpleTargets , Rules.SourceDist @@ -121,7 +122,7 @@ executable hadrian build-depends: base >= 4.8 && < 5 , Cabal >= 3.0 && < 3.1 , containers >= 0.5 && < 0.7 - , directory >= 1.2 && < 1.4 + , directory >= 1.3.1.0 && < 1.4 , extra >= 1.4.7 , filepath , mtl == 2.2.* ===================================== hadrian/src/CommandLine.hs ===================================== @@ -56,7 +56,8 @@ data TestArgs = TestArgs , testSpeed :: TestSpeed , testSummary :: Maybe FilePath , testVerbosity :: Maybe String - , testWays :: [String] } + , testWays :: [String] + , testAccept :: Bool} deriving (Eq, Show) -- | Default value for `TestArgs`. @@ -73,7 +74,8 @@ defaultTestArgs = TestArgs , testSpeed = TestNormal , testSummary = Nothing , testVerbosity = Nothing - , testWays = [] } + , testWays = [] + , testAccept = False } readConfigure :: Either String (CommandLineArgs -> CommandLineArgs) readConfigure = Right $ \flags -> flags { configure = True } @@ -124,6 +126,9 @@ readProgressInfo ms = readTestKeepFiles :: Either String (CommandLineArgs -> CommandLineArgs) readTestKeepFiles = Right $ \flags -> flags { testArgs = (testArgs flags) { testKeepFiles = True } } +readTestAccept :: Either String (CommandLineArgs -> CommandLineArgs) +readTestAccept = Right $ \flags -> flags { testArgs = (testArgs flags) { testAccept = True } } + readTestCompiler :: Maybe String -> Either String (CommandLineArgs -> CommandLineArgs) readTestCompiler compiler = maybe (Left "Cannot parse compiler") (Right . set) compiler where @@ -245,7 +250,8 @@ optDescrs = , Option [] ["test-verbose"] (OptArg readTestVerbose "TEST_VERBOSE") "A verbosity value between 0 and 5. 0 is silent, 4 and higher activates extra output." , Option [] ["test-way"] (OptArg readTestWay "TEST_WAY") - "only run these ways" ] + "only run these ways" + , Option ['a'] ["test-accept"] (NoArg readTestAccept) "Accept new output of tests" ] -- | A type-indexed map containing Hadrian command line arguments to be passed -- to Shake via 'shakeExtra'. ===================================== hadrian/src/Hadrian/Utilities.hs ===================================== @@ -16,8 +16,9 @@ module Hadrian.Utilities ( BuildRoot (..), buildRoot, buildRootRules, isGeneratedSource, -- * File system operations - copyFile, copyFileUntracked, fixFile, makeExecutable, moveFile, removeFile, - createDirectory, copyDirectory, moveDirectory, removeDirectory, + copyFile, copyFileUntracked, createFileLinkUntracked, fixFile, + makeExecutable, moveFile, removeFile, createDirectory, copyDirectory, + moveDirectory, removeDirectory, -- * Diagnostic info UseColour (..), Colour (..), ANSIColour (..), putColoured, @@ -288,6 +289,14 @@ infixl 1 <&> isGeneratedSource :: FilePath -> Action Bool isGeneratedSource file = buildRoot <&> (`isPrefixOf` file) +-- | Link a file tracking the source. Create the target directory if missing. +createFileLinkUntracked :: FilePath -> FilePath -> Action () +createFileLinkUntracked linkTarget link = do + let dir = takeDirectory linkTarget + liftIO $ IO.createDirectoryIfMissing True dir + putProgressInfo =<< renderCreateFileLink linkTarget link + quietly . liftIO $ IO.createFileLink linkTarget link + -- | Copy a file tracking the source. Create the target directory if missing. copyFile :: FilePath -> FilePath -> Action () copyFile source target = do @@ -460,8 +469,12 @@ renderAction what input output = do return $ case progressInfo of None -> "" Brief -> "| " ++ what ++ ": " ++ i ++ " => " ++ o - Normal -> renderBox [ what, " input: " ++ i, " => output: " ++ o ] - Unicorn -> renderUnicorn [ what, " input: " ++ i, " => output: " ++ o ] + Normal -> renderBox [ what + , " input: " ++ i + , " => output: " ++ o ] + Unicorn -> renderUnicorn [ what + , " input: " ++ i + , " => output: " ++ o ] where i = unifyPath input o = unifyPath output @@ -478,6 +491,24 @@ renderActionNoOutput what input = do where i = unifyPath input +-- | Render creating a file link. +renderCreateFileLink :: String -> FilePath -> Action String +renderCreateFileLink linkTarget link' = do + progressInfo <- userSetting Brief + let what = "Creating file link" + linkString = link ++ " -> " ++ linkTarget + return $ case progressInfo of + None -> "" + Brief -> "| " ++ what ++ ": " ++ linkString + Normal -> renderBox [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + Unicorn -> renderUnicorn [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + where + link = unifyPath link' + -- | Render the successful build of a program. renderProgram :: String -> String -> String -> String renderProgram name bin synopsis = renderBox $ ===================================== hadrian/src/Rules.hs ===================================== @@ -21,6 +21,7 @@ import qualified Rules.Libffi import qualified Rules.Library import qualified Rules.Program import qualified Rules.Register +import qualified Rules.Rts import qualified Rules.SimpleTargets import Settings import Target @@ -158,6 +159,7 @@ buildRules = do Rules.Gmp.gmpRules Rules.Libffi.libffiRules Rules.Library.libraryRules + Rules.Rts.rtsRules packageRules oracleRules :: Rules () ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -97,7 +97,7 @@ other, the install script: bindistRules :: Rules () bindistRules = do root <- buildRootRules - phony "binary-dist" $ do + phony "binary-dist-dir" $ do -- We 'need' all binaries and libraries targets <- mapM pkgTarget =<< stagePackages Stage1 need targets @@ -150,6 +150,16 @@ bindistRules = do , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] + + phony "binary-dist" $ do + + need ["binary-dist-dir"] + + version <- setting ProjectVersion + targetPlatform <- setting TargetPlatformFull + + let ghcVersionPretty = "ghc-" ++ version ++ "-" ++ targetPlatform + -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz tarPath <- builderPath (Tar Create) cmd [Cwd $ root -/- "bindist"] tarPath ===================================== hadrian/src/Rules/Register.hs ===================================== @@ -8,6 +8,7 @@ import Hadrian.Haskell.Cabal import Oracles.Setting import Packages import Rules.Gmp +import Rules.Rts import Settings import Target import Utilities @@ -117,6 +118,9 @@ buildConf _ context at Context {..} conf = do Cabal.copyPackage context Cabal.registerPackage context + -- | Dynamic RTS library files need symlinks (Rules.Rts.rtsRules). + when (package == rts) (needRtsSymLinks stage ways) + -- The above two steps produce an entry in the package database, with copies -- of many of the files we have build, e.g. Haskell interface files. We need -- to record this side effect so that Shake can cache these files too. ===================================== hadrian/src/Rules/Rts.hs ===================================== @@ -0,0 +1,54 @@ +module Rules.Rts (rtsRules, needRtsSymLinks) where + +import Packages (rts) +import Hadrian.Utilities +import Settings.Builders.Common + +-- | Dynamic RTS library files need symlinks without the dummy version number. +-- This is for backwards compatibility (the old make build system omitted the +-- dummy version number). +-- This rule has priority 2 to override the general rule for generating share +-- library files (see Rules.Library.libraryRules). +rtsRules :: Rules () +rtsRules = priority 2 $ do + root <- buildRootRules + [ root -/- "//libHSrts_*-ghc*.so", + root -/- "//libHSrts_*-ghc*.dylib", + root -/- "//libHSrts-ghc*.so", + root -/- "//libHSrts-ghc*.dylib"] + |%> \ rtsLibFilePath' -> createFileLinkUntracked + (addRtsDummyVersion $ takeFileName rtsLibFilePath') + rtsLibFilePath' + +-- Need symlinks generated by rtsRules. +needRtsSymLinks :: Stage -> [Way] -> Action () +needRtsSymLinks stage rtsWays + = forM_ (filter (wayUnit Dynamic) rtsWays) $ \ way -> do + let ctx = Context stage rts way + libPath <- libPath ctx + distDir <- distDir stage + rtsLibFile <- takeFileName <$> pkgLibraryFile ctx + need [removeRtsDummyVersion (libPath distDir rtsLibFile)] + +prefix, versionlessPrefix :: String +versionlessPrefix = "libHSrts" +prefix = versionlessPrefix ++ "-1.0" + +-- removeRtsDummyVersion "a/libHSrts-1.0-ghc1.2.3.4.so" +-- == "a/libHSrts-ghc1.2.3.4.so" +removeRtsDummyVersion :: FilePath -> FilePath +removeRtsDummyVersion = replaceLibFilePrefix prefix versionlessPrefix + +-- addRtsDummyVersion "a/libHSrts-ghc1.2.3.4.so" +-- == "a/libHSrts-1.0-ghc1.2.3.4.so" +addRtsDummyVersion :: FilePath -> FilePath +addRtsDummyVersion = replaceLibFilePrefix versionlessPrefix prefix + +replaceLibFilePrefix :: String -> String -> FilePath -> FilePath +replaceLibFilePrefix oldPrefix newPrefix oldFilePath = let + oldFileName = takeFileName oldFilePath + newFileName = maybe + (error $ "Expected RTS library file to start with " ++ oldPrefix) + (newPrefix ++) + (stripPrefix oldPrefix oldFileName) + in replaceFileName oldFilePath newFileName \ No newline at end of file ===================================== hadrian/src/Settings/Builders/RunTest.hs ===================================== @@ -71,6 +71,11 @@ runTestBuilderArgs = builder RunTest ? do debugged <- read <$> getTestSetting TestGhcDebugged keepFiles <- expr (testKeepFiles <$> userSetting defaultTestArgs) + accept <- expr (testAccept <$> userSetting defaultTestArgs) + (acceptPlatform, acceptOS) <- expr . liftIO $ + (,) <$> (maybe False (=="YES") <$> lookupEnv "PLATFORM") + <*> (maybe False (=="YES") <$> lookupEnv "OS") + windows <- expr windowsHost darwin <- expr osxHost threads <- shakeThreads <$> expr getShakeOptions @@ -95,6 +100,9 @@ runTestBuilderArgs = builder RunTest ? do , arg "-e", arg $ "darwin=" ++ show darwin , arg "-e", arg $ "config.local=False" , arg "-e", arg $ "config.cleanup=" ++ show (not keepFiles) + , arg "-e", arg $ "config.accept=" ++ show accept + , arg "-e", arg $ "config.accept_platform=" ++ show acceptPlatform + , arg "-e", arg $ "config.accept_os=" ++ show acceptOS , arg "-e", arg $ "config.exeext=" ++ quote exe , arg "-e", arg $ "config.compiler_debugged=" ++ quote (yesNo debugged) , arg "-e", arg $ "ghc_debugged=" ++ quote (yesNo debugged) ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -8,3 +8,4 @@ test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) +test('bytearray-memcpy-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm ===================================== @@ -0,0 +1,8 @@ +movw 0(%rax),%dx +movw %dx,0(%rcx) +movw 2(%rax),%dx +movw %dx,2(%rcx) +movw 4(%rax),%dx +movw %dx,4(%rcx) +movw 6(%rax),%ax +movw %ax,6(%rcx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs ===================================== @@ -0,0 +1,19 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module CopyArray + ( smallCopy + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +-- Does an 8 byte copy with sub-word (2 bytes) alignment +-- Should be unrolled into 4 aligned stores (MOVWs) +smallCopy :: ByteArray -> IO ByteArray +smallCopy (ByteArray ba) = IO $ \s0 -> case newByteArray# 8# s0 of + (# s1, mut #) -> case copyByteArray# ba 2# mut 0# 8# s1 of + s2 -> case unsafeFreezeByteArray# mut s2 of + (# s3, frozen #) -> (# s3, ByteArray frozen #) ===================================== testsuite/tests/dynlibs/Makefile ===================================== @@ -9,6 +9,11 @@ T3807: $(RM) T3807-export.o T3807-load.o $(RM) T3807test.so $(RM) T3807-load + + # GHC does not automatically link with the RTS when building shared + # libraries. This is done to allow the RTS flavour to be chosen later (i.e. + # when linking an executable). + # Hence we must explicitly linking with the RTS here. '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -v0 --make -dynamic -fPIC -shared T3807Export.hs T3807-export.c -o T3807test.so -lHSrts-ghc`'$(TEST_HC)' $(TEST_HC_OPTS) --numeric-version` '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -no-auto-link-packages -no-hs-main T3807-load.c -o T3807-load -ldl ./T3807-load View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/3d19895f2e5739f7a96c69793b469bf521410f40...851ee22f379ca4840e75e771832cdacd89ce5693 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/3d19895f2e5739f7a96c69793b469bf521410f40...851ee22f379ca4840e75e771832cdacd89ce5693 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 05:14:20 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 01:14:20 -0400 Subject: [Git][ghc/ghc][master] Hadrian: add rts shared library symlinks for backwards compatability Message-ID: <5cb2c1ac3c74a_62b33fa2cc9e1db8337396d@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 5f183081 by David Eichmann at 2019-04-14T05:08:15Z Hadrian: add rts shared library symlinks for backwards compatability Fixes test T3807 when building with Hadrian. Trac #16370 - - - - - 6 changed files: - hadrian/hadrian.cabal - hadrian/src/Hadrian/Utilities.hs - hadrian/src/Rules.hs - hadrian/src/Rules/Register.hs - + hadrian/src/Rules/Rts.hs - testsuite/tests/dynlibs/Makefile Changes: ===================================== hadrian/hadrian.cabal ===================================== @@ -66,6 +66,7 @@ executable hadrian , Rules.Nofib , Rules.Program , Rules.Register + , Rules.Rts , Rules.Selftest , Rules.SimpleTargets , Rules.SourceDist @@ -121,7 +122,7 @@ executable hadrian build-depends: base >= 4.8 && < 5 , Cabal >= 3.0 && < 3.1 , containers >= 0.5 && < 0.7 - , directory >= 1.2 && < 1.4 + , directory >= 1.3.1.0 && < 1.4 , extra >= 1.4.7 , filepath , mtl == 2.2.* ===================================== hadrian/src/Hadrian/Utilities.hs ===================================== @@ -16,8 +16,9 @@ module Hadrian.Utilities ( BuildRoot (..), buildRoot, buildRootRules, isGeneratedSource, -- * File system operations - copyFile, copyFileUntracked, fixFile, makeExecutable, moveFile, removeFile, - createDirectory, copyDirectory, moveDirectory, removeDirectory, + copyFile, copyFileUntracked, createFileLinkUntracked, fixFile, + makeExecutable, moveFile, removeFile, createDirectory, copyDirectory, + moveDirectory, removeDirectory, -- * Diagnostic info UseColour (..), Colour (..), ANSIColour (..), putColoured, @@ -288,6 +289,14 @@ infixl 1 <&> isGeneratedSource :: FilePath -> Action Bool isGeneratedSource file = buildRoot <&> (`isPrefixOf` file) +-- | Link a file tracking the source. Create the target directory if missing. +createFileLinkUntracked :: FilePath -> FilePath -> Action () +createFileLinkUntracked linkTarget link = do + let dir = takeDirectory linkTarget + liftIO $ IO.createDirectoryIfMissing True dir + putProgressInfo =<< renderCreateFileLink linkTarget link + quietly . liftIO $ IO.createFileLink linkTarget link + -- | Copy a file tracking the source. Create the target directory if missing. copyFile :: FilePath -> FilePath -> Action () copyFile source target = do @@ -460,8 +469,12 @@ renderAction what input output = do return $ case progressInfo of None -> "" Brief -> "| " ++ what ++ ": " ++ i ++ " => " ++ o - Normal -> renderBox [ what, " input: " ++ i, " => output: " ++ o ] - Unicorn -> renderUnicorn [ what, " input: " ++ i, " => output: " ++ o ] + Normal -> renderBox [ what + , " input: " ++ i + , " => output: " ++ o ] + Unicorn -> renderUnicorn [ what + , " input: " ++ i + , " => output: " ++ o ] where i = unifyPath input o = unifyPath output @@ -478,6 +491,24 @@ renderActionNoOutput what input = do where i = unifyPath input +-- | Render creating a file link. +renderCreateFileLink :: String -> FilePath -> Action String +renderCreateFileLink linkTarget link' = do + progressInfo <- userSetting Brief + let what = "Creating file link" + linkString = link ++ " -> " ++ linkTarget + return $ case progressInfo of + None -> "" + Brief -> "| " ++ what ++ ": " ++ linkString + Normal -> renderBox [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + Unicorn -> renderUnicorn [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + where + link = unifyPath link' + -- | Render the successful build of a program. renderProgram :: String -> String -> String -> String renderProgram name bin synopsis = renderBox $ ===================================== hadrian/src/Rules.hs ===================================== @@ -21,6 +21,7 @@ import qualified Rules.Libffi import qualified Rules.Library import qualified Rules.Program import qualified Rules.Register +import qualified Rules.Rts import qualified Rules.SimpleTargets import Settings import Target @@ -158,6 +159,7 @@ buildRules = do Rules.Gmp.gmpRules Rules.Libffi.libffiRules Rules.Library.libraryRules + Rules.Rts.rtsRules packageRules oracleRules :: Rules () ===================================== hadrian/src/Rules/Register.hs ===================================== @@ -8,6 +8,7 @@ import Hadrian.Haskell.Cabal import Oracles.Setting import Packages import Rules.Gmp +import Rules.Rts import Settings import Target import Utilities @@ -117,6 +118,9 @@ buildConf _ context at Context {..} conf = do Cabal.copyPackage context Cabal.registerPackage context + -- | Dynamic RTS library files need symlinks (Rules.Rts.rtsRules). + when (package == rts) (needRtsSymLinks stage ways) + -- The above two steps produce an entry in the package database, with copies -- of many of the files we have build, e.g. Haskell interface files. We need -- to record this side effect so that Shake can cache these files too. ===================================== hadrian/src/Rules/Rts.hs ===================================== @@ -0,0 +1,54 @@ +module Rules.Rts (rtsRules, needRtsSymLinks) where + +import Packages (rts) +import Hadrian.Utilities +import Settings.Builders.Common + +-- | Dynamic RTS library files need symlinks without the dummy version number. +-- This is for backwards compatibility (the old make build system omitted the +-- dummy version number). +-- This rule has priority 2 to override the general rule for generating share +-- library files (see Rules.Library.libraryRules). +rtsRules :: Rules () +rtsRules = priority 2 $ do + root <- buildRootRules + [ root -/- "//libHSrts_*-ghc*.so", + root -/- "//libHSrts_*-ghc*.dylib", + root -/- "//libHSrts-ghc*.so", + root -/- "//libHSrts-ghc*.dylib"] + |%> \ rtsLibFilePath' -> createFileLinkUntracked + (addRtsDummyVersion $ takeFileName rtsLibFilePath') + rtsLibFilePath' + +-- Need symlinks generated by rtsRules. +needRtsSymLinks :: Stage -> [Way] -> Action () +needRtsSymLinks stage rtsWays + = forM_ (filter (wayUnit Dynamic) rtsWays) $ \ way -> do + let ctx = Context stage rts way + libPath <- libPath ctx + distDir <- distDir stage + rtsLibFile <- takeFileName <$> pkgLibraryFile ctx + need [removeRtsDummyVersion (libPath distDir rtsLibFile)] + +prefix, versionlessPrefix :: String +versionlessPrefix = "libHSrts" +prefix = versionlessPrefix ++ "-1.0" + +-- removeRtsDummyVersion "a/libHSrts-1.0-ghc1.2.3.4.so" +-- == "a/libHSrts-ghc1.2.3.4.so" +removeRtsDummyVersion :: FilePath -> FilePath +removeRtsDummyVersion = replaceLibFilePrefix prefix versionlessPrefix + +-- addRtsDummyVersion "a/libHSrts-ghc1.2.3.4.so" +-- == "a/libHSrts-1.0-ghc1.2.3.4.so" +addRtsDummyVersion :: FilePath -> FilePath +addRtsDummyVersion = replaceLibFilePrefix versionlessPrefix prefix + +replaceLibFilePrefix :: String -> String -> FilePath -> FilePath +replaceLibFilePrefix oldPrefix newPrefix oldFilePath = let + oldFileName = takeFileName oldFilePath + newFileName = maybe + (error $ "Expected RTS library file to start with " ++ oldPrefix) + (newPrefix ++) + (stripPrefix oldPrefix oldFileName) + in replaceFileName oldFilePath newFileName \ No newline at end of file ===================================== testsuite/tests/dynlibs/Makefile ===================================== @@ -9,6 +9,11 @@ T3807: $(RM) T3807-export.o T3807-load.o $(RM) T3807test.so $(RM) T3807-load + + # GHC does not automatically link with the RTS when building shared + # libraries. This is done to allow the RTS flavour to be chosen later (i.e. + # when linking an executable). + # Hence we must explicitly linking with the RTS here. '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -v0 --make -dynamic -fPIC -shared T3807Export.hs T3807-export.c -o T3807test.so -lHSrts-ghc`'$(TEST_HC)' $(TEST_HC_OPTS) --numeric-version` '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -no-auto-link-packages -no-hs-main T3807-load.c -o T3807-load -ldl ./T3807-load View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/5f1830817b90960d5d11bee95a99df3e1425f8ab -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/5f1830817b90960d5d11bee95a99df3e1425f8ab You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 05:20:27 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 01:20:27 -0400 Subject: [Git][ghc/ghc][master] Hadrian: add binary-dist-dir target Message-ID: <5cb2c31b8191b_62b33fa2ea7ac1ec3376262@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 9b142c53 by Sylvain Henry at 2019-04-14T05:14:23Z Hadrian: add binary-dist-dir target This patch adds an Hadrian target "binary-dist-dir". Compared to "binary-dist", it only builds a binary distribution directory without creating the Tar archive. It makes the use/test of the bindist installation script easier. - - - - - 1 changed file: - hadrian/src/Rules/BinaryDist.hs Changes: ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -97,7 +97,7 @@ other, the install script: bindistRules :: Rules () bindistRules = do root <- buildRootRules - phony "binary-dist" $ do + phony "binary-dist-dir" $ do -- We 'need' all binaries and libraries targets <- mapM pkgTarget =<< stagePackages Stage1 need targets @@ -150,6 +150,16 @@ bindistRules = do , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] + + phony "binary-dist" $ do + + need ["binary-dist-dir"] + + version <- setting ProjectVersion + targetPlatform <- setting TargetPlatformFull + + let ghcVersionPretty = "ghc-" ++ version ++ "-" ++ targetPlatform + -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz tarPath <- builderPath (Tar Create) cmd [Cwd $ root -/- "bindist"] tarPath View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9b142c53325ffee6e3eef55daabefe9e2881f9e9 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/9b142c53325ffee6e3eef55daabefe9e2881f9e9 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 05:20:32 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 01:20:32 -0400 Subject: [Git][ghc/ghc][wip/T16533] 40 commits: testsuite: Add testcase for #16111 Message-ID: <5cb2c320a4eee_62b33fa2cc9e1db83376689@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/T16533 at Glasgow Haskell Compiler / GHC Commits: 6c0dd085 by Ben Gamari at 2019-04-04T12:12:24Z testsuite: Add testcase for #16111 - - - - - cbb88865 by klebinger.andreas at gmx.at at 2019-04-04T12:12:25Z Restore Xmm registers properly in StgCRun.c This fixes #16514: Xmm6-15 was restored based off rax instead of rsp. The code was introduced in the fix for #14619. - - - - - 33b0a291 by Ryan Scott at 2019-04-04T12:12:28Z Tweak error messages for narrowly-kinded assoc default decls This program, from #13971, currently has a rather confusing error message: ```hs class C a where type T a :: k type T a = Int ``` ``` • Kind mis-match on LHS of default declaration for ‘T’ • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` It's not at all obvious why GHC is complaining about the LHS until you realize that the default, when printed with `-fprint-explicit-kinds`, is actually `type T @{k} @* a = Int`. That is to say, the kind of `a` is being instantiated to `Type`, whereas it ought to be a kind variable. The primary thrust of this patch is to weak the error message to make this connection more obvious: ``` • Illegal argument ‘*’ in: ‘type T @{k} @* a = Int’ The arguments to ‘T’ must all be type variables • In the default type instance declaration for ‘T’ In the class declaration for ‘C’ ``` Along the way, I performed some code cleanup suggested by @rae in https://gitlab.haskell.org/ghc/ghc/issues/13971#note_191287. Before, we were creating a substitution from the default declaration's type variables to the type family tycon's type variables by way of `tcMatchTys`. But this is overkill, since we already know (from the aforementioned validity checking) that all the arguments in a default declaration must be type variables anyway. Therefore, creating the substitution is as simple as using `zipTvSubst`. I took the opportunity to perform this refactoring while I was in town. Fixes #13971. - - - - - 3a38ea44 by Eric Crockett at 2019-04-07T19:21:59Z Fix #16282. Previously, -W(all-)missed-specs was created with 'NoReason', so no information about the flag was printed along with the warning. Now, -Wall-missed-specs is listed as the Reason if it was set, otherwise -Wmissed-specs is listed as the reason. - - - - - 63b7d5fb by Michal Terepeta at 2019-04-08T18:29:34Z Generate straightline code for inline array allocation GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`). However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC. Signed-off-by: Michal Terepeta <michal.terepeta at gmail.com> - - - - - 2b3f4718 by Phuong Trinh at 2019-04-08T18:35:43Z Fix #16500: look for interface files in -hidir flag in OneShot mode We are currently ignoring options set in the hiDir field of hsc_dflags when looking for interface files while compiling in OneShot mode. This is inconsistent with the behaviour of other directory redirecting fields (such as objectDir or hieDir). It is also inconsistent with the behaviour of compilation in CompManager mode (a.k.a `ghc --make`) which looks for interface files in the directory set in hidir flag. This changes Finder.hs so that we use the value of hiDir while looking for interface in OneShot mode. - - - - - 97502be8 by Yuriy Syrovetskiy at 2019-04-08T18:41:51Z Add `-optcxx` option (#16477) - - - - - 97d3d546 by Ben Gamari at 2019-04-08T18:47:54Z testsuite: Unmark T16190 as broken Was broken via #16389 yet strangely it has started passing despite the fact that the suggested root cause has not changed. - - - - - a42d206a by Yuriy Syrovetskiy at 2019-04-08T18:54:02Z Fix whitespace style - - - - - 4dda2270 by Matthew Pickering at 2019-04-08T19:00:08Z Use ./hadrian/ghci.sh in .ghcid - - - - - d236d9d0 by Sebastian Graf at 2019-04-08T19:06:15Z Make `singleConstructor` cope with pattern synonyms Previously, `singleConstructor` didn't handle singleton `COMPLETE` sets of a single pattern synonym, resulting in incomplete pattern warnings in #15753. This is fixed by making `singleConstructor` (now named `singleMatchConstructor`) query `allCompleteMatches`, necessarily making it effectful. As a result, most of this patch is concerned with threading the side-effect through to `singleMatchConstructor`. Unfortunately, this is not enough to completely fix the original reproduction from #15753 and #15884, which are related to function applications in pattern guards being translated too conservatively. - - - - - 1085090e by Ömer Sinan Ağacan at 2019-04-08T19:12:22Z Skip test ArithInt16 and ArithWord16 in GHCi way These tests use unboxed tuples, which GHCi doesn't support - - - - - 7287bb9e by Ömer Sinan Ağacan at 2019-04-08T19:18:33Z testsuite: Show exit code of GHCi tests on failure - - - - - f5604d37 by John Ericson at 2019-04-08T19:24:43Z settings.in: Reformat We're might be about to switch to generating it in Hadrian/Make. This reformat makes it easier to programmingmatically generate and end up with the exact same thing, which is good for diffing to ensure no regressions. I had this as part of !712, but given the difficulty of satisfying CI, I figured I should break things up even further. - - - - - cf9e1837 by Ryan Scott at 2019-04-08T19:30:51Z Bump hpc submodule Currently, the `hpc` submodule is pinned against the `wip/final-mfp` branch, not against `master`. This pins it back against `master`. - - - - - 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 5f183081 by David Eichmann at 2019-04-14T05:08:15Z Hadrian: add rts shared library symlinks for backwards compatability Fixes test T3807 when building with Hadrian. Trac #16370 - - - - - 9b142c53 by Sylvain Henry at 2019-04-14T05:14:23Z Hadrian: add binary-dist-dir target This patch adds an Hadrian target "binary-dist-dir". Compared to "binary-dist", it only builds a binary distribution directory without creating the Tar archive. It makes the use/test of the bindist installation script easier. - - - - - 6febc444 by Krzysztof Gogolewski at 2019-04-14T05:20:29Z Fix assertion failures reported in #16533 - - - - - 30 changed files: - .ghcid - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - .gitlab/win32-init.sh - README.md - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/deSugar/Check.hs - compiler/ghci/ByteCodeLink.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DriverPhases.hs - compiler/main/DriverPipeline.hs - compiler/main/DynFlags.hs - compiler/main/Finder.hs - compiler/main/HscMain.hs - compiler/main/HscTypes.hs - compiler/main/SysTools.hs - compiler/main/SysTools/ExtraObj.hs - compiler/main/SysTools/Tasks.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/7aebf304c5e02a1f212e6b00f30dc0c75a974bde...6febc444c0abea6c033174aa0e813c950b9b2877 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/7aebf304c5e02a1f212e6b00f30dc0c75a974bde...6febc444c0abea6c033174aa0e813c950b9b2877 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 05:26:33 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 01:26:33 -0400 Subject: [Git][ghc/ghc][master] Fix assertion failures reported in #16533 Message-ID: <5cb2c48990b55_62b33fa2ef5353f03379550@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 6febc444 by Krzysztof Gogolewski at 2019-04-14T05:20:29Z Fix assertion failures reported in #16533 - - - - - 4 changed files: - compiler/typecheck/TcCanonical.hs - compiler/typecheck/TcSigs.hs - compiler/types/OptCoercion.hs - compiler/types/Type.hs Changes: ===================================== compiler/typecheck/TcCanonical.hs ===================================== @@ -1015,7 +1015,7 @@ can_eq_nc_forall ev eq_rel s1 s2 -- Done: unify phi1 ~ phi2 go [] subst bndrs2 = ASSERT( null bndrs2 ) - unify loc (eqRelRole eq_rel) phi1' (substTy subst phi2) + unify loc (eqRelRole eq_rel) phi1' (substTyUnchecked subst phi2) go _ _ _ = panic "cna_eq_nc_forall" -- case (s:ss) [] ===================================== compiler/typecheck/TcSigs.hs ===================================== @@ -515,7 +515,7 @@ tcInstSig hs_sig@(PartialSig { psig_hs_ty = hs_ty , sig_inst_skols = tv_prs , sig_inst_wcs = wcs , sig_inst_wcx = wcx - , sig_inst_theta = substTys subst theta + , sig_inst_theta = substTysUnchecked subst theta , sig_inst_tau = substTyUnchecked subst tau } ; traceTc "End partial sig }" (ppr inst_sig) ; return inst_sig } ===================================== compiler/types/OptCoercion.hs ===================================== @@ -118,8 +118,8 @@ optCoercion' env co (Pair in_ty1 in_ty2, in_role) = coercionKindRole co (Pair out_ty1 out_ty2, out_role) = coercionKindRole out_co in - ASSERT2( substTy env in_ty1 `eqType` out_ty1 && - substTy env in_ty2 `eqType` out_ty2 && + ASSERT2( substTyUnchecked env in_ty1 `eqType` out_ty1 && + substTyUnchecked env in_ty2 `eqType` out_ty2 && in_role == out_role , text "optCoercion changed types!" $$ hang (text "in_co:") 2 (ppr co) ===================================== compiler/types/Type.hs ===================================== @@ -1044,7 +1044,7 @@ piResultTys ty orig_args@(arg:args) init_subst = mkEmptyTCvSubst $ mkInScopeSet (tyCoVarsOfTypes (ty:orig_args)) go :: TCvSubst -> Type -> [Type] -> Type - go subst ty [] = substTy subst ty + go subst ty [] = substTyUnchecked subst ty go subst ty all_args@(arg:args) | Just ty' <- coreView ty View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/6febc444c0abea6c033174aa0e813c950b9b2877 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/6febc444c0abea6c033174aa0e813c950b9b2877 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 05:32:41 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 01:32:41 -0400 Subject: [Git][ghc/ghc][master] 2 commits: codegen: unroll memcpy calls for small bytearrays Message-ID: <5cb2c5f9e7ae3_62b33fa2ea7ac1ec3382024@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: edcef7b3 by Artem Pyanykh at 2019-04-14T05:26:35Z codegen: unroll memcpy calls for small bytearrays - - - - - 6094d43f by Artem Pyanykh at 2019-04-14T05:26:35Z docs: mention memcpy optimization for ByteArrays in 8.10.1-notes - - - - - 7 changed files: - compiler/cmm/CmmExpr.hs - compiler/codeGen/StgCmmPrim.hs - compiler/nativeGen/X86/CodeGen.hs - docs/users_guide/8.10.1-notes.rst - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs Changes: ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -5,7 +5,7 @@ {-# LANGUAGE UndecidableInstances #-} module CmmExpr - ( CmmExpr(..), cmmExprType, cmmExprWidth, maybeInvertCmmExpr + ( CmmExpr(..), cmmExprType, cmmExprWidth, cmmExprAlignment, maybeInvertCmmExpr , CmmReg(..), cmmRegType, cmmRegWidth , CmmLit(..), cmmLitType , LocalReg(..), localRegType @@ -43,6 +43,8 @@ import Unique import Data.Set (Set) import qualified Data.Set as Set +import BasicTypes (Alignment, mkAlignment, alignmentOf) + ----------------------------------------------------------------------------- -- CmmExpr -- An expression. Expressions have no side effects. @@ -239,6 +241,13 @@ cmmLabelType dflags lbl cmmExprWidth :: DynFlags -> CmmExpr -> Width cmmExprWidth dflags e = typeWidth (cmmExprType dflags e) +-- | Returns an alignment in bytes of a CmmExpr when it's a statically +-- known integer constant, otherwise returns an alignment of 1 byte. +-- The caller is responsible for using with a sensible CmmExpr +-- argument. +cmmExprAlignment :: CmmExpr -> Alignment +cmmExprAlignment (CmmLit (CmmInt intOff _)) = alignmentOf (fromInteger intOff) +cmmExprAlignment _ = mkAlignment 1 -------- --- Negation for conditional branches ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2035,8 +2035,8 @@ doCopyByteArrayOp = emitCopyByteArray copy where -- Copy data (we assume the arrays aren't overlapping since -- they're of different types) - copy _src _dst dst_p src_p bytes = - emitMemcpyCall dst_p src_p bytes 1 + copy _src _dst dst_p src_p bytes align = + emitMemcpyCall dst_p src_p bytes align -- | Takes a source 'MutableByteArray#', an offset in the source -- array, a destination 'MutableByteArray#', an offset into the @@ -2050,22 +2050,26 @@ doCopyMutableByteArrayOp = emitCopyByteArray copy -- The only time the memory might overlap is when the two arrays -- we were provided are the same array! -- TODO: Optimize branch for common case of no aliasing. - copy src dst dst_p src_p bytes = do + copy src dst dst_p src_p bytes align = do dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair - (getCode $ emitMemmoveCall dst_p src_p bytes 1) - (getCode $ emitMemcpyCall dst_p src_p bytes 1) + (getCode $ emitMemmoveCall dst_p src_p bytes align) + (getCode $ emitMemcpyCall dst_p src_p bytes align) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyByteArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr - -> FCode ()) + -> Alignment -> FCode ()) -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () emitCopyByteArray copy src src_off dst dst_off n = do dflags <- getDynFlags + let byteArrayAlignment = wordAlignment dflags + srcOffAlignment = cmmExprAlignment src_off + dstOffAlignment = cmmExprAlignment dst_off + align = minimum [byteArrayAlignment, srcOffAlignment, dstOffAlignment] dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - copy src dst dst_p src_p n + copy src dst dst_p src_p n align -- | Takes a source 'ByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2075,7 +2079,7 @@ doCopyByteArrayToAddrOp src src_off dst_p bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- | Takes a source 'MutableByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2092,7 +2096,7 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- ---------------------------------------------------------------------------- @@ -2107,9 +2111,7 @@ doSetByteArrayOp ba off len c = do dflags <- getDynFlags let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap - offsetAlignment = case off of - CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) - _ -> mkAlignment 1 + offsetAlignment = cmmExprAlignment off align = min byteArrayAlignment offsetAlignment p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off @@ -2180,7 +2182,7 @@ doCopyArrayOp = emitCopyArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) -- | Takes a source 'MutableArray#', an offset in the source array, a @@ -2198,9 +2200,9 @@ doCopyMutableArrayOp = emitCopyArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2247,7 +2249,7 @@ doCopySmallArrayOp = emitCopySmallArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) doCopySmallMutableArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> WordOff @@ -2261,9 +2263,9 @@ doCopySmallMutableArrayOp = emitCopySmallArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopySmallArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2328,7 +2330,7 @@ emitCloneArray info_p res_r src src_off n = do (mkIntExpr dflags (arrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2365,7 +2367,7 @@ emitCloneSmallArray info_p res_r src src_off n = do (mkIntExpr dflags (smallArrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2493,19 +2495,19 @@ doCasByteArray res mba idx idx_ty old new = do -- Helpers for emitting function calls -- | Emit a call to @memcpy at . -emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemcpyCall dst src n align = do emitPrimCall [ {-no results-} ] - (MO_Memcpy align) + (MO_Memcpy (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memmove at . -emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemmoveCall dst src n align = do emitPrimCall [ {- no results -} ] - (MO_Memmove align) + (MO_Memmove (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memset at . The second argument must fit inside an ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -1767,12 +1767,11 @@ genCCall -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --- Unroll memcpy calls if the source and destination pointers are at --- least DWORD aligned and the number of bytes to copy isn't too +-- Unroll memcpy calls if the number of bytes to copy isn't too -- large. Otherwise, call C's memcpy. -genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ +genCCall dflags _ (PrimTarget (MO_Memcpy align)) _ [dst, src, CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemcpyInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemcpyInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format code_src <- getAnyReg src @@ -1785,7 +1784,9 @@ genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ -- instructions per move. insns = 2 * ((n + sizeBytes - 1) `div` sizeBytes) - format = if align .&. 4 /= 0 then II32 else (archWordFormat is32Bit) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment -- The size of each move, in bytes. sizeBytes :: Integer ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,10 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. -- Calls to `memset` are now unrolled more aggressively and the - produced code is more efficient on `x86_64` with added support for - 64-bit `MOV`s. In particular, `setByteArray#` calls that were not - optimized before, now will be. See :ghc-ticket:`16052`. +- Calls to `memset` and `memcpy` are now unrolled more aggressively + and the produced code is more efficient on `x86_64` with added + support for 64-bit `MOV`s. In particular, `setByteArray#` and + `copyByteArray#` calls that were not optimized before, now will + be. See :ghc-ticket:`16052`. Runtime system ~~~~~~~~~~~~~~ ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -8,3 +8,4 @@ test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) +test('bytearray-memcpy-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm ===================================== @@ -0,0 +1,8 @@ +movw 0(%rax),%dx +movw %dx,0(%rcx) +movw 2(%rax),%dx +movw %dx,2(%rcx) +movw 4(%rax),%dx +movw %dx,4(%rcx) +movw 6(%rax),%ax +movw %ax,6(%rcx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs ===================================== @@ -0,0 +1,19 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module CopyArray + ( smallCopy + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +-- Does an 8 byte copy with sub-word (2 bytes) alignment +-- Should be unrolled into 4 aligned stores (MOVWs) +smallCopy :: ByteArray -> IO ByteArray +smallCopy (ByteArray ba) = IO $ \s0 -> case newByteArray# 8# s0 of + (# s1, mut #) -> case copyByteArray# ba 2# mut 0# 8# s1 of + s2 -> case unsafeFreezeByteArray# mut s2 of + (# s3, frozen #) -> (# s3, ByteArray frozen #) View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/6febc444c0abea6c033174aa0e813c950b9b2877...6094d43f36bdab5ff3f246afca9a6018545fdd73 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/6febc444c0abea6c033174aa0e813c950b9b2877...6094d43f36bdab5ff3f246afca9a6018545fdd73 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 06:03:07 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 02:03:07 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 9 commits: Hadrian: add rts shared library symlinks for backwards compatability Message-ID: <5cb2cd1baab03_62b33fa2eb4dfa6c3386598@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: 5f183081 by David Eichmann at 2019-04-14T05:08:15Z Hadrian: add rts shared library symlinks for backwards compatability Fixes test T3807 when building with Hadrian. Trac #16370 - - - - - 9b142c53 by Sylvain Henry at 2019-04-14T05:14:23Z Hadrian: add binary-dist-dir target This patch adds an Hadrian target "binary-dist-dir". Compared to "binary-dist", it only builds a binary distribution directory without creating the Tar archive. It makes the use/test of the bindist installation script easier. - - - - - 6febc444 by Krzysztof Gogolewski at 2019-04-14T05:20:29Z Fix assertion failures reported in #16533 - - - - - edcef7b3 by Artem Pyanykh at 2019-04-14T05:26:35Z codegen: unroll memcpy calls for small bytearrays - - - - - 6094d43f by Artem Pyanykh at 2019-04-14T05:26:35Z docs: mention memcpy optimization for ByteArrays in 8.10.1-notes - - - - - 0dcf4207 by Simon Jakobi at 2019-04-14T06:02:58Z Ord docs: Add explanation on 'min' and 'max' operator interactions [ci skip] - - - - - 6237e98c by Krzysztof Gogolewski at 2019-04-14T06:02:58Z Add a safeguard to Core Lint Lint returns a pair (Maybe a, WarnsAndErrs). The Maybe monad allows to handle an unrecoverable failure. In case of such a failure, the error should be added to the second component of the pair. If this is not done, Lint will silently accept bad programs. This situation actually happened during development of linear types. This adds a safeguard. - - - - - 257a2546 by Ben Gamari at 2019-04-14T06:02:58Z CODEOWNERS: Add simonmar as owner of rts/linker I suspect this is why @simonmar wasn't notified of !706. [skip ci] - - - - - fea4089e by Alp Mestanogullari at 2019-04-14T06:03:00Z Hadrian: don't accept p_dyn for executables, to fix --flavour=prof - - - - - 23 changed files: - CODEOWNERS - compiler/cmm/CmmExpr.hs - compiler/codeGen/StgCmmPrim.hs - compiler/coreSyn/CoreLint.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/typecheck/TcCanonical.hs - compiler/typecheck/TcSigs.hs - compiler/types/OptCoercion.hs - compiler/types/Type.hs - docs/users_guide/8.10.1-notes.rst - hadrian/hadrian.cabal - hadrian/src/Hadrian/Utilities.hs - hadrian/src/Rules.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Rules/Register.hs - + hadrian/src/Rules/Rts.hs - hadrian/src/Settings.hs - hadrian/src/Settings/Flavours/Profiled.hs - libraries/ghc-prim/GHC/Classes.hs - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs - testsuite/tests/dynlibs/Makefile Changes: ===================================== CODEOWNERS ===================================== @@ -12,7 +12,7 @@ # RTS-like things /rts/ @bgamari @simonmar @osa1 @Phyx @angerman -/rts/linker/ @angerman @Phyx +/rts/linker/ @angerman @Phyx @simonmar /includes/ @bgamari @simonmar @osa1 # The compiler ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -5,7 +5,7 @@ {-# LANGUAGE UndecidableInstances #-} module CmmExpr - ( CmmExpr(..), cmmExprType, cmmExprWidth, maybeInvertCmmExpr + ( CmmExpr(..), cmmExprType, cmmExprWidth, cmmExprAlignment, maybeInvertCmmExpr , CmmReg(..), cmmRegType, cmmRegWidth , CmmLit(..), cmmLitType , LocalReg(..), localRegType @@ -43,6 +43,8 @@ import Unique import Data.Set (Set) import qualified Data.Set as Set +import BasicTypes (Alignment, mkAlignment, alignmentOf) + ----------------------------------------------------------------------------- -- CmmExpr -- An expression. Expressions have no side effects. @@ -239,6 +241,13 @@ cmmLabelType dflags lbl cmmExprWidth :: DynFlags -> CmmExpr -> Width cmmExprWidth dflags e = typeWidth (cmmExprType dflags e) +-- | Returns an alignment in bytes of a CmmExpr when it's a statically +-- known integer constant, otherwise returns an alignment of 1 byte. +-- The caller is responsible for using with a sensible CmmExpr +-- argument. +cmmExprAlignment :: CmmExpr -> Alignment +cmmExprAlignment (CmmLit (CmmInt intOff _)) = alignmentOf (fromInteger intOff) +cmmExprAlignment _ = mkAlignment 1 -------- --- Negation for conditional branches ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2035,8 +2035,8 @@ doCopyByteArrayOp = emitCopyByteArray copy where -- Copy data (we assume the arrays aren't overlapping since -- they're of different types) - copy _src _dst dst_p src_p bytes = - emitMemcpyCall dst_p src_p bytes 1 + copy _src _dst dst_p src_p bytes align = + emitMemcpyCall dst_p src_p bytes align -- | Takes a source 'MutableByteArray#', an offset in the source -- array, a destination 'MutableByteArray#', an offset into the @@ -2050,22 +2050,26 @@ doCopyMutableByteArrayOp = emitCopyByteArray copy -- The only time the memory might overlap is when the two arrays -- we were provided are the same array! -- TODO: Optimize branch for common case of no aliasing. - copy src dst dst_p src_p bytes = do + copy src dst dst_p src_p bytes align = do dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair - (getCode $ emitMemmoveCall dst_p src_p bytes 1) - (getCode $ emitMemcpyCall dst_p src_p bytes 1) + (getCode $ emitMemmoveCall dst_p src_p bytes align) + (getCode $ emitMemcpyCall dst_p src_p bytes align) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyByteArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr - -> FCode ()) + -> Alignment -> FCode ()) -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () emitCopyByteArray copy src src_off dst dst_off n = do dflags <- getDynFlags + let byteArrayAlignment = wordAlignment dflags + srcOffAlignment = cmmExprAlignment src_off + dstOffAlignment = cmmExprAlignment dst_off + align = minimum [byteArrayAlignment, srcOffAlignment, dstOffAlignment] dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - copy src dst dst_p src_p n + copy src dst dst_p src_p n align -- | Takes a source 'ByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2075,7 +2079,7 @@ doCopyByteArrayToAddrOp src src_off dst_p bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- | Takes a source 'MutableByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2092,7 +2096,7 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- ---------------------------------------------------------------------------- @@ -2107,9 +2111,7 @@ doSetByteArrayOp ba off len c = do dflags <- getDynFlags let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap - offsetAlignment = case off of - CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) - _ -> mkAlignment 1 + offsetAlignment = cmmExprAlignment off align = min byteArrayAlignment offsetAlignment p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off @@ -2180,7 +2182,7 @@ doCopyArrayOp = emitCopyArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) -- | Takes a source 'MutableArray#', an offset in the source array, a @@ -2198,9 +2200,9 @@ doCopyMutableArrayOp = emitCopyArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2247,7 +2249,7 @@ doCopySmallArrayOp = emitCopySmallArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) doCopySmallMutableArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> WordOff @@ -2261,9 +2263,9 @@ doCopySmallMutableArrayOp = emitCopySmallArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopySmallArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2328,7 +2330,7 @@ emitCloneArray info_p res_r src src_off n = do (mkIntExpr dflags (arrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2365,7 +2367,7 @@ emitCloneSmallArray info_p res_r src src_off n = do (mkIntExpr dflags (smallArrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2493,19 +2495,19 @@ doCasByteArray res mba idx idx_ty old new = do -- Helpers for emitting function calls -- | Emit a call to @memcpy at . -emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemcpyCall dst src n align = do emitPrimCall [ {-no results-} ] - (MO_Memcpy align) + (MO_Memcpy (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memmove at . -emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemmoveCall dst src n align = do emitPrimCall [ {- no results -} ] - (MO_Memmove align) + (MO_Memmove (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memset at . The second argument must fit inside an ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -2080,7 +2080,7 @@ defaultLintFlags = LF { lf_check_global_ids = False newtype LintM a = LintM { unLintM :: LintEnv -> - WarnsAndErrs -> -- Error and warning messages so far + WarnsAndErrs -> -- Warning and error messages so far (Maybe a, WarnsAndErrs) } -- Result and messages (if any) type WarnsAndErrs = (Bag MsgDoc, Bag MsgDoc) @@ -2189,10 +2189,13 @@ data LintLocInfo | InCo Coercion -- Inside a coercion initL :: DynFlags -> LintFlags -> InScopeSet - -> LintM a -> WarnsAndErrs -- Errors and warnings + -> LintM a -> WarnsAndErrs -- Warnings and errors initL dflags flags in_scope m = case unLintM m env (emptyBag, emptyBag) of - (_, errs) -> errs + (Just _, errs) -> errs + (Nothing, errs@(_, e)) | not (isEmptyBag e) -> errs + | otherwise -> pprPanic ("Bug in Lint: a failure occurred " ++ + "without reporting an error message") empty where env = LE { le_flags = flags , le_subst = mkEmptyTCvSubst in_scope ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -1767,12 +1767,11 @@ genCCall -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --- Unroll memcpy calls if the source and destination pointers are at --- least DWORD aligned and the number of bytes to copy isn't too +-- Unroll memcpy calls if the number of bytes to copy isn't too -- large. Otherwise, call C's memcpy. -genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ +genCCall dflags _ (PrimTarget (MO_Memcpy align)) _ [dst, src, CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemcpyInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemcpyInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format code_src <- getAnyReg src @@ -1785,7 +1784,9 @@ genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ -- instructions per move. insns = 2 * ((n + sizeBytes - 1) `div` sizeBytes) - format = if align .&. 4 /= 0 then II32 else (archWordFormat is32Bit) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment -- The size of each move, in bytes. sizeBytes :: Integer ===================================== compiler/typecheck/TcCanonical.hs ===================================== @@ -1015,7 +1015,7 @@ can_eq_nc_forall ev eq_rel s1 s2 -- Done: unify phi1 ~ phi2 go [] subst bndrs2 = ASSERT( null bndrs2 ) - unify loc (eqRelRole eq_rel) phi1' (substTy subst phi2) + unify loc (eqRelRole eq_rel) phi1' (substTyUnchecked subst phi2) go _ _ _ = panic "cna_eq_nc_forall" -- case (s:ss) [] ===================================== compiler/typecheck/TcSigs.hs ===================================== @@ -515,7 +515,7 @@ tcInstSig hs_sig@(PartialSig { psig_hs_ty = hs_ty , sig_inst_skols = tv_prs , sig_inst_wcs = wcs , sig_inst_wcx = wcx - , sig_inst_theta = substTys subst theta + , sig_inst_theta = substTysUnchecked subst theta , sig_inst_tau = substTyUnchecked subst tau } ; traceTc "End partial sig }" (ppr inst_sig) ; return inst_sig } ===================================== compiler/types/OptCoercion.hs ===================================== @@ -118,8 +118,8 @@ optCoercion' env co (Pair in_ty1 in_ty2, in_role) = coercionKindRole co (Pair out_ty1 out_ty2, out_role) = coercionKindRole out_co in - ASSERT2( substTy env in_ty1 `eqType` out_ty1 && - substTy env in_ty2 `eqType` out_ty2 && + ASSERT2( substTyUnchecked env in_ty1 `eqType` out_ty1 && + substTyUnchecked env in_ty2 `eqType` out_ty2 && in_role == out_role , text "optCoercion changed types!" $$ hang (text "in_co:") 2 (ppr co) ===================================== compiler/types/Type.hs ===================================== @@ -1044,7 +1044,7 @@ piResultTys ty orig_args@(arg:args) init_subst = mkEmptyTCvSubst $ mkInScopeSet (tyCoVarsOfTypes (ty:orig_args)) go :: TCvSubst -> Type -> [Type] -> Type - go subst ty [] = substTy subst ty + go subst ty [] = substTyUnchecked subst ty go subst ty all_args@(arg:args) | Just ty' <- coreView ty ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,10 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. -- Calls to `memset` are now unrolled more aggressively and the - produced code is more efficient on `x86_64` with added support for - 64-bit `MOV`s. In particular, `setByteArray#` calls that were not - optimized before, now will be. See :ghc-ticket:`16052`. +- Calls to `memset` and `memcpy` are now unrolled more aggressively + and the produced code is more efficient on `x86_64` with added + support for 64-bit `MOV`s. In particular, `setByteArray#` and + `copyByteArray#` calls that were not optimized before, now will + be. See :ghc-ticket:`16052`. Runtime system ~~~~~~~~~~~~~~ ===================================== hadrian/hadrian.cabal ===================================== @@ -66,6 +66,7 @@ executable hadrian , Rules.Nofib , Rules.Program , Rules.Register + , Rules.Rts , Rules.Selftest , Rules.SimpleTargets , Rules.SourceDist @@ -121,7 +122,7 @@ executable hadrian build-depends: base >= 4.8 && < 5 , Cabal >= 3.0 && < 3.1 , containers >= 0.5 && < 0.7 - , directory >= 1.2 && < 1.4 + , directory >= 1.3.1.0 && < 1.4 , extra >= 1.4.7 , filepath , mtl == 2.2.* ===================================== hadrian/src/Hadrian/Utilities.hs ===================================== @@ -16,8 +16,9 @@ module Hadrian.Utilities ( BuildRoot (..), buildRoot, buildRootRules, isGeneratedSource, -- * File system operations - copyFile, copyFileUntracked, fixFile, makeExecutable, moveFile, removeFile, - createDirectory, copyDirectory, moveDirectory, removeDirectory, + copyFile, copyFileUntracked, createFileLinkUntracked, fixFile, + makeExecutable, moveFile, removeFile, createDirectory, copyDirectory, + moveDirectory, removeDirectory, -- * Diagnostic info UseColour (..), Colour (..), ANSIColour (..), putColoured, @@ -288,6 +289,14 @@ infixl 1 <&> isGeneratedSource :: FilePath -> Action Bool isGeneratedSource file = buildRoot <&> (`isPrefixOf` file) +-- | Link a file tracking the source. Create the target directory if missing. +createFileLinkUntracked :: FilePath -> FilePath -> Action () +createFileLinkUntracked linkTarget link = do + let dir = takeDirectory linkTarget + liftIO $ IO.createDirectoryIfMissing True dir + putProgressInfo =<< renderCreateFileLink linkTarget link + quietly . liftIO $ IO.createFileLink linkTarget link + -- | Copy a file tracking the source. Create the target directory if missing. copyFile :: FilePath -> FilePath -> Action () copyFile source target = do @@ -460,8 +469,12 @@ renderAction what input output = do return $ case progressInfo of None -> "" Brief -> "| " ++ what ++ ": " ++ i ++ " => " ++ o - Normal -> renderBox [ what, " input: " ++ i, " => output: " ++ o ] - Unicorn -> renderUnicorn [ what, " input: " ++ i, " => output: " ++ o ] + Normal -> renderBox [ what + , " input: " ++ i + , " => output: " ++ o ] + Unicorn -> renderUnicorn [ what + , " input: " ++ i + , " => output: " ++ o ] where i = unifyPath input o = unifyPath output @@ -478,6 +491,24 @@ renderActionNoOutput what input = do where i = unifyPath input +-- | Render creating a file link. +renderCreateFileLink :: String -> FilePath -> Action String +renderCreateFileLink linkTarget link' = do + progressInfo <- userSetting Brief + let what = "Creating file link" + linkString = link ++ " -> " ++ linkTarget + return $ case progressInfo of + None -> "" + Brief -> "| " ++ what ++ ": " ++ linkString + Normal -> renderBox [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + Unicorn -> renderUnicorn [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + where + link = unifyPath link' + -- | Render the successful build of a program. renderProgram :: String -> String -> String -> String renderProgram name bin synopsis = renderBox $ ===================================== hadrian/src/Rules.hs ===================================== @@ -21,6 +21,7 @@ import qualified Rules.Libffi import qualified Rules.Library import qualified Rules.Program import qualified Rules.Register +import qualified Rules.Rts import qualified Rules.SimpleTargets import Settings import Target @@ -158,6 +159,7 @@ buildRules = do Rules.Gmp.gmpRules Rules.Libffi.libffiRules Rules.Library.libraryRules + Rules.Rts.rtsRules packageRules oracleRules :: Rules () ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -97,7 +97,7 @@ other, the install script: bindistRules :: Rules () bindistRules = do root <- buildRootRules - phony "binary-dist" $ do + phony "binary-dist-dir" $ do -- We 'need' all binaries and libraries targets <- mapM pkgTarget =<< stagePackages Stage1 need targets @@ -150,6 +150,16 @@ bindistRules = do , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] + + phony "binary-dist" $ do + + need ["binary-dist-dir"] + + version <- setting ProjectVersion + targetPlatform <- setting TargetPlatformFull + + let ghcVersionPretty = "ghc-" ++ version ++ "-" ++ targetPlatform + -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz tarPath <- builderPath (Tar Create) cmd [Cwd $ root -/- "bindist"] tarPath ===================================== hadrian/src/Rules/Register.hs ===================================== @@ -8,6 +8,7 @@ import Hadrian.Haskell.Cabal import Oracles.Setting import Packages import Rules.Gmp +import Rules.Rts import Settings import Target import Utilities @@ -117,6 +118,9 @@ buildConf _ context at Context {..} conf = do Cabal.copyPackage context Cabal.registerPackage context + -- | Dynamic RTS library files need symlinks (Rules.Rts.rtsRules). + when (package == rts) (needRtsSymLinks stage ways) + -- The above two steps produce an entry in the package database, with copies -- of many of the files we have build, e.g. Haskell interface files. We need -- to record this side effect so that Shake can cache these files too. ===================================== hadrian/src/Rules/Rts.hs ===================================== @@ -0,0 +1,54 @@ +module Rules.Rts (rtsRules, needRtsSymLinks) where + +import Packages (rts) +import Hadrian.Utilities +import Settings.Builders.Common + +-- | Dynamic RTS library files need symlinks without the dummy version number. +-- This is for backwards compatibility (the old make build system omitted the +-- dummy version number). +-- This rule has priority 2 to override the general rule for generating share +-- library files (see Rules.Library.libraryRules). +rtsRules :: Rules () +rtsRules = priority 2 $ do + root <- buildRootRules + [ root -/- "//libHSrts_*-ghc*.so", + root -/- "//libHSrts_*-ghc*.dylib", + root -/- "//libHSrts-ghc*.so", + root -/- "//libHSrts-ghc*.dylib"] + |%> \ rtsLibFilePath' -> createFileLinkUntracked + (addRtsDummyVersion $ takeFileName rtsLibFilePath') + rtsLibFilePath' + +-- Need symlinks generated by rtsRules. +needRtsSymLinks :: Stage -> [Way] -> Action () +needRtsSymLinks stage rtsWays + = forM_ (filter (wayUnit Dynamic) rtsWays) $ \ way -> do + let ctx = Context stage rts way + libPath <- libPath ctx + distDir <- distDir stage + rtsLibFile <- takeFileName <$> pkgLibraryFile ctx + need [removeRtsDummyVersion (libPath distDir rtsLibFile)] + +prefix, versionlessPrefix :: String +versionlessPrefix = "libHSrts" +prefix = versionlessPrefix ++ "-1.0" + +-- removeRtsDummyVersion "a/libHSrts-1.0-ghc1.2.3.4.so" +-- == "a/libHSrts-ghc1.2.3.4.so" +removeRtsDummyVersion :: FilePath -> FilePath +removeRtsDummyVersion = replaceLibFilePrefix prefix versionlessPrefix + +-- addRtsDummyVersion "a/libHSrts-ghc1.2.3.4.so" +-- == "a/libHSrts-1.0-ghc1.2.3.4.so" +addRtsDummyVersion :: FilePath -> FilePath +addRtsDummyVersion = replaceLibFilePrefix versionlessPrefix prefix + +replaceLibFilePrefix :: String -> String -> FilePath -> FilePath +replaceLibFilePrefix oldPrefix newPrefix oldFilePath = let + oldFileName = takeFileName oldFilePath + newFileName = maybe + (error $ "Expected RTS library file to start with " ++ oldPrefix) + (newPrefix ++) + (stripPrefix oldPrefix oldFileName) + in replaceFileName oldFilePath newFileName \ No newline at end of file ===================================== hadrian/src/Settings.hs ===================================== @@ -60,10 +60,14 @@ programContext :: Stage -> Package -> Action Context programContext stage pkg = do profiled <- ghcProfiled <$> flavour dynGhcProgs <- dynamicGhcPrograms =<< flavour - return . Context stage pkg . wayFromUnits . concat $ - [ [ Profiling | pkg == ghc && profiled && stage > Stage0 ] - , [ Dynamic | dynGhcProgs && stage > Stage0 ] - ] + return $ Context stage pkg (wayFor profiled dynGhcProgs) + + where wayFor prof dyn + | prof && dyn = + error "programContext: profiling+dynamic not supported" + | pkg == ghc && prof && stage > Stage0 = profiling + | dyn && stage > Stage0 = dynamic + | otherwise = vanilla -- TODO: switch to Set Package as the order of packages should not matter? -- Otherwise we have to keep remembering to sort packages from time to time. ===================================== hadrian/src/Settings/Flavours/Profiled.hs ===================================== @@ -10,7 +10,8 @@ profiledFlavour :: Flavour profiledFlavour = defaultFlavour { name = "prof" , args = defaultBuilderArgs <> profiledArgs <> defaultPackageArgs - , ghcProfiled = True } + , ghcProfiled = True + , dynamicGhcPrograms = pure False } profiledArgs :: Args profiledArgs = sourceArgs SourceArgs ===================================== libraries/ghc-prim/GHC/Classes.hs ===================================== @@ -331,6 +331,10 @@ instance Ord TyCon where -- 7. @min x y == if x <= y then x else y@ = 'True' -- 8. @max x y == if x >= y then x else y@ = 'True' -- +-- Note that (7.) and (8.) do /not/ require 'min' and 'max' to return either of +-- their arguments. The result is merely required to /equal/ one of the +-- arguments in terms of '(==)'. +-- -- Minimal complete definition: either 'compare' or '<='. -- Using 'compare' can be more efficient for complex types. -- ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -8,3 +8,4 @@ test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) +test('bytearray-memcpy-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm ===================================== @@ -0,0 +1,8 @@ +movw 0(%rax),%dx +movw %dx,0(%rcx) +movw 2(%rax),%dx +movw %dx,2(%rcx) +movw 4(%rax),%dx +movw %dx,4(%rcx) +movw 6(%rax),%ax +movw %ax,6(%rcx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs ===================================== @@ -0,0 +1,19 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module CopyArray + ( smallCopy + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +-- Does an 8 byte copy with sub-word (2 bytes) alignment +-- Should be unrolled into 4 aligned stores (MOVWs) +smallCopy :: ByteArray -> IO ByteArray +smallCopy (ByteArray ba) = IO $ \s0 -> case newByteArray# 8# s0 of + (# s1, mut #) -> case copyByteArray# ba 2# mut 0# 8# s1 of + s2 -> case unsafeFreezeByteArray# mut s2 of + (# s3, frozen #) -> (# s3, ByteArray frozen #) ===================================== testsuite/tests/dynlibs/Makefile ===================================== @@ -9,6 +9,11 @@ T3807: $(RM) T3807-export.o T3807-load.o $(RM) T3807test.so $(RM) T3807-load + + # GHC does not automatically link with the RTS when building shared + # libraries. This is done to allow the RTS flavour to be chosen later (i.e. + # when linking an executable). + # Hence we must explicitly linking with the RTS here. '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -v0 --make -dynamic -fPIC -shared T3807Export.hs T3807-export.c -o T3807test.so -lHSrts-ghc`'$(TEST_HC)' $(TEST_HC_OPTS) --numeric-version` '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -no-auto-link-packages -no-hs-main T3807-load.c -o T3807-load -ldl ./T3807-load View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/851ee22f379ca4840e75e771832cdacd89ce5693...fea4089eba89997d791a03759d082ccad40ed9d6 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/851ee22f379ca4840e75e771832cdacd89ce5693...fea4089eba89997d791a03759d082ccad40ed9d6 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 12:49:18 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 08:49:18 -0400 Subject: [Git][ghc/ghc][master] Ord docs: Add explanation on 'min' and 'max' operator interactions Message-ID: <5cb32c4eef4d0_62b33fa2eb90cf7c3401449@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: d2271fe4 by Simon Jakobi at 2019-04-14T12:43:17Z Ord docs: Add explanation on 'min' and 'max' operator interactions [ci skip] - - - - - 1 changed file: - libraries/ghc-prim/GHC/Classes.hs Changes: ===================================== libraries/ghc-prim/GHC/Classes.hs ===================================== @@ -331,6 +331,10 @@ instance Ord TyCon where -- 7. @min x y == if x <= y then x else y@ = 'True' -- 8. @max x y == if x >= y then x else y@ = 'True' -- +-- Note that (7.) and (8.) do /not/ require 'min' and 'max' to return either of +-- their arguments. The result is merely required to /equal/ one of the +-- arguments in terms of '(==)'. +-- -- Minimal complete definition: either 'compare' or '<='. -- Using 'compare' can be more efficient for complex types. -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d2271fe4e841cc157385bfc9ee498a0bf805f250 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d2271fe4e841cc157385bfc9ee498a0bf805f250 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 12:49:25 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 08:49:25 -0400 Subject: [Git][ghc/ghc][wip/corelint-safeguard] 29 commits: users-guide: Document how to disable package environments Message-ID: <5cb32c55e11a4_62b33fa2cc0a748c340311e@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/corelint-safeguard at Glasgow Haskell Compiler / GHC Commits: 36d38047 by Ben Gamari at 2019-04-09T14:23:47Z users-guide: Document how to disable package environments As noted in #16309 this somehow went undocumented. - - - - - af4cea7f by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: fix memset unroll for small bytearrays, add 64-bit sets Fixes #16052 When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees then just 1 byte. Also, memset can now do 64-bit wide sets. The current memset intrinsic is not optimal however and can be improved for the case when we know that we deal with (baseAddress at known alignment) + offset For instance, on 64-bit `setByteArray# s 1# 23# 0#` given that bytearray is 8 bytes aligned could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23` since alignment of 1 is all we can embed into MO_Memset op. - - - - - bd2de4f0 by Artem Pyanykh at 2019-04-09T14:30:13Z codegen: use newtype for Alignment in BasicTypes - - - - - 14a78707 by Artem Pyanykh at 2019-04-09T14:30:13Z docs: add a note about changes in memset unrolling to 8.10.1-notes - - - - - fe40ddd9 by Sylvain Henry at 2019-04-09T16:50:15Z Hadrian: fix library install paths in bindist Makefile (#16498) GHC now works out-of-the-box (i.e. without any wrapper script) by assuming that @bin@ and @lib@ directories sit next to each other. In particular, its RUNPATH uses $ORIGIN-based relative path to find the libraries. However, to be good citizens we want to support the case where @bin@ and @lib@ directories (respectively BINDIR and LIBDIR) don't sit next to each other or are renamed. To do that the install script simply creates GHC specific @bin@ and @lib@ siblings directories into: LIBDIR/ghc-VERSION/{bin,lib} Then it installs wrapper scripts into BINDIR that call the appropriate programs into LIBDIR/ghc-VERSION/bin/. The issue fixed by this patch is that libraries were not installed into LIBDIR/ghc-VERSION/lib but directly into LIBDIR. - - - - - 9acdc4c0 by Ben Gamari at 2019-04-09T16:56:38Z gitlab: Bump cabal-install version used by Windows builds to 2.4 Hopefully fixes Windows Hadrian build. - - - - - fc3f421b by Joachim Breitner at 2019-04-10T03:17:37Z GHC no longer ever defines TABLES_NEXT_TO_CODE on its own It should be entirely the responsibility of make/Hadrian to ensure that everything that needs this flag gets it. GHC shouldn't be hardcoded to assist with bootstrapping since it builds other things besides itself. Reviewers: Subscribers: TerrorJack, rwbarton, carter GHC Trac Issues: #15548 -- progress towards but not fix Differential Revision: https://phabricator.haskell.org/D5082 -- extract from that - - - - - be0dde8e by Ryan Scott at 2019-04-10T03:23:50Z Use ghc-prim < 0.7, not <= 0.6.1, as upper version bounds Using `ghc-prim <= 0.6.1` is somewhat dodgy from a PVP point of view, as it makes it awkward to support new minor releases of `ghc-prim`. Let's instead use `< 0.7`, which is the idiomatic way of expressing PVP-compliant upper version bounds. - - - - - 42504f4a by Carter Schonwald at 2019-04-11T00:28:41Z removing x87 register support from native code gen * simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 Constructors * makes -msse2 assumed/default for x86 platforms, fixing a long standing nondeterminism in rounding behavior in 32bit haskell code * removes the 80bit floating point representation from the supported float sizes * theres still 1 tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG. * Lots of code thats not been touched in a long time got deleted as a consequence of all of this all in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors. - - - - - c401f8a4 by Sylvain Henry at 2019-04-11T23:51:24Z Hadrian: fix binary-dir with --docs=none Hadrian's "binary-dist" target must check that the "docs" directory exists (it may not since we can disable docs generation). - - - - - 091195a4 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Remove unused remilestoning script - - - - - fa0ccbb8 by Ömer Sinan Ağacan at 2019-04-11T23:57:38Z Update a panic message Point users to the right URL - - - - - beaa07d2 by Sylvain Henry at 2019-04-12T17:17:21Z Hadrian: fix ghci wrapper script generation (#16508) - - - - - e05df3e1 by Ben Gamari at 2019-04-12T17:23:30Z gitlab-ci: Ensure that version number has three components - - - - - 885d2e04 by klebinger.andreas at gmx.at at 2019-04-12T18:40:04Z Add -ddump-stg-final to dump stg as it is used for codegen. Intermediate STG does not contain free variables which can be useful sometimes. So adding a flag to dump that info. - - - - - 3c759ced by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: add a --test-accept/-a flag, to mimic 'make accept' When -a or --test-accept is passed, and if one runs the 'test' target, then any test failing because of mismatching output and which is not expected to fail will have its expected output adjusted by the test driver, effectively considering the new output correct from now on. When this flag is passed, hadrian's 'test' target becomes sensitive to the PLATFORM and OS environment variable, just like the Make build system: - when the PLATFORM env var is set to "YES", when accepting a result, accept it for the current platform; - when the OS env var is set to "YES", when accepting a result, accept it for all wordsizes of the current operating system. This can all be combined with `--only="..."` and `TEST="..." to only accept the new output of a subset of tests. - - - - - f4b5a6c0 by Alp Mestanogullari at 2019-04-12T18:46:54Z Hadrian: document -a/--test-accept - - - - - 30a0988d by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Disable windows-hadrian job Not only is it reliably failing due to #16574 but all of the quickly failing builds also causes the Windows runners to run out of disk space. - - - - - 8870a51b by Ben Gamari at 2019-04-12T19:41:07Z gitlab: Don't run lint-submods job on Marge branches This broke Marge by creating a second pipeline (consisting of only the `lint-submods` job). Marge then looked at this pipeline and concluded that CI for her merge branch passed. However, this is ignores the fact that the majority of the CI jobs are triggered on `merge_request` and are therefore in another pipeline. - - - - - 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 5f183081 by David Eichmann at 2019-04-14T05:08:15Z Hadrian: add rts shared library symlinks for backwards compatability Fixes test T3807 when building with Hadrian. Trac #16370 - - - - - 9b142c53 by Sylvain Henry at 2019-04-14T05:14:23Z Hadrian: add binary-dist-dir target This patch adds an Hadrian target "binary-dist-dir". Compared to "binary-dist", it only builds a binary distribution directory without creating the Tar archive. It makes the use/test of the bindist installation script easier. - - - - - 6febc444 by Krzysztof Gogolewski at 2019-04-14T05:20:29Z Fix assertion failures reported in #16533 - - - - - edcef7b3 by Artem Pyanykh at 2019-04-14T05:26:35Z codegen: unroll memcpy calls for small bytearrays - - - - - 6094d43f by Artem Pyanykh at 2019-04-14T05:26:35Z docs: mention memcpy optimization for ByteArrays in 8.10.1-notes - - - - - d2271fe4 by Simon Jakobi at 2019-04-14T12:43:17Z Ord docs: Add explanation on 'min' and 'max' operator interactions [ci skip] - - - - - e7cad16c by Krzysztof Gogolewski at 2019-04-14T12:49:23Z Add a safeguard to Core Lint Lint returns a pair (Maybe a, WarnsAndErrs). The Maybe monad allows to handle an unrecoverable failure. In case of such a failure, the error should be added to the second component of the pair. If this is not done, Lint will silently accept bad programs. This situation actually happened during development of linear types. This adds a safeguard. - - - - - 30 changed files: - .gitlab-ci.yml - + .gitlab/linters/check-version-number.sh - .gitlab/win32-init.sh - README.md - compiler/basicTypes/BasicTypes.hs - compiler/cmm/CmmCallConv.hs - compiler/cmm/CmmExpr.hs - compiler/cmm/CmmType.hs - compiler/codeGen/StgCmmPrim.hs - compiler/coreSyn/CoreLint.hs - compiler/ghci/ByteCodeLink.hs - compiler/llvmGen/LlvmCodeGen/Base.hs - compiler/main/DynFlags.hs - compiler/main/HscMain.hs - compiler/main/SysTools.hs - compiler/nativeGen/AsmCodeGen.hs - compiler/nativeGen/Format.hs - compiler/nativeGen/NCGMonad.hs - compiler/nativeGen/PPC/CodeGen.hs - compiler/nativeGen/PPC/Ppr.hs - compiler/nativeGen/PPC/Regs.hs - compiler/nativeGen/Reg.hs - compiler/nativeGen/RegAlloc/Graph/TrivColorable.hs - compiler/nativeGen/RegClass.hs - compiler/nativeGen/SPARC/Instr.hs - compiler/nativeGen/SPARC/Ppr.hs - compiler/nativeGen/SPARC/Regs.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/nativeGen/X86/Instr.hs - compiler/nativeGen/X86/Ppr.hs The diff was not included because it is too large. View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1cae41ee28a319697b312285ec3f95f06528221e...e7cad16c19fb226353d3fb6e92914ed953d32857 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/1cae41ee28a319697b312285ec3f95f06528221e...e7cad16c19fb226353d3fb6e92914ed953d32857 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 12:55:26 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 08:55:26 -0400 Subject: [Git][ghc/ghc][master] Add a safeguard to Core Lint Message-ID: <5cb32dbed0188_62b33fa2e4c027c434054b9@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: e7cad16c by Krzysztof Gogolewski at 2019-04-14T12:49:23Z Add a safeguard to Core Lint Lint returns a pair (Maybe a, WarnsAndErrs). The Maybe monad allows to handle an unrecoverable failure. In case of such a failure, the error should be added to the second component of the pair. If this is not done, Lint will silently accept bad programs. This situation actually happened during development of linear types. This adds a safeguard. - - - - - 1 changed file: - compiler/coreSyn/CoreLint.hs Changes: ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -2080,7 +2080,7 @@ defaultLintFlags = LF { lf_check_global_ids = False newtype LintM a = LintM { unLintM :: LintEnv -> - WarnsAndErrs -> -- Error and warning messages so far + WarnsAndErrs -> -- Warning and error messages so far (Maybe a, WarnsAndErrs) } -- Result and messages (if any) type WarnsAndErrs = (Bag MsgDoc, Bag MsgDoc) @@ -2189,10 +2189,13 @@ data LintLocInfo | InCo Coercion -- Inside a coercion initL :: DynFlags -> LintFlags -> InScopeSet - -> LintM a -> WarnsAndErrs -- Errors and warnings + -> LintM a -> WarnsAndErrs -- Warnings and errors initL dflags flags in_scope m = case unLintM m env (emptyBag, emptyBag) of - (_, errs) -> errs + (Just _, errs) -> errs + (Nothing, errs@(_, e)) | not (isEmptyBag e) -> errs + | otherwise -> pprPanic ("Bug in Lint: a failure occurred " ++ + "without reporting an error message") empty where env = LE { le_flags = flags , le_subst = mkEmptyTCvSubst in_scope View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e7cad16c19fb226353d3fb6e92914ed953d32857 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/e7cad16c19fb226353d3fb6e92914ed953d32857 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 12:55:35 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 08:55:35 -0400 Subject: [Git][ghc/ghc][wip/codeowners] 11 commits: linters: Fix check-version-number Message-ID: <5cb32dc7406ca_62b33fa2eb4dfa6c340563d@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/codeowners at Glasgow Haskell Compiler / GHC Commits: 7876d088 by Ben Gamari at 2019-04-13T13:51:59Z linters: Fix check-version-number This should have used `grep -E`, not `grep -e` - - - - - 2e7b2e55 by Ara Adkins at 2019-04-13T14:00:02Z [skip ci] Update CI badge in readme This trivial MR updates the CI badge in the readme to point to the new CI on gitlab, rather than the very out-of-date badge from Travis. - - - - - 40848a43 by Ben Gamari at 2019-04-13T14:02:36Z base: Better document implementation implications of Data.Timeout As noted in #16546 timeout uses asynchronous exceptions internally, an implementation detail which can leak out in surprising ways. Note this fact. Also expose the `Timeout` tycon. [skip ci] - - - - - 5f183081 by David Eichmann at 2019-04-14T05:08:15Z Hadrian: add rts shared library symlinks for backwards compatability Fixes test T3807 when building with Hadrian. Trac #16370 - - - - - 9b142c53 by Sylvain Henry at 2019-04-14T05:14:23Z Hadrian: add binary-dist-dir target This patch adds an Hadrian target "binary-dist-dir". Compared to "binary-dist", it only builds a binary distribution directory without creating the Tar archive. It makes the use/test of the bindist installation script easier. - - - - - 6febc444 by Krzysztof Gogolewski at 2019-04-14T05:20:29Z Fix assertion failures reported in #16533 - - - - - edcef7b3 by Artem Pyanykh at 2019-04-14T05:26:35Z codegen: unroll memcpy calls for small bytearrays - - - - - 6094d43f by Artem Pyanykh at 2019-04-14T05:26:35Z docs: mention memcpy optimization for ByteArrays in 8.10.1-notes - - - - - d2271fe4 by Simon Jakobi at 2019-04-14T12:43:17Z Ord docs: Add explanation on 'min' and 'max' operator interactions [ci skip] - - - - - e7cad16c by Krzysztof Gogolewski at 2019-04-14T12:49:23Z Add a safeguard to Core Lint Lint returns a pair (Maybe a, WarnsAndErrs). The Maybe monad allows to handle an unrecoverable failure. In case of such a failure, the error should be added to the second component of the pair. If this is not done, Lint will silently accept bad programs. This situation actually happened during development of linear types. This adds a safeguard. - - - - - c54a093f by Ben Gamari at 2019-04-14T12:55:29Z CODEOWNERS: Add simonmar as owner of rts/linker I suspect this is why @simonmar wasn't notified of !706. [skip ci] - - - - - 24 changed files: - .gitlab/linters/check-version-number.sh - CODEOWNERS - README.md - compiler/cmm/CmmExpr.hs - compiler/codeGen/StgCmmPrim.hs - compiler/coreSyn/CoreLint.hs - compiler/nativeGen/X86/CodeGen.hs - compiler/typecheck/TcCanonical.hs - compiler/typecheck/TcSigs.hs - compiler/types/OptCoercion.hs - compiler/types/Type.hs - docs/users_guide/8.10.1-notes.rst - hadrian/hadrian.cabal - hadrian/src/Hadrian/Utilities.hs - hadrian/src/Rules.hs - hadrian/src/Rules/BinaryDist.hs - hadrian/src/Rules/Register.hs - + hadrian/src/Rules/Rts.hs - libraries/base/System/Timeout.hs - libraries/ghc-prim/GHC/Classes.hs - testsuite/tests/codeGen/should_gen_asm/all.T - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm - + testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs - testsuite/tests/dynlibs/Makefile Changes: ===================================== .gitlab/linters/check-version-number.sh ===================================== @@ -2,5 +2,5 @@ set -e -grep -e -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || +grep -E -q '\[[0-9]+\.[0-9]+\.[0-9]+\]' configure.ac || ( echo "error: configure.ac: GHC version number must have three components."; exit 1 ) ===================================== CODEOWNERS ===================================== @@ -12,7 +12,7 @@ # RTS-like things /rts/ @bgamari @simonmar @osa1 @Phyx @angerman -/rts/linker/ @angerman @Phyx +/rts/linker/ @angerman @Phyx @simonmar /includes/ @bgamari @simonmar @osa1 # The compiler ===================================== README.md ===================================== @@ -1,7 +1,7 @@ The Glasgow Haskell Compiler ============================ -[![Build Status](https://api.travis-ci.org/ghc/ghc.svg?branch=master)](http://travis-ci.org/ghc/ghc) +[![pipeline status](https://gitlab.haskell.org/ghc/ghc/badges/master/pipeline.svg?style=flat)](https://gitlab.haskell.org/ghc/ghc/commits/master) This is the source tree for [GHC][1], a compiler and interactive environment for the Haskell functional programming language. ===================================== compiler/cmm/CmmExpr.hs ===================================== @@ -5,7 +5,7 @@ {-# LANGUAGE UndecidableInstances #-} module CmmExpr - ( CmmExpr(..), cmmExprType, cmmExprWidth, maybeInvertCmmExpr + ( CmmExpr(..), cmmExprType, cmmExprWidth, cmmExprAlignment, maybeInvertCmmExpr , CmmReg(..), cmmRegType, cmmRegWidth , CmmLit(..), cmmLitType , LocalReg(..), localRegType @@ -43,6 +43,8 @@ import Unique import Data.Set (Set) import qualified Data.Set as Set +import BasicTypes (Alignment, mkAlignment, alignmentOf) + ----------------------------------------------------------------------------- -- CmmExpr -- An expression. Expressions have no side effects. @@ -239,6 +241,13 @@ cmmLabelType dflags lbl cmmExprWidth :: DynFlags -> CmmExpr -> Width cmmExprWidth dflags e = typeWidth (cmmExprType dflags e) +-- | Returns an alignment in bytes of a CmmExpr when it's a statically +-- known integer constant, otherwise returns an alignment of 1 byte. +-- The caller is responsible for using with a sensible CmmExpr +-- argument. +cmmExprAlignment :: CmmExpr -> Alignment +cmmExprAlignment (CmmLit (CmmInt intOff _)) = alignmentOf (fromInteger intOff) +cmmExprAlignment _ = mkAlignment 1 -------- --- Negation for conditional branches ===================================== compiler/codeGen/StgCmmPrim.hs ===================================== @@ -2035,8 +2035,8 @@ doCopyByteArrayOp = emitCopyByteArray copy where -- Copy data (we assume the arrays aren't overlapping since -- they're of different types) - copy _src _dst dst_p src_p bytes = - emitMemcpyCall dst_p src_p bytes 1 + copy _src _dst dst_p src_p bytes align = + emitMemcpyCall dst_p src_p bytes align -- | Takes a source 'MutableByteArray#', an offset in the source -- array, a destination 'MutableByteArray#', an offset into the @@ -2050,22 +2050,26 @@ doCopyMutableByteArrayOp = emitCopyByteArray copy -- The only time the memory might overlap is when the two arrays -- we were provided are the same array! -- TODO: Optimize branch for common case of no aliasing. - copy src dst dst_p src_p bytes = do + copy src dst dst_p src_p bytes align = do dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair - (getCode $ emitMemmoveCall dst_p src_p bytes 1) - (getCode $ emitMemcpyCall dst_p src_p bytes 1) + (getCode $ emitMemmoveCall dst_p src_p bytes align) + (getCode $ emitMemcpyCall dst_p src_p bytes align) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyByteArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr - -> FCode ()) + -> Alignment -> FCode ()) -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode () emitCopyByteArray copy src src_off dst dst_off n = do dflags <- getDynFlags + let byteArrayAlignment = wordAlignment dflags + srcOffAlignment = cmmExprAlignment src_off + dstOffAlignment = cmmExprAlignment dst_off + align = minimum [byteArrayAlignment, srcOffAlignment, dstOffAlignment] dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - copy src dst dst_p src_p n + copy src dst dst_p src_p n align -- | Takes a source 'ByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2075,7 +2079,7 @@ doCopyByteArrayToAddrOp src src_off dst_p bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags src_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags src (arrWordsHdrSize dflags)) src_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- | Takes a source 'MutableByteArray#', an offset in the source array, a -- destination 'Addr#', and the number of bytes to copy. Copies the given @@ -2092,7 +2096,7 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do -- Use memcpy (we are allowed to assume the arrays aren't overlapping) dflags <- getDynFlags dst_p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags dst (arrWordsHdrSize dflags)) dst_off - emitMemcpyCall dst_p src_p bytes 1 + emitMemcpyCall dst_p src_p bytes (mkAlignment 1) -- ---------------------------------------------------------------------------- @@ -2107,9 +2111,7 @@ doSetByteArrayOp ba off len c = do dflags <- getDynFlags let byteArrayAlignment = wordAlignment dflags -- known since BA is allocated on heap - offsetAlignment = case off of - CmmLit (CmmInt intOff _) -> alignmentOf (fromInteger intOff) - _ -> mkAlignment 1 + offsetAlignment = cmmExprAlignment off align = min byteArrayAlignment offsetAlignment p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off @@ -2180,7 +2182,7 @@ doCopyArrayOp = emitCopyArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) -- | Takes a source 'MutableArray#', an offset in the source array, a @@ -2198,9 +2200,9 @@ doCopyMutableArrayOp = emitCopyArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopyArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2247,7 +2249,7 @@ doCopySmallArrayOp = emitCopySmallArray copy copy _src _dst dst_p src_p bytes = do dflags <- getDynFlags emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags) + (wordAlignment dflags) doCopySmallMutableArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> WordOff @@ -2261,9 +2263,9 @@ doCopySmallMutableArrayOp = emitCopySmallArray copy dflags <- getDynFlags (moveCall, cpyCall) <- forkAltPair (getCode $ emitMemmoveCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) (getCode $ emitMemcpyCall dst_p src_p (mkIntExpr dflags bytes) - (wORD_SIZE dflags)) + (wordAlignment dflags)) emit =<< mkCmmIfThenElse (cmmEqWord dflags src dst) moveCall cpyCall emitCopySmallArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> ByteOff @@ -2328,7 +2330,7 @@ emitCloneArray info_p res_r src src_off n = do (mkIntExpr dflags (arrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2365,7 +2367,7 @@ emitCloneSmallArray info_p res_r src src_off n = do (mkIntExpr dflags (smallArrPtrsHdrSizeW dflags)) src_off) emitMemcpyCall dst_p src_p (mkIntExpr dflags (wordsToBytes dflags n)) - (wORD_SIZE dflags) + (wordAlignment dflags) emit $ mkAssign (CmmLocal res_r) (CmmReg arr) @@ -2493,19 +2495,19 @@ doCasByteArray res mba idx idx_ty old new = do -- Helpers for emitting function calls -- | Emit a call to @memcpy at . -emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemcpyCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemcpyCall dst src n align = do emitPrimCall [ {-no results-} ] - (MO_Memcpy align) + (MO_Memcpy (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memmove at . -emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Int -> FCode () +emitMemmoveCall :: CmmExpr -> CmmExpr -> CmmExpr -> Alignment -> FCode () emitMemmoveCall dst src n align = do emitPrimCall [ {- no results -} ] - (MO_Memmove align) + (MO_Memmove (alignmentBytes align)) [ dst, src, n ] -- | Emit a call to @memset at . The second argument must fit inside an ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -2080,7 +2080,7 @@ defaultLintFlags = LF { lf_check_global_ids = False newtype LintM a = LintM { unLintM :: LintEnv -> - WarnsAndErrs -> -- Error and warning messages so far + WarnsAndErrs -> -- Warning and error messages so far (Maybe a, WarnsAndErrs) } -- Result and messages (if any) type WarnsAndErrs = (Bag MsgDoc, Bag MsgDoc) @@ -2189,10 +2189,13 @@ data LintLocInfo | InCo Coercion -- Inside a coercion initL :: DynFlags -> LintFlags -> InScopeSet - -> LintM a -> WarnsAndErrs -- Errors and warnings + -> LintM a -> WarnsAndErrs -- Warnings and errors initL dflags flags in_scope m = case unLintM m env (emptyBag, emptyBag) of - (_, errs) -> errs + (Just _, errs) -> errs + (Nothing, errs@(_, e)) | not (isEmptyBag e) -> errs + | otherwise -> pprPanic ("Bug in Lint: a failure occurred " ++ + "without reporting an error message") empty where env = LE { le_flags = flags , le_subst = mkEmptyTCvSubst in_scope ===================================== compiler/nativeGen/X86/CodeGen.hs ===================================== @@ -1767,12 +1767,11 @@ genCCall -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - --- Unroll memcpy calls if the source and destination pointers are at --- least DWORD aligned and the number of bytes to copy isn't too +-- Unroll memcpy calls if the number of bytes to copy isn't too -- large. Otherwise, call C's memcpy. -genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ +genCCall dflags _ (PrimTarget (MO_Memcpy align)) _ [dst, src, CmmLit (CmmInt n _)] _ - | fromInteger insns <= maxInlineMemcpyInsns dflags && align .&. 3 == 0 = do + | fromInteger insns <= maxInlineMemcpyInsns dflags = do code_dst <- getAnyReg dst dst_r <- getNewRegNat format code_src <- getAnyReg src @@ -1785,7 +1784,9 @@ genCCall dflags is32Bit (PrimTarget (MO_Memcpy align)) _ -- instructions per move. insns = 2 * ((n + sizeBytes - 1) `div` sizeBytes) - format = if align .&. 4 /= 0 then II32 else (archWordFormat is32Bit) + maxAlignment = wordAlignment dflags -- only machine word wide MOVs are supported + effectiveAlignment = min (alignmentOf align) maxAlignment + format = intFormat . widthFromBytes $ alignmentBytes effectiveAlignment -- The size of each move, in bytes. sizeBytes :: Integer ===================================== compiler/typecheck/TcCanonical.hs ===================================== @@ -1015,7 +1015,7 @@ can_eq_nc_forall ev eq_rel s1 s2 -- Done: unify phi1 ~ phi2 go [] subst bndrs2 = ASSERT( null bndrs2 ) - unify loc (eqRelRole eq_rel) phi1' (substTy subst phi2) + unify loc (eqRelRole eq_rel) phi1' (substTyUnchecked subst phi2) go _ _ _ = panic "cna_eq_nc_forall" -- case (s:ss) [] ===================================== compiler/typecheck/TcSigs.hs ===================================== @@ -515,7 +515,7 @@ tcInstSig hs_sig@(PartialSig { psig_hs_ty = hs_ty , sig_inst_skols = tv_prs , sig_inst_wcs = wcs , sig_inst_wcx = wcx - , sig_inst_theta = substTys subst theta + , sig_inst_theta = substTysUnchecked subst theta , sig_inst_tau = substTyUnchecked subst tau } ; traceTc "End partial sig }" (ppr inst_sig) ; return inst_sig } ===================================== compiler/types/OptCoercion.hs ===================================== @@ -118,8 +118,8 @@ optCoercion' env co (Pair in_ty1 in_ty2, in_role) = coercionKindRole co (Pair out_ty1 out_ty2, out_role) = coercionKindRole out_co in - ASSERT2( substTy env in_ty1 `eqType` out_ty1 && - substTy env in_ty2 `eqType` out_ty2 && + ASSERT2( substTyUnchecked env in_ty1 `eqType` out_ty1 && + substTyUnchecked env in_ty2 `eqType` out_ty2 && in_role == out_role , text "optCoercion changed types!" $$ hang (text "in_co:") 2 (ppr co) ===================================== compiler/types/Type.hs ===================================== @@ -1044,7 +1044,7 @@ piResultTys ty orig_args@(arg:args) init_subst = mkEmptyTCvSubst $ mkInScopeSet (tyCoVarsOfTypes (ty:orig_args)) go :: TCvSubst -> Type -> [Type] -> Type - go subst ty [] = substTy subst ty + go subst ty [] = substTyUnchecked subst ty go subst ty all_args@(arg:args) | Just ty' <- coreView ty ===================================== docs/users_guide/8.10.1-notes.rst ===================================== @@ -61,10 +61,11 @@ Compiler :ghc-flag:`-Wredundant-record-wildcards` which warn users when they have redundant or unused uses of a record wildcard match. -- Calls to `memset` are now unrolled more aggressively and the - produced code is more efficient on `x86_64` with added support for - 64-bit `MOV`s. In particular, `setByteArray#` calls that were not - optimized before, now will be. See :ghc-ticket:`16052`. +- Calls to `memset` and `memcpy` are now unrolled more aggressively + and the produced code is more efficient on `x86_64` with added + support for 64-bit `MOV`s. In particular, `setByteArray#` and + `copyByteArray#` calls that were not optimized before, now will + be. See :ghc-ticket:`16052`. Runtime system ~~~~~~~~~~~~~~ ===================================== hadrian/hadrian.cabal ===================================== @@ -66,6 +66,7 @@ executable hadrian , Rules.Nofib , Rules.Program , Rules.Register + , Rules.Rts , Rules.Selftest , Rules.SimpleTargets , Rules.SourceDist @@ -121,7 +122,7 @@ executable hadrian build-depends: base >= 4.8 && < 5 , Cabal >= 3.0 && < 3.1 , containers >= 0.5 && < 0.7 - , directory >= 1.2 && < 1.4 + , directory >= 1.3.1.0 && < 1.4 , extra >= 1.4.7 , filepath , mtl == 2.2.* ===================================== hadrian/src/Hadrian/Utilities.hs ===================================== @@ -16,8 +16,9 @@ module Hadrian.Utilities ( BuildRoot (..), buildRoot, buildRootRules, isGeneratedSource, -- * File system operations - copyFile, copyFileUntracked, fixFile, makeExecutable, moveFile, removeFile, - createDirectory, copyDirectory, moveDirectory, removeDirectory, + copyFile, copyFileUntracked, createFileLinkUntracked, fixFile, + makeExecutable, moveFile, removeFile, createDirectory, copyDirectory, + moveDirectory, removeDirectory, -- * Diagnostic info UseColour (..), Colour (..), ANSIColour (..), putColoured, @@ -288,6 +289,14 @@ infixl 1 <&> isGeneratedSource :: FilePath -> Action Bool isGeneratedSource file = buildRoot <&> (`isPrefixOf` file) +-- | Link a file tracking the source. Create the target directory if missing. +createFileLinkUntracked :: FilePath -> FilePath -> Action () +createFileLinkUntracked linkTarget link = do + let dir = takeDirectory linkTarget + liftIO $ IO.createDirectoryIfMissing True dir + putProgressInfo =<< renderCreateFileLink linkTarget link + quietly . liftIO $ IO.createFileLink linkTarget link + -- | Copy a file tracking the source. Create the target directory if missing. copyFile :: FilePath -> FilePath -> Action () copyFile source target = do @@ -460,8 +469,12 @@ renderAction what input output = do return $ case progressInfo of None -> "" Brief -> "| " ++ what ++ ": " ++ i ++ " => " ++ o - Normal -> renderBox [ what, " input: " ++ i, " => output: " ++ o ] - Unicorn -> renderUnicorn [ what, " input: " ++ i, " => output: " ++ o ] + Normal -> renderBox [ what + , " input: " ++ i + , " => output: " ++ o ] + Unicorn -> renderUnicorn [ what + , " input: " ++ i + , " => output: " ++ o ] where i = unifyPath input o = unifyPath output @@ -478,6 +491,24 @@ renderActionNoOutput what input = do where i = unifyPath input +-- | Render creating a file link. +renderCreateFileLink :: String -> FilePath -> Action String +renderCreateFileLink linkTarget link' = do + progressInfo <- userSetting Brief + let what = "Creating file link" + linkString = link ++ " -> " ++ linkTarget + return $ case progressInfo of + None -> "" + Brief -> "| " ++ what ++ ": " ++ linkString + Normal -> renderBox [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + Unicorn -> renderUnicorn [ what + , " link name: " ++ link + , " -> link target: " ++ linkTarget ] + where + link = unifyPath link' + -- | Render the successful build of a program. renderProgram :: String -> String -> String -> String renderProgram name bin synopsis = renderBox $ ===================================== hadrian/src/Rules.hs ===================================== @@ -21,6 +21,7 @@ import qualified Rules.Libffi import qualified Rules.Library import qualified Rules.Program import qualified Rules.Register +import qualified Rules.Rts import qualified Rules.SimpleTargets import Settings import Target @@ -158,6 +159,7 @@ buildRules = do Rules.Gmp.gmpRules Rules.Libffi.libffiRules Rules.Library.libraryRules + Rules.Rts.rtsRules packageRules oracleRules :: Rules () ===================================== hadrian/src/Rules/BinaryDist.hs ===================================== @@ -97,7 +97,7 @@ other, the install script: bindistRules :: Rules () bindistRules = do root <- buildRootRules - phony "binary-dist" $ do + phony "binary-dist-dir" $ do -- We 'need' all binaries and libraries targets <- mapM pkgTarget =<< stagePackages Stage1 need targets @@ -150,6 +150,16 @@ bindistRules = do , "ghci-script", "haddock", "hpc", "hp2ps", "hsc2hs" , "runghc"] + + phony "binary-dist" $ do + + need ["binary-dist-dir"] + + version <- setting ProjectVersion + targetPlatform <- setting TargetPlatformFull + + let ghcVersionPretty = "ghc-" ++ version ++ "-" ++ targetPlatform + -- Finally, we create the archive /bindist/ghc-X.Y.Z-platform.tar.xz tarPath <- builderPath (Tar Create) cmd [Cwd $ root -/- "bindist"] tarPath ===================================== hadrian/src/Rules/Register.hs ===================================== @@ -8,6 +8,7 @@ import Hadrian.Haskell.Cabal import Oracles.Setting import Packages import Rules.Gmp +import Rules.Rts import Settings import Target import Utilities @@ -117,6 +118,9 @@ buildConf _ context at Context {..} conf = do Cabal.copyPackage context Cabal.registerPackage context + -- | Dynamic RTS library files need symlinks (Rules.Rts.rtsRules). + when (package == rts) (needRtsSymLinks stage ways) + -- The above two steps produce an entry in the package database, with copies -- of many of the files we have build, e.g. Haskell interface files. We need -- to record this side effect so that Shake can cache these files too. ===================================== hadrian/src/Rules/Rts.hs ===================================== @@ -0,0 +1,54 @@ +module Rules.Rts (rtsRules, needRtsSymLinks) where + +import Packages (rts) +import Hadrian.Utilities +import Settings.Builders.Common + +-- | Dynamic RTS library files need symlinks without the dummy version number. +-- This is for backwards compatibility (the old make build system omitted the +-- dummy version number). +-- This rule has priority 2 to override the general rule for generating share +-- library files (see Rules.Library.libraryRules). +rtsRules :: Rules () +rtsRules = priority 2 $ do + root <- buildRootRules + [ root -/- "//libHSrts_*-ghc*.so", + root -/- "//libHSrts_*-ghc*.dylib", + root -/- "//libHSrts-ghc*.so", + root -/- "//libHSrts-ghc*.dylib"] + |%> \ rtsLibFilePath' -> createFileLinkUntracked + (addRtsDummyVersion $ takeFileName rtsLibFilePath') + rtsLibFilePath' + +-- Need symlinks generated by rtsRules. +needRtsSymLinks :: Stage -> [Way] -> Action () +needRtsSymLinks stage rtsWays + = forM_ (filter (wayUnit Dynamic) rtsWays) $ \ way -> do + let ctx = Context stage rts way + libPath <- libPath ctx + distDir <- distDir stage + rtsLibFile <- takeFileName <$> pkgLibraryFile ctx + need [removeRtsDummyVersion (libPath distDir rtsLibFile)] + +prefix, versionlessPrefix :: String +versionlessPrefix = "libHSrts" +prefix = versionlessPrefix ++ "-1.0" + +-- removeRtsDummyVersion "a/libHSrts-1.0-ghc1.2.3.4.so" +-- == "a/libHSrts-ghc1.2.3.4.so" +removeRtsDummyVersion :: FilePath -> FilePath +removeRtsDummyVersion = replaceLibFilePrefix prefix versionlessPrefix + +-- addRtsDummyVersion "a/libHSrts-ghc1.2.3.4.so" +-- == "a/libHSrts-1.0-ghc1.2.3.4.so" +addRtsDummyVersion :: FilePath -> FilePath +addRtsDummyVersion = replaceLibFilePrefix versionlessPrefix prefix + +replaceLibFilePrefix :: String -> String -> FilePath -> FilePath +replaceLibFilePrefix oldPrefix newPrefix oldFilePath = let + oldFileName = takeFileName oldFilePath + newFileName = maybe + (error $ "Expected RTS library file to start with " ++ oldPrefix) + (newPrefix ++) + (stripPrefix oldPrefix oldFileName) + in replaceFileName oldFilePath newFileName \ No newline at end of file ===================================== libraries/base/System/Timeout.hs ===================================== @@ -16,7 +16,7 @@ -- ------------------------------------------------------------------------------- -module System.Timeout ( timeout ) where +module System.Timeout ( Timeout, timeout ) where #if !defined(mingw32_HOST_OS) import Control.Monad @@ -35,7 +35,11 @@ import Data.Unique (Unique, newUnique) -- interrupt the running IO computation when the timeout has -- expired. -newtype Timeout = Timeout Unique deriving Eq -- ^ @since 4.0 +-- | An exception thrown to a thread by 'timeout' to interrupt a timed-out +-- computation. +-- +-- @since 4.0 +newtype Timeout = Timeout Unique deriving Eq -- | @since 4.0 instance Show Timeout where @@ -67,20 +71,25 @@ instance Exception Timeout where -- another thread. -- -- A tricky implementation detail is the question of how to abort an @IO@ --- computation. This combinator relies on asynchronous exceptions internally. --- The technique works very well for computations executing inside of the --- Haskell runtime system, but it doesn't work at all for non-Haskell code. --- Foreign function calls, for example, cannot be timed out with this --- combinator simply because an arbitrary C function cannot receive --- asynchronous exceptions. When @timeout@ is used to wrap an FFI call that --- blocks, no timeout event can be delivered until the FFI call returns, which --- pretty much negates the purpose of the combinator. In practice, however, --- this limitation is less severe than it may sound. Standard I\/O functions --- like 'System.IO.hGetBuf', 'System.IO.hPutBuf', Network.Socket.accept, or --- 'System.IO.hWaitForInput' appear to be blocking, but they really don't --- because the runtime system uses scheduling mechanisms like @select(2)@ to --- perform asynchronous I\/O, so it is possible to interrupt standard socket --- I\/O or file I\/O using this combinator. +-- computation. This combinator relies on asynchronous exceptions internally +-- (namely throwing the computation the 'Timeout' exception). The technique +-- works very well for computations executing inside of the Haskell runtime +-- system, but it doesn't work at all for non-Haskell code. Foreign function +-- calls, for example, cannot be timed out with this combinator simply because +-- an arbitrary C function cannot receive asynchronous exceptions. When +-- @timeout@ is used to wrap an FFI call that blocks, no timeout event can be +-- delivered until the FFI call returns, which pretty much negates the purpose +-- of the combinator. In practice, however, this limitation is less severe than +-- it may sound. Standard I\/O functions like 'System.IO.hGetBuf', +-- 'System.IO.hPutBuf', Network.Socket.accept, or 'System.IO.hWaitForInput' +-- appear to be blocking, but they really don't because the runtime system uses +-- scheduling mechanisms like @select(2)@ to perform asynchronous I\/O, so it +-- is possible to interrupt standard socket I\/O or file I\/O using this +-- combinator. +--- +-- Note that 'timeout' cancels the computation by throwing it the 'Timeout' +-- exception. Consequently blanket exception handlers (e.g. catching +-- 'SomeException') within the computation will break the timeout behavior. timeout :: Int -> IO a -> IO (Maybe a) timeout n f | n < 0 = fmap Just f ===================================== libraries/ghc-prim/GHC/Classes.hs ===================================== @@ -331,6 +331,10 @@ instance Ord TyCon where -- 7. @min x y == if x <= y then x else y@ = 'True' -- 8. @max x y == if x >= y then x else y@ = 'True' -- +-- Note that (7.) and (8.) do /not/ require 'min' and 'max' to return either of +-- their arguments. The result is merely required to /equal/ one of the +-- arguments in terms of '(==)'. +-- -- Minimal complete definition: either 'compare' or '<='. -- Using 'compare' can be more efficient for complex types. -- ===================================== testsuite/tests/codeGen/should_gen_asm/all.T ===================================== @@ -8,3 +8,4 @@ test('memcpy-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memcpy-unroll-conprop', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('memset-unroll', is_amd64_codegen, compile_cmp_asm, ['cmm', '']) test('bytearray-memset-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) +test('bytearray-memcpy-unroll', is_amd64_codegen, compile_grep_asm, ['hs', True, '']) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.asm ===================================== @@ -0,0 +1,8 @@ +movw 0(%rax),%dx +movw %dx,0(%rcx) +movw 2(%rax),%dx +movw %dx,2(%rcx) +movw 4(%rax),%dx +movw %dx,4(%rcx) +movw 6(%rax),%ax +movw %ax,6(%rcx) ===================================== testsuite/tests/codeGen/should_gen_asm/bytearray-memcpy-unroll.hs ===================================== @@ -0,0 +1,19 @@ +{-# language MagicHash #-} +{-# language UnboxedTuples #-} + +module CopyArray + ( smallCopy + ) where + +import GHC.Exts +import GHC.IO + +data ByteArray = ByteArray ByteArray# + +-- Does an 8 byte copy with sub-word (2 bytes) alignment +-- Should be unrolled into 4 aligned stores (MOVWs) +smallCopy :: ByteArray -> IO ByteArray +smallCopy (ByteArray ba) = IO $ \s0 -> case newByteArray# 8# s0 of + (# s1, mut #) -> case copyByteArray# ba 2# mut 0# 8# s1 of + s2 -> case unsafeFreezeByteArray# mut s2 of + (# s3, frozen #) -> (# s3, ByteArray frozen #) ===================================== testsuite/tests/dynlibs/Makefile ===================================== @@ -9,6 +9,11 @@ T3807: $(RM) T3807-export.o T3807-load.o $(RM) T3807test.so $(RM) T3807-load + + # GHC does not automatically link with the RTS when building shared + # libraries. This is done to allow the RTS flavour to be chosen later (i.e. + # when linking an executable). + # Hence we must explicitly linking with the RTS here. '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -v0 --make -dynamic -fPIC -shared T3807Export.hs T3807-export.c -o T3807test.so -lHSrts-ghc`'$(TEST_HC)' $(TEST_HC_OPTS) --numeric-version` '$(TEST_HC)' $(filter-out -rtsopts,$(TEST_HC_OPTS)) -no-auto-link-packages -no-hs-main T3807-load.c -o T3807-load -ldl ./T3807-load View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/b7343e192c6425b84e02fcf3849fc61e1de586db...c54a093ffb46b2efa0cc1511797fe8e01987ae87 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/compare/b7343e192c6425b84e02fcf3849fc61e1de586db...c54a093ffb46b2efa0cc1511797fe8e01987ae87 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 13:01:32 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 09:01:32 -0400 Subject: [Git][ghc/ghc][master] CODEOWNERS: Add simonmar as owner of rts/linker Message-ID: <5cb32f2c97ab6_62b33fa2eb21edc834063e1@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: c54a093f by Ben Gamari at 2019-04-14T12:55:29Z CODEOWNERS: Add simonmar as owner of rts/linker I suspect this is why @simonmar wasn't notified of !706. [skip ci] - - - - - 1 changed file: - CODEOWNERS Changes: ===================================== CODEOWNERS ===================================== @@ -12,7 +12,7 @@ # RTS-like things /rts/ @bgamari @simonmar @osa1 @Phyx @angerman -/rts/linker/ @angerman @Phyx +/rts/linker/ @angerman @Phyx @simonmar /includes/ @bgamari @simonmar @osa1 # The compiler View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/c54a093ffb46b2efa0cc1511797fe8e01987ae87 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/c54a093ffb46b2efa0cc1511797fe8e01987ae87 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Sun Apr 14 13:07:43 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Sun, 14 Apr 2019 09:07:43 -0400 Subject: [Git][ghc/ghc][master] Hadrian: don't accept p_dyn for executables, to fix --flavour=prof Message-ID: <5cb3309f1094f_62b33fa2e4c027c434098fc@gitlab.haskell.org.mail> Marge Bot pushed to branch master at Glasgow Haskell Compiler / GHC Commits: 1825f50d by Alp Mestanogullari at 2019-04-14T13:01:38Z Hadrian: don't accept p_dyn for executables, to fix --flavour=prof - - - - - 2 changed files: - hadrian/src/Settings.hs - hadrian/src/Settings/Flavours/Profiled.hs Changes: ===================================== hadrian/src/Settings.hs ===================================== @@ -60,10 +60,14 @@ programContext :: Stage -> Package -> Action Context programContext stage pkg = do profiled <- ghcProfiled <$> flavour dynGhcProgs <- dynamicGhcPrograms =<< flavour - return . Context stage pkg . wayFromUnits . concat $ - [ [ Profiling | pkg == ghc && profiled && stage > Stage0 ] - , [ Dynamic | dynGhcProgs && stage > Stage0 ] - ] + return $ Context stage pkg (wayFor profiled dynGhcProgs) + + where wayFor prof dyn + | prof && dyn = + error "programContext: profiling+dynamic not supported" + | pkg == ghc && prof && stage > Stage0 = profiling + | dyn && stage > Stage0 = dynamic + | otherwise = vanilla -- TODO: switch to Set Package as the order of packages should not matter? -- Otherwise we have to keep remembering to sort packages from time to time. ===================================== hadrian/src/Settings/Flavours/Profiled.hs ===================================== @@ -10,7 +10,8 @@ profiledFlavour :: Flavour profiledFlavour = defaultFlavour { name = "prof" , args = defaultBuilderArgs <> profiledArgs <> defaultPackageArgs - , ghcProfiled = True } + , ghcProfiled = True + , dynamicGhcPrograms = pure False } profiledArgs :: Args profiledArgs = sourceArgs SourceArgs View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1825f50d1736401724ef644e4d481cc26e8f47e5 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/1825f50d1736401724ef644e4d481cc26e8f47e5 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 15 02:56:41 2019 From: gitlab at gitlab.haskell.org (Ben Gamari) Date: Sun, 14 Apr 2019 22:56:41 -0400 Subject: [Git][ghc/ghc][wip/run-nofib] gitlab-ci: Run nofib on binary distributions Message-ID: <5cb3f2e976262_62b33fa2ef501b6834577da@gitlab.haskell.org.mail> Ben Gamari pushed to branch wip/run-nofib at Glasgow Haskell Compiler / GHC Commits: d8d980a7 by Ben Gamari at 2019-04-15T02:56:33Z gitlab-ci: Run nofib on binary distributions Updates docker images to ensure that the `time` utility is available. - - - - - 1 changed file: - .gitlab-ci.yml Changes: ===================================== .gitlab-ci.yml ===================================== @@ -2,7 +2,7 @@ variables: GIT_SSL_NO_VERIFY: "1" # Commit of ghc/ci-images repository from which to pull Docker images - DOCKER_REV: cefaee3c742af193e0f7783f87edb0d35374515c + DOCKER_REV: 6014fdf2843e07185a1762a95dce6bdedb544f55 # Sequential version number capturing the versions of all tools fetched by # .gitlab/win32-init.sh. @@ -21,7 +21,8 @@ stages: - full-build # Build all the things - cleanup # See Note [Cleanup on Windows] - packaging # Source distribution, etc. - - hackage # head.hackage testing + - testing # head.hackage correctness and compiler performance testing + - nofib - deploy # push documentation .only-default: &only-default @@ -732,7 +733,7 @@ source-tarball: .hackage: <<: *only-default - stage: hackage + stage: testing image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb9:$DOCKER_REV" tags: - x86_64-linux @@ -758,6 +759,47 @@ nightly-hackage: variables: - $NIGHTLY +############################################################ +# Nofib testing +############################################################ + +perf-nofib: + stage: nofib + dependencies: + - validate-x86_64-linux-deb9 + image: "registry.gitlab.haskell.org/ghc/ci-images/x86_64-linux-deb9:$DOCKER_REV" + only: + refs: + - merge_requests + - master + - /ghc-[0-9]+\.[0-9]+/ + tags: + - x86_64-linux + script: + - root=$(pwd)/ghc + - | + mkdir tmp + tar -xf ghc-x86_64-deb9-linux.tar.xz -C tmp + pushd tmp/ghc-*/ + ./configure --prefix=$root + make install + popd + rm -Rf tmp + - export BOOT_HC=$(which ghc) + - cabal update; cabal install -w $BOOT_HC regex-compat + - export PATH=$root/bin:$PATH + - make -C nofib boot mode=fast -j$CPUS + - "make -C nofib EXTRA_RUNTEST_OPTS='-cachegrind +RTS -V0 -RTS' NoFibRuns=1 mode=fast -j$CPUS 2>&1 > nofib.log" + artifacts: + expire_in: 12 week + when: always + paths: + - nofib.log + +############################################################ +# Documentation deployment via GitLab Pages +############################################################ + pages: stage: deploy dependencies: View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d8d980a792562a28434fb9d14df6dddacae49771 -- View it on GitLab: https://gitlab.haskell.org/ghc/ghc/commit/d8d980a792562a28434fb9d14df6dddacae49771 You're receiving this email because of your account on gitlab.haskell.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gitlab at gitlab.haskell.org Mon Apr 15 05:40:15 2019 From: gitlab at gitlab.haskell.org (Marge Bot) Date: Mon, 15 Apr 2019 01:40:15 -0400 Subject: [Git][ghc/ghc][wip/marge_bot_batch_merge_job] 10 commits: Ord docs: Add explanation on 'min' and 'max' operator interactions Message-ID: <5cb4193fe041a_62b33fa2eedc21f4347151c@gitlab.haskell.org.mail> Marge Bot pushed to branch wip/marge_bot_batch_merge_job at Glasgow Haskell Compiler / GHC Commits: d2271fe4 by Simon Jakobi at 2019-04-14T12:43:17Z Ord docs: Add explanation on 'min' and 'max' operator interactions [ci skip] - - - - - e7cad16c by Krzysztof Gogolewski at 2019-04-14T12:49:23Z Add a safeguard to Core Lint Lint returns a pair (Maybe a, WarnsAndErrs). The Maybe monad allows to handle an unrecoverable failure. In case of such a failure, the error should be added to the second component of the pair. If this is not done, Lint will silently accept bad programs. This situation actually happened during development of linear types. This adds a safeguard. - - - - - c54a093f by Ben Gamari at 2019-04-14T12:55:29Z CODEOWNERS: Add simonmar as owner of rts/linker I suspect this is why @simonmar wasn't notified of !706. [skip ci] - - - - - 1825f50d by Alp Mestanogullari at 2019-04-14T13:01:38Z Hadrian: don't accept p_dyn for executables, to fix --flavour=prof - - - - - fab24336 by Giles Anderson at 2019-04-15T05:40:09Z Document how -O3 is handled by GHC -O2 is the highest value of optimization. -O3 will be reverted to -O2. - - - - - b5deeebe by Giles Anderson at 2019-04-15T05:40:09Z Apply suggestion to docs/users_guide/using-optimisation.rst - - - - - ded0d8f4 by Fraser Tweedale at 2019-04-15T05:40:10Z GHCi: fix load order of .ghci files Directives in .ghci files in the current directory ("local .ghci") can be overridden by global files. Change the order in which the configs are loaded: global and $HOME/.ghci first, then local. Also introduce a new field to GHCiState to control whether local .ghci gets sourced or ignored. This commit does not add a way to set this value (a subsequent commit will add this), but the .ghci sourcing routine respects its value. Fixes: https://gitlab.haskell.org/ghc/ghc/issues/14689 Related: https://gitlab.haskell.org/ghc/ghc/issues/6017 Related: https://gitlab.haskell.org/ghc/ghc/issues/14250 - - - - - 91170328 by Fraser Tweedale at 2019-04-15T05:40:10Z users-guide: update startup script order Update users guide to match the new startup script order. Also clarify that -ignore-dot-ghci does not apply to scripts specified via the -ghci-script option. Part of: https://gitlab.haskell.org/ghc/ghc/issues/14689 - - - - - a1c6a6c6 by Fraser Tweedale at 2019-04-15T05:40:10Z GHCi: add 'local-config' setting Add the ':set local-config { source | ignore }' setting to control whether .ghci file in current directory will be sourced or not. The directive can be set in global config or $HOME/.ghci, which are processed before local .ghci files. The default is "source", preserving current behaviour. Related: https://gitlab.haskell.org/ghc/ghc/issues/6017 Related: https://gitlab.haskell.org/ghc/ghc/issues/14250 - - - - - 852107ad by Fraser Tweedale at 2019-04-15T05:40:10Z users-guide: document :set local-config Document the ':set local-config' command and add a warning about sourcing untrusted local .ghci scripts. Related: https://gitlab.haskell.org/ghc/ghc/issues/6017 Related: https://gitlab.haskell.org/ghc/ghc/issues/14250 - - - - - 9 changed files: - CODEOWNERS - compiler/coreSyn/CoreLint.hs - docs/users_guide/ghci.rst - docs/users_guide/using-optimisation.rst - ghc/GHCi/UI.hs - ghc/GHCi/UI/Monad.hs - hadrian/src/Settings.hs - hadrian/src/Settings/Flavours/Profiled.hs - libraries/ghc-prim/GHC/Classes.hs Changes: ===================================== CODEOWNERS ===================================== @@ -12,7 +12,7 @@ # RTS-like things /rts/ @bgamari @simonmar @osa1 @Phyx @angerman -/rts/linker/ @angerman @Phyx +/rts/linker/ @angerman @Phyx @simonmar /includes/ @bgamari @simonmar @osa1 # The compiler ===================================== compiler/coreSyn/CoreLint.hs ===================================== @@ -2080,7 +2080,7 @@ defaultLintFlags = LF { lf_check_global_ids = False newtype LintM a = LintM { unLintM :: LintEnv -> - WarnsAndErrs -> -- Error and warning messages so far + WarnsAndErrs -> -- Warning and error messages so far (Maybe a, WarnsAndErrs) } -- Result and messages (if any) type WarnsAndErrs = (Bag MsgDoc, Bag MsgDoc) @@ -2189,10 +2189,13 @@ data LintLocInfo | InCo Coercion -- Inside a coercion initL :: DynFlags -> LintFlags -> InScopeSet - -> LintM a -> WarnsAndErrs -- Errors and warnings + -> LintM a -> WarnsAndErrs -- Warnings and errors initL dflags flags in_scope m = case unLintM m env (emptyBag, emptyBag) of - (_, errs) -> errs + (Just _, errs) -> errs + (Nothing, errs@(_, e)) | not (isEmptyBag e) -> errs + | otherwise -> pprPanic ("Bug in Lint: a failure occurred " ++ + "without reporting an error message") empty where env = LE { le_flags = flags , le_subst = mkEmptyTCvSubst in_scope ===================================== docs/users_guide/ghci.rst ===================================== @@ -2649,6 +2649,17 @@ commonly used commands. Sets the command used by :ghci-cmd:`:edit` to ⟨cmd⟩. +.. ghci-cmd:: :set local-config; ⟨source|ignore⟩ + + If ``ignore``, :file:`./.ghci` files will be ignored (sourcing + untrusted local scripts is a security risk). The default is + ``source``. Set this directive in your user :file:`.ghci` + script, i.e. before the local script would be sourced. + + Even when set to ``ignore``, a local script will still be + processed if given by :ghc-flag:`-ghci-script` on the command + line, or sourced via :ghci-cmd:`:script`. + .. ghci-cmd:: :set prog; ⟨prog⟩ .. index:: @@ -3101,15 +3112,14 @@ When it starts, unless the :ghc-flag:`-ignore-dot-ghci` flag is given, GHCi reads and executes commands from the following files, in this order, if they exist: -1. :file:`./.ghci` +1. :file:`{ghcappdata}/ghci.conf`, where ⟨ghcappdata⟩ depends on + your system, but is usually something like :file:`$HOME/.ghc` on + Unix or :file:`C:/Documents and Settings/user/Application + Data/ghc` on Windows. -2. :file:`{appdata}/ghc/ghci.conf`, where ⟨appdata⟩ depends on your system, - but is usually something like - :file:`C:/Documents and Settings/user/Application Data` +2. :file:`$HOME/.ghci` -3. On Unix: :file:`$HOME/.ghc/ghci.conf` - -4. :file:`$HOME/.ghci` +3. :file:`./.ghci` The :file:`ghci.conf` file is most useful for turning on favourite options (e.g. ``:set +s``), and defining useful macros. @@ -3134,6 +3144,12 @@ three subdirectories A, B and C, you might put the following lines in fact it works to set it using :ghci-cmd:`:set` like this. The changes won't take effect until the next :ghci-cmd:`:load`, though.) +.. warning:: + Sourcing untrusted :file:`./.ghci` files is a security risk. + They can contain arbitrary commands that will be executed as the + user. Use :ghci-cmd:`:set local-config` to inhibit the + processing of :file:`./.ghci` files. + Once you have a library of GHCi macros, you may want to source them from separate files, or you may want to source your ``.ghci`` file into your running GHCi session while debugging it @@ -3166,8 +3182,9 @@ read: :type: dynamic :category: - Read a specific file after the usual startup files. Maybe be + Read a specific file after the usual startup files. May be specified repeatedly for multiple inputs. + :ghc-flag:`-ignore-dot-ghci` does not apply to these files. When defining GHCi macros, there is some important behavior you should be aware of when names may conflict with built-in commands, especially ===================================== docs/users_guide/using-optimisation.rst ===================================== @@ -88,6 +88,17 @@ So, for example, ``ghc -c Foo.hs`` runtime or space *worse* if you're unlucky. They are normally turned on or off individually. +.. ghc-flag:: -O⟨n⟩ + :shortdesc: Any -On where n > 2 is the same as -O2. + :type: dynamic + :reverse: -O0 + :category: optimization-levels + + .. index:: + single: optimise; aggressively + + Any -On where n > 2 is the same as -O2. + We don't use a ``-O*`` flag for day-to-day work. We use ``-O`` to get respectable speed; e.g., when we want to measure something. When we want to go for broke, we tend to use ``-O2`` (and we go for lots of coffee ===================================== ghc/GHCi/UI.hs ===================================== @@ -102,7 +102,7 @@ import Data.Char import Data.Function import Data.IORef ( IORef, modifyIORef, newIORef, readIORef, writeIORef ) import Data.List ( find, group, intercalate, intersperse, isPrefixOf, nub, - partition, sort, sortBy ) + partition, sort, sortBy, (\\) ) import qualified Data.Set as S import Data.Maybe import Data.Map (Map) @@ -351,13 +351,16 @@ defFullHelpText = "\n" ++ " :set