-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Haddock recompilation avoidance #9177
Haddock recompilation avoidance #9177
Conversation
89cf822
to
35c3992
Compare
df9c87b
to
4cc5ca1
Compare
Is it guaranteed that haddock will not write/overwrite any compilation files? Its ok for haddock to read files from the normal compilation output, but never ok to write them. So how does this work? Does the
I'd feel happier if haddock had a flag that promises to never write compilation files, and instead fail if it would need to do so. This would give us a "belt & braces" solution, in case something gets messed up in future. Otherwise we could end up with some hard-to-debug problems. |
I do think this is guaranteed (for Haddock versions >= 2.28.0, which is when "Hi Haddock" was introduced). Haddock always uses GHC's
This PR doesn't add or remove any calls to tools for the cabal haddock command. In that regard, the cabal haddock command still works just as it did before, meaning it will
The important bit is that with "Hi Haddock", Haddock generates documentation from module interfaces that it gets from the GHC API. The GHC API will do the normal compilation avoidance things while retrieving those interfaces. The problem that this PR addresses is that cabal was passing temporary As an aside: Haddock has been running its GHC session in a temporary directory by default since version 2.23.0, shipped with GHC 8.8, until version 2.29 (pending this change). So, as long as this cabal patch is not used in combination with a GHC 8.6.5 bundled Haddock (version 2.22 or lower), this change should be backwards compatible and avoid overwriting any existing compilation files.
Is the |
It would be really nice if oblivious users could benefit from this change. As far as I can tell, the main reason why that won't be true is Personally, I would be willing to even just deprecate (I took a quick look at some of the search results. It seems like most of the usage is "hide this bit of documentation from Haddock because Haddock has a bug". It seems to me that the version of GHC could often be used as a proxy for the problematic Haddock version in those cases.) |
And
are slightly contradictory. You don't remove any calls, but you do change them, as far as I can see, unconditionally. For old GHCs & haddock combinations I would expect no changes in behaviour. New In other words, I don't see any version checks. There ought to be. |
What would the deprecation strategy look like for this? Could I just change |
I will add a version check that only prevents the tmp dirs being passed for Haddock >= 2.28 (when hi haddock was introduced). |
4cc5ca1
to
bacf8a1
Compare
Putting on my cowboy hat, I'd say I am ok with this, this is one of the behaviours I want to see changing. Setting |
bacf8a1
to
b956a41
Compare
I agree 🤠 I've changed this to avoid defining the |
I kindly ask you to think a bit. As far as I can tell the functionality has only been used to circumvent bugs in haddock: https://hackage-search.serokell.io/?q=__HADDOCK_VERSION__
#if !defined(__HADDOCK_VERSION__)
-- workaround https://github.com/haskell/haddock/issues/680
#if !(defined(__HADDOCK_VERSION__)) || MIN_TOOL_VERSION_haddock(2,14,0)
, Proxy(..) -- hidden from old Haddock versions, because it triggers an internal error
#endif (Note how the tool version is checked, HADDOCK_VERSION is used for whether haddock runs.
#ifdef __HADDOCK_VERSION__
#undef INSPECTION
#endif
#ifdef INSPECTION
{-# LANGUAGE TemplateHaskell #-}
{-# OPTIONS_GHC -fplugin Test.Inspection.Plugin #-}
#endif This case is also important to think through: here it's essential to know whether package is compiled by haddock, as it simply won't work. So having haddock compile the source and not setting Either the way haddock-uses-ghc-lib is buggy, or how haddock processed the result data is buggy. The latter is quite likely to exist also in the future and people may want to change the code for haddock to succeed also with hi, haddock. Do they? In that light, I don't really see a real need for any configuration. You can be conservative and have There could be a way to distinguish whether the source is processed by haddock, but
where for bug circumention it should be assertive as:
I'd like And the There is few examples where people are lazy and add additional imports with |
I note that this can't be project-level configuration. Packages that rely on this may exist on Hackage - how are you supposed to know how to haddock them? How is Hackage supposed to know how to haddock them? If we were going to make it configurable, it seems to me that it would need to be a
This case is indeed interesting, because it's about compiler plugins. Perhaps I'm being naive, but I would hope that the upshot of HI Haddock is that most bugs such as this go away: haddock doesn't need to know how to handle compiler plugins, because GHC has already dealt with it. So hopefully there will be less need to work around Haddock in the future. But also suppose that we made it impossible to ever detect if Haddock is running. That would lose us the ability to work around some Haddock bugs... which would mean that in some edge cases (19! in all of Hackage!) some things might not be able to be documented. Perhaps that's an acceptable cost. |
It's 19 occurences of
So:
I'd say the cost is very low.
|
Hitting memory limits due to haddock, this can't come soon enough. |
@FinleyMcIlwaine Where are we standing with this PR? Anything you need? |
I am told the work will be resumed by someone else at Well-Typed, no need to harass Finley any further on Cabal's side. :) |
791ba5c
to
fe61ff6
Compare
Haddock no longer writes compilation files by default, so we do not need to pass tmp dirs for `-hidir`, `-stubdir`, and `-odir` via `--optghc`. Indeed, we do not *want* to do so, since it results in recompilation for every invocation of Haddock via Cabal. This commit stops this from happening for haddock versions >= 2.28 (when Hi Haddock was introduced). This commit also stops the default definition of the `__HADDOCK_VERSION__` macro when invoking GHC through haddock. Since a very limited set of users may still depend on this macro, we introduce the `--haddock-version-cpp` flag and `haddock-version-cpp:` cabal.project field, which enable the definition of the `__HADDOCK_VERSION__` macro when invoking GHC through Haddock. This will almost guarantee recompilation during documentation generation due to the macro definition. This commit also renames the `--haddock-lib` flag to `--haddock-resources-dir` (and `haddock-lib:` cabal.project field to `haddock-resources-dir:`), and adds this flag to the users guide since it was missing an entry. This also allows us to add this field to `cabal-install:test:integration-tests2`, since it is no longer ambiguous with the `--lib` flag. This commit also causes `documentation: true` or `--enable-documentation` to imply `-haddock` for GHC. Also, since Haddock >= 2.29 is renaming `--lib` to `--resources-dir`, this commit switches the flag provided to Haddock using a backwards compatible condition based on the Haddock version. Adds a changelog entry.
We no longer define the `__HADDOCK_VERSION__` macro when invoking GHC through Haddock, since doing so essentially guarantees recompilation during documentation generation. We audited all uses of `__HADDOCK_VERSION__` in hackage, ensuring there was a reasonable path forward to migrate away from using `__HADDOCK_VERSION__` for each, while generating the same documentation as it did before. If you are a user of `__HADDOCK_VERSION__`, please take a look at the discussion in haskell#9177 and reach out to us if your use case is not covered. Reverts the version-cpp flag introduced in the previous commit to avoid a workaround. Instead, we get rid of the problem (`__HADDOCK_VERSION__`) at its root. See the discussion in haskell#9177
With "Hi Haddock", we want to re-use GHC's compilation output (in particular, the generated interface files) to generate documentation from. However, haddock by default will use a temporary location to build sources, and will therefore fail to find the interface files produced by GHC. To avoid recompilation, we need to instruct Cabal to pass --no-tmp-comp-dir to Haddock so as to not use a temporary directory and thus re-use existing interface files.
In the last commits we started re-using GHC's interface files and objects in haddock in order to avoid recompilation. However, if haddock is run with different flags than GHC (say, for example, `haddock-options: -DSomethingCustom`), it will recompile the interfaces and objects. This commit introduces a guardrail to the process of re-using GHC's compilation files: instead of running haddock directly on the directories where GHC placed its output, copy the directory contents to a temporary directory and point haddock to the objects and interfaces there. Even if recompilation is triggered by haddock, the objects produced by GHC will be left untouched.
b9cf3ac
to
a1e14a7
Compare
Manual QA was successful on WSL (Ubuntu 22.04, Windows 11). Here is the log for anyone interested: Reproduce Bug with cabal latest (3.10.3.0) Resolving dependencies...
^ lib was compiled twice. Reproduce Fix with cabal head Resolving dependencies...
Lib.hs was compiled once ✅ |
Unless I am doing something wrong, QA does not pass on Windows/MSYS2, it seems
|
I don't see an obvious difference in the flags passed to GHC, more than the call to ❯ diff ghcargs optghc-haddockargs
4c4
< dist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build
---
> dist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build
6c6
< dist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build
---
> dist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\haddock-objs-10336
8c8
< dist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build
---
> dist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\haddock-his-10336
10c10
< dist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build\extra-compilation-artifacts\hie
---
> dist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build\\extra-compilation-artifacts\\hie
12c12
< dist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build
---
> dist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build
15,21c15,21
< -idist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build
< -idist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build\autogen
< -idist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build\global-autogen
< -Idist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build\autogen
< -Idist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build\global-autogen
< -Idist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build
< -IC:\msys64\ucrt64\include
---
> -idist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build
> -idist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build\\autogen
> -idist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build\\global-autogen
> -Idist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build\\autogen
> -Idist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build\\global-autogen
> -Idist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build
> -IC:\\msys64\\ucrt64\\include
23c23
< -optPdist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\build\autogen\cabal_macros.h
---
> -optPdist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\build\\autogen\\cabal_macros.h
30c30
< C:\Users\Javier\AppData\Local\cabal\store\ghc-9.8.2-7294\package.db
---
> C:\\Users\\Javier\\AppData\\Local\\cabal\\store\\ghc-9.8.2-7294\\package.db
32c32
< C:\Users\Javier\code\bb\dist-newstyle\packagedb\ghc-9.8.2
---
> C:\\Users\\Javier\\code\\bb\\dist-newstyle\\packagedb\\ghc-9.8.2
34c34
< C:\Users\Javier\code\bb\.\dist-newstyle\build\x86_64-windows\ghc-9.8.2\deps-0.1.0\package.conf.inplace
---
> C:\\Users\\Javier\\code\\bb\\.\\dist-newstyle\\build\\x86_64-windows\\ghc-9.8.2\\deps-0.1.0\\package.conf.inplace However computing the flag fingerprint uses |
Regarding a backport to 3.12, this PR seems to break the 3.12 Cabal the library API, so it seems too late. I'd vote for 3.14. |
* CI: install changelog-d from bindist (#10048) This will avoid build problems when the GHC in the CI environment is updated sooner than expected. Previous breakage: #9177 (comment) (cherry picked from commit d1a6ced) # Conflicts: # .github/workflows/changelogs.yml * !fixup resolve conflicts --------- Co-authored-by: Francesco Gazzetta <[email protected]> Co-authored-by: Artem Pelenitsyn <[email protected]> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
It works with (a locally patched to trace the module flags) GHC 9.11 (HEAD as of today):
|
@jasagredo: so it's good news that it works, but bad news that it didn't help us pinpoint the problem when run with GHC 9.8.2? How could, e.g., @FinleyMcIlwaine or @alt-romes help you diagnose further? |
I will try tomorrow with ghc 9.8. @FinleyMcIlwaine provided me with the tracing patch today at Zurihac. |
Might this be related by chance? https://gitlab.haskell.org/ghc/ghc/-/issues/24964 |
The library had no foreign symbols 🤔. It turns out to be even more puzzling than anticipated. So a locally built GHC 9.8.2 does not show the bug. The bindist 9.8.2 and 9.6.5 GHCs do show the bug, which makes no sense, even more because in Windows it will use an in-tree clang and libs, so it shouldn't depend on the environment in which it is being built. I looked at this with Ben too, with no luck. His suggestion was to build GHC again tomorrow with |
GHC 9.8.2 built with |
@jasagredo: Was the answer to the prayers to your liking? |
@alt-romes and I found some sad issues with this at Zurihac. Notably: Turning on So in practice this only benefits you for changes to your local files, which is a shame. It seems like it gives us the tools to avoid rebuilding dependencies (and your local modules!) when turning on documentation, but we don't currently do things in the right way for that to happen. (Probably this should be in a new issue...) |
While good intention, in practice it doesn't work, and you cannot even override it.
|
Duncan alludes:
With haddocks built from ordinary interface files, that should be more easily doable (but still not easy, as it's new codepaths) |
Yes, Rodrigo and I discussed that as a possibility, I think that would be the most graceful solution. |
It was not unfortunately. So any GHC I build locally, even using the same config options as the CI, says "Module flags unchanged". Every GHC distributed and a GHC I downloaded from a very recent pipeline on GHC's gitlab says "Module flags changed". I think I will try to apply @FinleyMcIlwaine's patch to GHC-head and get a binary with it from CI, as it seems to be the only building environment that can reproduce this behavior. Such a mysterious issue. |
I made a new issue for the recompilation issue, since I think it's distinct from the question of whether |
@FinleyMcIlwaine I got a CI GHC with your patch, and bingo!
So the diff:
|
Upstream issue https://gitlab.haskell.org/ghc/ghc/-/issues/24988 |
We no longer define the `__HADDOCK_VERSION__` macro when invoking GHC through Haddock, since doing so essentially guarantees recompilation during documentation generation. We audited all uses of `__HADDOCK_VERSION__` in hackage, ensuring there was a reasonable path forward to migrate away from using `__HADDOCK_VERSION__` for each, while generating the same documentation as it did before. If you are a user of `__HADDOCK_VERSION__`, please take a look at the discussion in haskell#9177 and reach out to us if your use case is not covered. Reverts the version-cpp flag introduced in the previous commit to avoid a workaround. Instead, we get rid of the problem (`__HADDOCK_VERSION__`) at its root. See the discussion in haskell#9177
Haddock no longer writes compilation files by default, so we do not need to pass tmp dirs for
-hidir
,-stubdir
, and-odir
via--optghc
. Indeed, we do not want to do so, since it results in recompilation for every invocation of Haddock via Cabal. This commit stops this from happening for Haddock versions 2.28 or greater (when Hi Haddock was introduced).This commit also prevents the definition of the
__HADDOCK_VERSION__
macro when invoking GHC through Haddock by default, which necessary to avoid recompilation during documentation generation, since the macro invalidates existing build results. Since a limited set of users may rely on this macro still, we introduce the--haddock-version-cpp
flag andhaddock-version-cpp:
cabal.project field, which enable the definition of the__HADDOCK_VERSION__
macro when invoking GHC through Haddock.This commit also renames the
--haddock-lib
flag to--haddock-resources-dir
(andhaddock-lib:
cabal.project field tohaddock-resources-dir:
), and adds this flag to the users guide since it was missing an entry. This also allows us to add this field tocabal-install:test:integration-tests2
, since it is no longer ambiguous with the--lib
flag.This commit also causes
documentation: true
or--enable-documentation
to imply-haddock
for GHC.Also, since Haddock >= 2.29 is renaming
--lib
to--resources-dir
, this commit switches the flag provided to Haddock using a backwards compatible condition based on the Haddock version.QA Notes
Consider a Haskell package with dependencies. We should be able to simply configure the package so that doing
cabal build
followed bycabal haddock
does not trigger a recompilation. Use the followingdeps.cabal
file:The following
cabal.project
file:And any
Lib.hs
will do:This configuration should avoid recompilation during documentation generation. To verify this, with a package such as above, just run
cabal build
. Since documentation is enabled, the package will first build then generate Haddocks. Since Haddock is passing-ddump-hi-diffs
to GHC, you should see output indicating that no recompilation checks have triggered during the Haddock phase:This means recompilation was successfully avoided.
NOTE: I am working on making
no-tmp-comp-dir
the default behavior for Haddock 2.29, but until then it is necessary to avoid recompilation.Fixes #9175