Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write CEP about virtual packages #103
base: main
Are you sure you want to change the base?
Write CEP about virtual packages #103
Changes from all commits
ebcfa8e
cda0a8c
6f2d60f
9a793b9
9b4b7a7
59462ed
cedde6a
df9d61d
72022f4
7c3b595
e6fcda5
40b4442
bc6d6a6
5bf10f1
7a2c7e5
cd06910
e520336
1b01182
2148654
eaeb222
9a95c6b
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The microarch-level packages in conda-forge depend on
__archspec
to provide microarchitecture-level (e.g. x86-64-v2) meta-packages.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we aware of any existing packages that actually depend on
__archspec
?If not, I would rather we not make this virtual package mandatory at this time, mostly because "what microarchitecture is this CPU?" is a question that can get complicated quickly; see, e.g., ARM big.LITTLE, Intel P/E cores, Intel Xeon 6, etc. To me, this virtual package is still (pseudo-)experimental, in the sense that we still need to work out how package maintainers should/want to use this package. IMO,
__archspec
should be its own CEP or set of CEPs (cf. #59).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#103 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im wondering why this must always be present. For some target platform it doesnt seem to make sense (any platform where the archspec should be reported as
0
basically). Shouldn't we instead just omit the archspec in that case?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An interesting observation from this table is that there are a number of entries here that are not present in the archspec database (armv6l, armv7l, s390x, arm64) and are thus not actually existing microarchitectures. I guess we can make something up. But for
arm64
I think it would be more appropriate to useaarch64
instead.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For rattler we have the following behavior:
win-arm64 -> aarch64
osx-arm64 -> m1
I think this makes sense because these are actual existing microarchitectures and they are also both the lowest possible supported microarchitectures on these platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should
riscv32
also be added?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add
wasm32
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cuDriverGetVersion
returns the version of the installed driver, not the oldest CUDA version supported by the detected drivers. Due to backwards compatibility support newer drivers also support older versions.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this returns the CUDA version as seen by the driver. The docs are not super clear. There's also
cudaRuntimeGetVersion
At least this is whatconda/conda
uses.@jakirkham could you help clarify this? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think the driver CUDA version might not be the "oldest CUDA version supported by the detected driver". E.g. If my driver is 12.6 it might also support older CUDA versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asking offline to see if others have thoughts here
That said, here are some rough thoughts. Starting with some background
When the CUDA Toolkit is typically shipped (like in the installer), it ships with a full suite of things including the driver library. However in Conda, we ship pretty much everything else except the driver library. So with Conda, the user is on the hook for ensuring the driver library is installed, which would happen when they install the driver
So this raises a question. How does Conda decide what version of the driver is compatible with a particular CUDA Toolkit?
This is what
__cuda
is solving. It is giving us the driver library's associated CUDA Toolkit version. IOW if that driver library was shipped in an installer, this is that installer's CUDA Toolkit versionSo now when we try to install the CUDA Toolkit libraries, we can answer the question, will the driver be able to handle these?
Originally the answer was the CUDA Toolkit libraries we install could be no newer than the CUDA Toolkit the driver came from. We had checked this down to the major minor version (patch version was allowed to float). So if this driver check reported CUDA 10.1, we could only use CUDA Toolkit libraries from 10.1 or older. Eventually we relaxed this to major only for 11.2+ and 12
So really
__cuda
's version is the upper bound on the libraries we can useSorry for the long winded answer. Though hopefully the additional details provide context. Perhaps this can be incorporated into the text above somehow (after some discussion and Q&A)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"...if the target platform's C standard library is not the GNU C Library". Not sure if we codify the assumption of all Linux systems inherently provide GNU
libc
or that only Linux uses GNUlibc
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current conda implementation assumes the latter "only Linux uses GNU libc". See https://github.com/conda/conda/blob/e74a2b9d8a74837afc3bcbef609fe4bd29572e16/conda/plugins/virtual_packages/linux.py#L14-L16.
If that's incorrect, (1) we need to fix that in
conda/conda
, and (2) I will update this paragraph.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we provide some guidance here on how the default value should be selected? If we don't, I could see situations arising where two different conda install tools or two different versions of the same install tool provide different default values on the same system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conda/conda
sets it to2.5
(yes, I know).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rattler also skips the virtual package if a libc implementation cannot be found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Linux kernel upstream only defines
{major}.{minor}.{micro}
; anything beyond that (including the{patch}
component you wrote) is part of the distribution kernel's version string. I don't think conda tooling should expose those components since their semantics of those will vary from distribution to distributon; e.g., patch 42 on Fedora may differ from patch 42 on Ubuntu.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, this contradicts this comment in
conda/conda
. I'm happy to trim to three fields, but we'll need a source.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ha! Since Linux 3.0, "[mainline] kernels with 3-part versions are -stable kernels"; however, the mainline 2.6 series had 4-part version numbers. The existing "3- or 4-part" comment in
conda/conda
comes from the fact that 2.6 is the upstream for the CentOS/RHEL 6 kernel, which was actively supported by Anaconda and conda-forge at the time I wrote that patch.In light of that, I suggest replacing:
with something like:
I think that captures the intent of my previous comment, which was to ensure:
__linux
virtual package; and(That said, my suggested language does allow for non-release/development/
-rcN
kernels, but I suspect that user base is relatively small and able to cope if something goes horribly wrong.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will return the SYSTEM_VERSION_COMPAT version if the Python interpreter running the command was built against the 10.15 SDK or earlier.
See https://eclecticlight.co/2020/08/13/macos-version-numbering-isnt-so-simple/ for details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICR, if you start the Python interpreter with SYSTEM_VERSION_COMPAT=0 it returns the 11.x based version, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, running setting the environment variable will return a >=11 version.
My main concern here the "MUST NOT" language around the SYSTEM_VERSION_COMPAT workaround. I agree with the idea but this is not the case for the current version of conda (see conda/conda#13832). The example an this issue still reports a 10.16 version for
__osx
. Changing this to "SHOULD" would be reasonable, especially given that there are many releases of tools/packages that will report the compatible version that already exist.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could consider that a bug in conda and submit a fix for 25.1 (as we are doing for __win). I can survey public repodata for __osx usage in the wild if that helps inform this decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See conda/conda@
d94af02
(#14449)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than list the specific target platforms now and having to adjust this CEP if/when new platforms are adopted, I would suggest language like "when the target platform is sufficiently POSIX-y" and list the attributes necessary for that to be true (e.g., uses
/
for path delimiters, supportsfork(3)
, etc.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the sentiment, but I'm going to need some help or references to compile that list 😬