RFC: Split off pyarrow-* builds #1381
Comments
🔁 😀 It sounds like the current situation is putting an undue strain on the common resources, so +1 from me. We will sync the recipes after the split as always, or are there any specific objections, @raulcd?
I'm fine with splitting things up again. In the future, when we have rattler-build support, we can reevaluate again, but for now, I feel that missing automation and the massive rerendering times are a bigger strain than maintaining two separate feedstocks.
Ok with the consensus here. Though I think there have been a few changes in conda-build, which are about to be released in 24.5.0 (conda/conda-build#5319), to help with some performance issues. So it may be worth seeing how that goes.
I think the fact that changes to the tooling are necessary to make the current setup work (better) is a pretty good indication that changes are needed ^^
OK, thanks for the inputs so far! Another open question: do we want to do this from v16 onwards, or for all currently supported versions? I'd start with v16 for now, and if we want we can still create extra branches on the pyarrow feedstock later.
Which branches/versions are having issues currently?
I don't know if there are open migrations where the bot is failing to open a PR, but the rerender times mainly went up with v14, where we split
We should do the split for v16 now, and for v14 and v15 later if we see the need for it. Splitting already-released versions is more likely to break things. Let's first get some experience with doing it for v16.
Yes, we have open migrations like the
Does someone else from upstream arrow still want to comment before we move ahead? @kkraus14 @raulcd @kou @pitrou Thanks @assignUser for chiming in already!
The pyarrow feedstock has now been unarchived, and I have a PR that moves over the python bits from #1376 to conda-forge/pyarrow-feedstock#111. Neither is merged yet, but that way it's hopefully easier to see what the split would look like.
In my recent experience rerendering the currently supported branches (conda-forge/conda-forge-pinning-feedstock#5815 (comment)), I don't think it would be worth the effort to port
Splitting the feedstocks from 16.0.0 onwards sounds like a sensible approach to me. As for the older versions, I would agree that unless required we should continue with the status quo. We have to update those, but we haven't done so lately.
OK, it seems everyone is in agreement then! :) I'll get to merging then, thanks for the inputs everyone!
The more things change, the more they stay the same...
Almost 4 years ago, the https://github.com/conda-forge/pyarrow-feedstock feedstock was archived and the builds were moved here in #146. The package split alluded to in #93 and clarified in #862 took a bit longer to materialize (in #875). With the impending #1376, we're now getting a very hefty 30(!) artefacts per CI job,
List of artefacts as of v16 +
pyarrow{,-core,-all}
...which is also pushing conda-smithy and the rerender bots to their limits (in terms of rendering time), cf. conda-forge/conda-forge-pinning-feedstock#5815. This also spurred some performance improvements in conda-build, but fundamentally the issue remains that this is getting very large. Quoting @beckermr from the pinning issue:
At first I thought this was not going to work, but after a closer look and especially with the split after #875, there's actually a pretty clean separation between the C++ libarrow* side and the python pyarrow* bits. In short, I think there's no technical barrier to do this.
Here's some pros/cons as I see them.
Cons:
For example, if we were to enable orc-for-pyarrow on windows (which will be possible as of orc 2.0.1), we'd have to make sure on the pyarrow side that a too-old libarrow, which doesn't have that support yet, doesn't get pulled in. In this case it works because we've had orc-support enabled in libarrow for a long time.
Pros:
libarrow bits are migrated very often without actually requiring a rebuild of pyarrow; in the current recipe, this always leads to pushing new pyarrow builds as well.
Assuming we want to do this, we could unarchive pyarrow, and update it to v16. In that case, it would make sense to split the pyarrow bits off from #1376. I'm not sure we would want to touch any of the older still-supported versions for this, but that'd be possible as well (maybe from v13, as we're about to drop v12 once we migrate v16).
Thoughts @conda-forge/arrow-cpp?
CC @conda-forge/core
Footnotes
arguably also related to the fact that we're not running the test suite for libarrow, which is hard because it unconditionally depends on the very-hard-to-package-because-incompatible-with-our-pinnings testbench