Generics example #70
Comments
The funniest part is that the requirement overflow happens in the impl that's marked as (so maybe those |
@aldanor This is exciting, thanks for working on this! I ran into a lot of those overflow errors when developing this crate, and had to be strategic about the trait bounds. You're absolutely right that GATs in 1.65 will result in a massive rewrite and simplification of this crate, and I'm personally looking forward to it. Having said that, I'm surprised it's not working in its current form. Please give me a few days and I'll come up with a hand-written example (or at least an explanation of where I get stuck), in case you want to keep working on this (though it might be strategic to wait until GATs come in and see how they impact this crate).
I've posted two examples here, maybe it will save you some time: https://gist.github.com/aldanor/de7accf7bb40e83bec6e1e51f08f1190 (one without Yea, with GATs things should probably get rewritten, but all the generics boilerplate in proc-macros will stay largely the same regardless, so it needs to be done... (there are also various minor bugs which I've fixed along the way - e.g., if you have a field called "data_type" or "validity" in your struct, compilation will fail, etc). So maybe we can manage getting generics (at least struct generics) in first anyway. And with GATs, we don't have to wait till 1.65 - it should be fine on the beta toolchain, so it could land soon after 1.65 is released because why not :) // I'm still a bit confused re: the |
Just to mention it, there's another problem I have (which shows up in one of the gists above, the one with bounds enabled) - in I have a suspicion that this wasn't properly working in the non-generic version as well, but I may be wrong. Would appreciate if you could clarify that as well. E.g.:
    115 | impl<__T: std::borrow::Borrow<Foo<A, B>>, A, B> arrow2::array::TryPush<Option<__T>>
        |                                           - this type parameter
    ...
    132 |     <A as arrow2_convert::serialize::ArrowSerialize>::arrow_serialize(
        |     ----------------------------------------------------------------- arguments to this function are incorrect
    133 |         i.a.borrow(),
        |         ^^^^^^^^^^^^ expected associated type, found type parameter `A`
        |
        = note: expected reference `&<A as ArrowField>::Type`
                   found reference `&A`
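The mismatch reported above (a type parameter where one of its associated types is expected) can be reproduced in isolation. The following is a minimal sketch under stated assumptions: `Field`, `Serialize`, and `serialize_foo` are hypothetical stand-ins with the same shape as the traits in the error, not the arrow2_convert API. It shows one possible way to make such a call type-check, by constraining the associated type to be the parameter itself.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the
// arrow2_convert traits, they merely have a similar shape.
trait Field {
    type Type;
}

trait Serialize: Field {
    // Takes the associated type, not `Self`.
    fn serialize(v: &<Self as Field>::Type) -> String;
}

impl Field for i32 {
    type Type = i32;
}
impl Serialize for i32 {
    fn serialize(v: &i32) -> String {
        v.to_string()
    }
}

impl Field for bool {
    type Type = bool;
}
impl Serialize for bool {
    fn serialize(v: &bool) -> String {
        v.to_string()
    }
}

struct Foo<A, B> {
    a: A,
    b: B,
}

// With only `A: Serialize`, passing `&foo.a` (a `&A`) where
// `&<A as Field>::Type` is expected is a type mismatch, because the
// compiler cannot assume the two types are equal.
// The extra `Field<Type = A>` bound states that equality explicitly.
fn serialize_foo<A, B>(foo: &Foo<A, B>) -> String
where
    A: Serialize + Field<Type = A>,
    B: Serialize + Field<Type = B>,
{
    format!(
        "{},{}",
        <A as Serialize>::serialize(&foo.a),
        <B as Serialize>::serialize(&foo.b)
    )
}
```

Calling `serialize_foo(&Foo { a: 1i32, b: true })` compiles because both `i32` and `bool` map to themselves in this toy setup. Whether an equality bound like this is appropriate for the real derive output is a separate question, since it would rule out fields whose associated type differs from the field type.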
Thanks for the thoughts and questions @aldanor.
If you want to check these in as part of another issue/PR please feel free to create one and we can get it reviewed/merged.
I'm working on an updated example for you @aldanor but essentially A and B (or any generic types for that matter) don't need to implement the arrow2_convert traits. For example consider the following:
In this case
With regards to the errors and readability: as we discussed above, GATs will clean up the code. Part of the reason the code got a bit convoluted is to support nested type conversions, specifically
I'll post an example shortly. If that doesn't help clarify, then I'm happy to discuss this further and offer as much clarity as needed.
This is hard to achieve automatically - note, for example, that even standard library traits like
One solution to this is to add an attr (container attr), marking this type as "foreign" / "ignored". Serde also has custom bounds for scenarios like this (because it's impossible to figure it out automatically in some cases) - https://serde.rs/container-attrs.html#bound. E.g. (this can be done differently, this is just one possible way):
    #[derive(ArrowField)]
    #[arrow_field(ignore_bound = "CustomAlloc")]
    struct StructWithCustomAllocators<A, B, CustomAlloc>
    where
        CustomAlloc: Allocator,
    {
        a: Vec<A, CustomAlloc>,
        b: Vec<B, CustomAlloc>,
    }
Re:
    #[automatically_derived]
    impl<A: ::core::fmt::Debug, B: ::core::fmt::Debug,
        CustomAlloc: ::core::fmt::Debug> ::core::fmt::Debug for
        StructWithCustomAllocators<A, B, CustomAlloc> where CustomAlloc: Allocator
    {
        fn fmt(&self, f: &mut ::core::fmt::Formatter) -> ::core::fmt::Result {
            ::core::fmt::Formatter::debug_struct_field2_finish(f,
                "StructWithCustomAllocators", "a", &&self.a, "b", &&self.b)
        }
    }
Does it make sense? Not really... but that's how it is (in fact, as you can see in the gist I shared, I had to implement
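To make the point about derive-generated bounds concrete, here is a small sketch (all names are hypothetical and unrelated to arrow2_convert). `#[derive(Debug)]` adds a `Debug` bound to every type parameter, even one that is never stored, so the usual workaround is a hand-written impl that omits the bound:

```rust
use std::fmt;
use std::marker::PhantomData;

// A type that deliberately does not implement `Debug`.
struct NotDebug;

// With `#[derive(Debug)]` here, `Tagged<NotDebug>` would not be `Debug`,
// because the derive adds `T: Debug` even though no `T` value is stored.
struct Tagged<T> {
    id: u32,
    _marker: PhantomData<T>,
}

// Hand-written impl without any bound on `T`.
impl<T> fmt::Debug for Tagged<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("Tagged").field("id", &self.id).finish()
    }
}

// Formats a value whose type parameter is not `Debug`.
fn describe() -> String {
    let value: Tagged<NotDebug> = Tagged {
        id: 7,
        _marker: PhantomData,
    };
    format!("{:?}", value)
}
```

A derive macro cannot generally know which parameters need the bound, which is why an opt-out attribute along the lines of serde's `bound` is one reasonable design.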
@aldanor I'm able to reproduce the overflow error from your example, I'll need some time to think on it. The |
@ncpenke Sounds good. I've tried to simplify it further, here's a playground that fails: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=3577126d5818fd7678a9343bd29f6569 |
And here's an example of the same thing (I think?) that compiles on 1.65 beta: https://play.rust-lang.org/?version=beta&mode=debug&edition=2021&gist=a3fb2e3e7e8248b7a6c9c721a57fc351 Wonder if that's something like what you had in mind with GATs? :) (note that we no longer have to carry around those stupid |
I just skimmed through it, and it's along the lines of how I was hoping it would work. Thanks for taking the time to post it. Honestly, it was pretty painful getting the crate this far without GATs. There are various points at which the overflow error crops up (sometimes for legitimate use cases where there are no circular dependencies). My proposal is that we create a feature branch for the next release, which will merge to main after Rust 1.65 is stable. I'll move all development except hot fixes to this feature branch. You can merge your generics changes there. If you also want to take on the refactoring of the crate to use GATs, please let me know; otherwise I should be able to get to it next week. I think that will be far more practical and time-efficient than trying to get this working without GATs. I'm sure we could do it if we really tried, but with the release of 1.65 at most a few weeks away, it doesn't seem worthwhile.
Yea, I agree. With GATs landing in a few weeks, spending much time on making code work that we'll throw out anyway doesn't seem reasonable. I could certainly try to take it on (I'll have to stash my current generics branch then, try and do the GATs refactor, and then get back to generics again) - this is why I've posted the
Awesome, I just created https://github.com/DataEngineeringLabs/arrow2-convert/tree/feature/v0.4.0. The IterRef sketch looked good to me. I remember thinking, when I was working through this, that ArrowSerialize and ArrowField could be cleaned up with GATs as well.
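For readers who have not seen the linked playgrounds, the following is a minimal sketch of the kind of trait that GATs (stable since Rust 1.65) make possible. It is an assumption-laden illustration, not the sketch from the thread: the trait name `IterRef` is borrowed from the discussion, but its definition here, and the helper `sum_all`, are hypothetical. The iterator type is parameterized by the borrow lifetime directly, so no `for<'a>` bounds on a reference type are needed.

```rust
// Hypothetical trait for illustration; not the crate's actual API.
trait IterRef {
    type Item;

    // Generic associated type: the iterator borrows from `self` for `'a`.
    type Iter<'a>: Iterator<Item = &'a Self::Item>
    where
        Self: 'a;

    fn iter_ref<'a>(&'a self) -> Self::Iter<'a>;
}

impl<T> IterRef for Vec<T> {
    type Item = T;
    type Iter<'a> = std::slice::Iter<'a, T> where Self: 'a;

    fn iter_ref<'a>(&'a self) -> Self::Iter<'a> {
        self.iter()
    }
}

// Generic code only needs the plain trait bound; the lifetime is
// introduced at the call to `iter_ref`, not in the `where` clause.
fn sum_all<C>(c: &C) -> i64
where
    C: IterRef<Item = i64>,
{
    c.iter_ref().sum()
}
```

For example, `sum_all(&vec![1i64, 2, 3])` borrows the vector, iterates by reference, and adds the elements.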
@ncpenke Hey, apologies for falling out for a bit, was way too busy with life and work, but finally got some time to join back if it suits :) Wonder what's the latest story, and is it still the plan to continue on with generics + GATs? (will take me a little while to read through all of my own notes and sketch code to get back into context though) |
@aldanor Great to hear from you. No problem at all, I've been busy as well. Yes the plan is still to continue with generics + GATs. The project had a few contributions come in, so it was worth doing a release. Once I get some breathing room I'll rebase and wrap up my generic changes, so that you can make the GAT changes. I'm sorry this has stretched so long but let's get it done. |
@ncpenke Hey - I'm in the process of adding generics support (to structs, as a first step), which is a pretty painful process to say the least 🤣
As a matter of fact, most of it compiles, except a few weird quirks. It would really help if you could provide a hand-written example of how it's supposed to work so I wouldn't be guessing blindly.
First, deserialization. There's this bound in the ArrowDeserialize trait that seems to be causing problems:
Basically, if for some struct Foo<A, B> for all implementations we require
this leads to
However, if we require
this results in
Wonder if you'd have any ideas on this, or maybe provide a trivial working example? 🤔
(I have a feeling that those trait bounds together and all the for<'_> are messing things up big time; perhaps this could be rewritten once GATs are stabilized in less than a month from now in 1.65?... idk)