Spark 3.5.3, .NET 8, CoGrouped UDFs, Fixes, Dependencies and Documentation #1178
Conversation
…4.x, 3.3.4+ as well.
@dotnet-policy-service agree
…alization in Worker.
Can you share how many of the unit tests pass? The UDF unit tests have not been updated.
Hello @GeorgeS2019, they do. I saw your issue; my environment probably uses UTF-8 by default.
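If the encoding difference is the culprit, here is a minimal sketch of a workaround, assuming the problem is a non-UTF-8 default console encoding (this is not part of the PR):

```csharp
using System;
using System.Text;

// Illustrative workaround, not code from this PR: force UTF-8 in the driver
// process on environments whose default encoding is not UTF-8, so strings
// round-trip consistently between the .NET side and the JVM.
Console.OutputEncoding = Encoding.UTF8;
Console.InputEncoding = Encoding.UTF8;
```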
What is the status of this PR?
It works, the tests pass, and performance-wise it's the best solution I've found for integrating .NET with Spark. The next steps are on Microsoft's side. I'm also working on implementing CoGrouped UDFs and plan to push those updates here as well; a sketch of the existing grouped-map shape follows below.
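For context, a minimal sketch of the existing Arrow grouped-map UDF shape in Microsoft.Spark, assuming the RelationalGroupedDataset.Apply overload that takes a Func<RecordBatch, RecordBatch>; a CoGrouped UDF would presumably receive two batches, one per co-grouped side. The exact API added in this PR may differ.

```csharp
using System.Linq;
using Apache.Arrow;
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Types;

static class GroupedMapSketch
{
    // Existing grouped-map shape: one Arrow RecordBatch in, one out, per group.
    // The CoGrouped variant discussed in this PR would presumably take two
    // batches (one per co-grouped side) instead of one.
    public static DataFrame CountPerGroup(DataFrame df) =>
        df.GroupBy("id").Apply(
            new StructType(new[]
            {
                new StructField("id", new IntegerType()),
                new StructField("n", new IntegerType()),
            }),
            batch =>
            {
                int n = batch.Length;
                return new RecordBatch(
                    new Schema.Builder()
                        .Field(new Field("id", Apache.Arrow.Types.Int32Type.Default, nullable: true))
                        .Field(new Field("n", Apache.Arrow.Types.Int32Type.Default, nullable: true))
                        .Build(),
                    new IArrowArray[]
                    {
                        // Pass the group's id column through (assumed Int32 here).
                        batch.Column(0),
                        // Emit the group's row count once per row.
                        new Int32Array.Builder().AppendRange(Enumerable.Repeat(n, n)).Build(),
                    },
                    n);
            });
}
```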
Can you investigate whether your solution works in a polyglot .NET Interactive notebook? Previously we all had problems with the UDFs after the adjustments to migrate to .NET 6.
Any idea who is "in charge" of this repo?
I can take a look, but only if a lonely evening with bad weather rolls around :) No promises, as this isn't my primary focus. There are two suggestions from developers that might help: the first is a separate code cell, and the second is a separate environment variable (see the sketch below). Have you tried both approaches, and does the issue still persist?
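For reference, a sketch of the environment-variable suggestion, assuming it refers to the documented DOTNET_ASSEMBLY_SEARCH_PATHS assembly-probing variable; the path below is hypothetical:

```csharp
using System;

// Sketch: point the .NET worker's assembly probing at the folder where the
// notebook emits the compiled UDF assemblies, so the worker can load them.
// The directory shown here is a hypothetical placeholder.
Environment.SetEnvironmentVariable(
    "DOTNET_ASSEMBLY_SEARCH_PATHS",
    @"C:\path\to\notebook\udf\assemblies");
```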
… users with configuration
Changes:
Tested with:
Spark
Each time on stop there's an exception that doesn't affect execution:
ERROR DotnetBackendHandler: Exception caught: java.net.SocketException: Connection reset
	at java.base/sun.nio.ch.SocketChannelImpl.throwConnectionReset(SocketChannelImpl.java:394)
Databricks:
On Databricks, UseArrow is always true, and Vector UDFs don't work because Spark splits the RecordBatch into a collection of batches while the code expects a single batch (see the sketch below).
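A minimal sketch of the shape of a fix, assuming the worker can be adapted to consume the batches as a sequence; ProcessBatches and applyUdf are hypothetical names, not code from this PR:

```csharp
using System;
using System.Collections.Generic;
using Apache.Arrow;

static class BatchCompat
{
    // Databricks can hand the worker several Arrow batches per group instead
    // of exactly one. Applying the UDF to each batch in turn removes the
    // single-batch assumption. applyUdf is a hypothetical stand-in for the
    // user's vector UDF.
    public static IEnumerable<RecordBatch> ProcessBatches(
        IEnumerable<RecordBatch> batches,
        Func<RecordBatch, RecordBatch> applyUdf)
    {
        foreach (RecordBatch batch in batches)
        {
            yield return applyUdf(batch);
        }
    }
}
```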
Affected tickets: