
Tests: Set PySpark driver host to localhost #1466

Draft
smaheshwar-pltr wants to merge 1 commit into main
Conversation

@smaheshwar-pltr commented Dec 23, 2024

This change lets me (personally) run the integration tests locally. Before, I was getting:

py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. [...]
E                   : java.io.IOException: Failed to connect to [IP-REDACTED]
E                   	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:294)
E                   	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:214)
...
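
For context, a minimal sketch of the kind of change the PR title describes: explicitly pinning the Spark driver host to localhost when the test SparkSession is built. The actual diff isn't shown in this thread, so the builder shape and app name below are assumptions; only the spark.driver.host setting comes from the PR title.

    # Sketch (assumed): bind the PySpark driver to localhost for integration tests.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("integration-tests")  # illustrative app name
        .config("spark.driver.host", "localhost")  # avoid binding to an unreachable IP
        .getOrCreate()
    )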

@smaheshwar-pltr (Author)

Not sure if useful. @kevinjqliu, mind taking a peek?

@kevinjqliu (Contributor)

Interesting, that's the first time I've seen this issue. Do you have a remote dev environment?
The typical setup is to run the integration-test Docker containers on a local laptop, and PySpark will automatically find the driver.
That said, I think explicitly binding to localhost should still be fine.
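
An alternative workaround, not part of this PR, is to force Spark's bind address through the standard SPARK_LOCAL_IP environment variable before the session starts; the snippet below is just an illustration of that option.

    # Alternative (assumed workaround): set SPARK_LOCAL_IP before Spark starts,
    # so the driver binds to the loopback address instead of the machine's
    # advertised hostname/IP.
    import os

    os.environ["SPARK_LOCAL_IP"] = "127.0.0.1"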

@kevinjqliu requested a review from Fokko on December 23, 2024 18:50
@Fokko (Contributor) commented Dec 24, 2024

@smaheshwar-pltr thanks for raising this. I haven't seen this before either. Can you check if your local hostname is configured correctly?
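
A quick way to check what the local hostname resolves to (a diagnostic suggestion, not something from the thread):

    # Diagnostic sketch: print the local hostname and the address it resolves to.
    # If it resolves to an address the Spark driver can't actually be reached on,
    # a connection failure like the traceback above is the likely result.
    import socket

    hostname = socket.gethostname()
    print(hostname, socket.gethostbyname(hostname))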
