-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add <JDBCConnection> backend #1359
Comments
I did some digging. The warning is referring entirely to an issue in dbplyr itself. It essentially saying "the developers of dbplyr haven't yet written a backend for JDBC that is up to their own current standards". Details
|
Currently, dbplyr doesn't support JDBC but only ODBC. |
I actually would really like to. I'm not sure how big of a lift it is for me, my experience in developing R packages is pretty limited. Despite not having a specific backend, I've found that dbplyr already works fantastically well over JDBC. So far the only issue I've found is due to a problem in RJDBC, not dbplyr. There are some places where the generated SQL doesn't work, but I'm not sure that can be solved with a JDBC backend, because JDBC is just a way to connect to a variety of databases with different SQL dialects. In my recent experience, I find SQL generation glitches can be easily routed around by using From this perspective, I'm not sure there a JDBC backend has anything of value to do other than silence this warning. And maybe warn the user that |
@shearerpmm I think those are still valuable contributions 😄 |
The little I'm able to do is worth my time. IMHO dbplyr is the best tool available today for building complex data pipelines in SQL databases. I know about the alternatives and the competition is not close. |
Big +1 on this request and dbplyr being invaluable I've had to jump over to RJDBC to handle testing an enterprise db migration from an old DB2 to a new SQL Server. i would have used odbc connections, but the recent issues with odbc and db2 meant we were stuck using R 4.1.1 and thus hit memory limits etc (dealing with many tables that have tens of millions of entries and a dozen rows) there are definitely still a lot of db2 databases out there and I imagine that other folks have had to do similar transitions if they wanted to update R itself, so would expect that there's likely a pick-up in need compared to a few years ago when you could use any old ODBC to connect to a db2 so wouldn't hit this problem. so far i've been able to generally just hot-swap the source between the db2 and sqlserver but i've just run into this issue of not having a "proper" backend as i've been trying to push more compute towards the database. in particular, i'm trying to do things like inlining tables to avoid collecting the massive dataset just to do a join (rather, i'd like to inline the table and then join it before running a collect statement), and this is what isn't working without the backend. (getting the same warning as OP posted). i'm not really adept at the backend coding, although i would be up for learning to support this, but i definitely have multiple use cases and would be happy to test things in the wild or provide reprexes if either would be helpful |
@juniperlsimonis what are the problems with odbc and DB2? Could you file an issue in https://github.com/r-dbi/odbc? |
@hadley see ibm's support page there have been multiple issues in odbc that were filed like this one . the solutions folks have been able to implement haven't worked on my end given enterprise limitations. |
To get rJava working on my arm64 mac:
|
I don't think there's much dbplyr can do with RJDBC since it doesn't appear to expose a way to figure out the underlying database type from the driver/connection: library(RJDBC)
#> Loading required package: DBI
#> Loading required package: rJava
# Download from https://github.com/xerial/sqlite-jdbc
class_path <- c(
"~/Downloads/sqlite-jdbc-3.45.1.0.jar",
"~/Downloads/slf4j-api-1.7.36.jar"
)
con <- DBI::dbConnect(
JDBC("org.sqlite.JDBC", class_path),
url = "jdbc:sqlite::memory:"
)
dbGetInfo(con)
#> list() Created on 2024-02-22 with reprex v2.1.0 |
Since the user generally knows, could they have the option to tell dbplyr what the database is when the connection is JDBC? I thought all it really affects is the SQL dialect used? |
@shearerpmm I think we could in principle do that, but the idea that the connection defines the database type is deeply intwined in the guts of dbplyr, so fixing that would be quite a lot of work. Yes, it only affects the SQL dialect used but given that dbplyr is all about converting R code to SQL, that's really important! |
Hi,
I am using “pool” with “RJDBC” and “dbplyr” packages to connect to Athena tables.
However I see the following warning.
This has been slightly tricky to pinpoint. Initially I thought it had something to do with RJDBC but it does not seem to be the case, as seen here.
I subsequently reported the issue to Pool, but it does not seem to be originating from there as well as I could replicate the issue using a DBI connection as well as Pool.
It was easier to generate a reprex using PostgreSQL Docker and JDBC, and the issue does not seem to be database specific. Please let me know if I am doing something incorrectly or need to report this elsewhere.
Steps to reproduce:
1. Create docker image to connect to using the official image:
docker run --network=host --name postgresdb_reprex -e POSTGRES_PASSWORD=mysecretpassword -e POSTGRES_USER=postgres -d postgres
2. Download JAR file from here
Alternatively you can use the attached Project which contains the script and the JAR file for Windows.
pool_issue.zip
3. Run reprex code
reprex.R contains the actual code. Running through
reprex::reprex(input = "reprex.R", venue = "gh", session_info = TRUE)
Reprex Content:
Created on 2023-09-04 with reprex v2.0.2
Session info
4. Stop and remove Docker image
Please let me know if you need any further details. Thanks!
The text was updated successfully, but these errors were encountered: