Getting a 404 with using a connection string to azure container #72

Closed
DrewScatterday opened this issue Aug 17, 2024 · 2 comments

Comments

@DrewScatterday

I think this is potentially related to #39 and #21

Delta scan on a delta table locally with node js duckdb:

I downloaded my small test delta table from my azure blob container locally

const { Database } = require("duckdb-async");

async function main() {
	const db = await Database.create(":memory:");
	await db.run(`force install delta from core_nightly;LOAD delta;
				INSTALL azure;LOAD azure;`);
	const rows = await db.all(`
		CREATE TABLE testdata AS 
		SELECT *
		FROM delta_scan('./hundredThousandPoints')`);
	console.log(rows);
	console.log("🦆 DuckDB initialized 🦆");
}
main();

Code runs and outputs as expected:

[ { Count: 100000n } ]
🦆 DuckDB initialized 🦆

Delta scan on delta table in azure:

My delta table is stored in a Blob Container (ADLS Gen2). I am using a connection string for my storage container. Something that looks like this in a .env file:

DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=122342314fsdfsdf;EndpointSuffix=core.windows.net

const { Database } = require("duckdb-async");
require('dotenv').config();

async function main() {
	const db = await Database.create(":memory:");
	await db.run(`force install delta from core_nightly;LOAD delta;
				INSTALL azure;LOAD azure;
				CREATE SECRET secret1 (TYPE AZURE, CONNECTION_STRING 'abfss://${process.env.AZURE_CONN_STR}');`);
	const rows = await db.all(`
		CREATE TABLE testdata AS 
		SELECT *
		FROM delta_scan('abfss://myaccount/testdata/us_points/hundredThousandPoints')`);
	console.log(rows);
	console.log("🦆 DuckDB initialized 🦆");
}
main();

I then get an error:

[Error: IO Error: Hit DeltaKernel FFI error (from: While trying to read from delta table: 
'abfss://myaccount/testdata/us_points/hundredThousandPoints/'): 
Hit error: 8 (ObjectStoreError) with message (Error interacting with object store: 
Generic MicrosoftAzure error: Error performing list request: 
Client error with status 404 Not Found: <?xml version="1.0" encoding="utf-8"?><Error><Code>ContainerNotFound</Code><Message>The specified container does not exist.

I downloaded my data locally from azure so I know it exists. Are connection strings supported or should I use a service principal credential? Or is there something silly I am missing?

@DrewScatterday
Author

As a sanity test, I copied the delta table to a different blob storage container, one that I created and have more control over (not one that my team uses). With that container I was able to use a connection string and read the delta table with no issues.

I could be missing something, but it may be worth adding a connection string test case to the test/sql/cloud/azure tests. Thank you for your hard work on this extension!!

@DrewScatterday
Author

FYI, I figured it out. The issue was the path:
abfss://myaccount/testdata/us_points/hundredThousandPoints needed to be changed to
abfss://testdata/us_points/hundredThousandPoints, with the account name removed, since the connection string already includes the account name.
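
For anyone hitting the same 404, here is a minimal sketch of the fix described above. The ContainerNotFound error is consistent with the first path segment (`myaccount`) being treated as the container name. The helper names `parseConnectionString` and `stripAccountFromPath` are hypothetical and are not part of duckdb-async or the delta/azure extensions; the sketch only assumes the connection string has the `Key=Value;...` shape shown earlier in this issue.

```javascript
// Hypothetical helper: split "Key=Value;Key=Value" into an object.
// Splits on the FIRST "=" only, since account keys are base64 and may end in "=".
function parseConnectionString(connStr) {
  const fields = {};
  for (const part of connStr.split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue; // skip empty or malformed segments
    fields[part.slice(0, idx)] = part.slice(idx + 1);
  }
  return fields;
}

// Hypothetical helper: drop a leading "<AccountName>/" segment from an abfss path,
// because the connection string in the secret already carries the account name.
// Caveat: this would also strip a container that happens to share the account's name.
function stripAccountFromPath(path, connStr) {
  const prefix = "abfss://";
  const { AccountName } = parseConnectionString(connStr);
  if (!AccountName || !path.startsWith(prefix)) return path;
  const rest = path.slice(prefix.length);
  const accountSegment = AccountName + "/";
  return rest.startsWith(accountSegment)
    ? prefix + rest.slice(accountSegment.length)
    : path;
}

const connStr =
  "DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=122342314fsdfsdf;EndpointSuffix=core.windows.net";

console.log(
  stripAccountFromPath(
    "abfss://myaccount/testdata/us_points/hundredThousandPoints",
    connStr
  )
);
// prints "abfss://testdata/us_points/hundredThousandPoints"
```

The resulting path is what goes into `delta_scan(...)` in the second script; the rest of that script stays as posted.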
