This is a Trino connector for reading files (e.g. CSV, TSV) from local or remote storage. Keep in mind that it is not production ready; it was created for testing.
Files can be read over the following protocols:
- hdfs
- s3a
- file
- http
- https
Specify the file type through the schema name, and give the file location as an absolute path (a full URI) in the double-quoted table name.
SELECT * FROM
storage.csv."file:///tmp/numbers-2.csv";
SELECT * FROM
storage.csv."https://raw.githubusercontent.com/snowlift/trino-storage/master/src/test/resources/example-data/numbers-2.csv";
The supported schemas are:
tsv
csv
txt
raw
excel
orc
json
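The naming rule described above (the schema selects the file type, the quoted table name is the absolute location) can be sketched as a small helper. This is only an illustration: `build_query` and its checks are hypothetical and not part of the connector, and the catalog name `storage` is taken from the examples in this README.

```python
from urllib.parse import urlsplit

# Values copied from the lists in this README.
SUPPORTED_SCHEMES = {"hdfs", "s3a", "file", "http", "https"}
SUPPORTED_SCHEMAS = {"tsv", "csv", "txt", "raw", "excel", "orc", "json"}


def build_query(schema: str, location: str, catalog: str = "storage") -> str:
    """Return a SELECT statement that reads `location` through `schema`.

    Hypothetical helper: the connector itself does not ship this function.
    """
    if schema not in SUPPORTED_SCHEMAS:
        raise ValueError(f"unsupported schema: {schema!r}")
    parts = urlsplit(location)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported URI scheme: {parts.scheme!r}")
    if not parts.path.startswith("/"):
        raise ValueError("the path must be absolute")
    # A double quote inside a quoted SQL identifier is escaped by doubling it.
    table = location.replace('"', '""')
    return f'SELECT * FROM {catalog}.{schema}."{table}"'


print(build_query("csv", "file:///tmp/numbers-2.csv"))
```

Validating the scheme and path before sending the query is a design choice of this sketch; how the connector itself reports such errors is not described here.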
The tsv plugin splits each line on the \t (tab) delimiter. Currently, the first line is used as the column names.
SELECT * FROM
storage.tsv."https://raw.githubusercontent.com/snowlift/trino-storage/master/src/test/resources/example-data/numbers.tsv";
one | 1
-------+---
two | 2
three | 3
(2 rows)
The csv plugin splits each line on the , (comma) delimiter. Currently, the first line is used as the column names.
SELECT * FROM
storage.csv."https://raw.githubusercontent.com/snowlift/trino-storage/master/src/test/resources/example-data/numbers-2.csv";
ten | 10
--------+----
eleven | 11
twelve | 12
(2 rows)
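The tsv and csv behaviour described above (split each line on a delimiter, treat the first line as column names) can be illustrated with a minimal sketch. It is not the connector's implementation: `parse_delimited` is a hypothetical name, the sketch ignores quoting and escaping, and the sample contents are assumed from the query output shown in this README.

```python
def parse_delimited(text: str, delimiter: str) -> tuple[list[str], list[list[str]]]:
    """Split text into column names (first line) and data rows (remaining lines)."""
    lines = text.splitlines()
    if not lines:
        return [], []
    columns = lines[0].split(delimiter)
    rows = [line.split(delimiter) for line in lines[1:]]
    return columns, rows


# Assumed contents of the sample files, inferred from the query output above.
tsv_sample = "one\t1\ntwo\t2\nthree\t3\n"
csv_sample = "ten,10\neleven,11\ntwelve,12\n"

columns, rows = parse_delimited(tsv_sample, "\t")
print(columns)  # ['one', '1']
print(rows)     # [['two', '2'], ['three', '3']]
```

Because the first line becomes the header, a three-line file yields two rows, which matches the `(2 rows)` shown in the examples.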
The txt plugin does not split lines into columns. Currently, the column name is always value.
SELECT * FROM
storage.txt."https://raw.githubusercontent.com/snowlift/trino-storage/master/src/test/resources/example-data/numbers.tsv";
value
--------
one 1
two 2
three 3
(3 rows)
The raw plugin does not split lines into columns. Currently, the column name is always data. It is similar to the txt plugin; the main difference is that the txt plugin may return multiple rows, whereas the raw plugin always returns exactly one row.
SELECT * FROM
storage.raw."https://raw.githubusercontent.com/snowlift/trino-storage/master/src/test/resources/example-data/numbers.tsv";
data
--------
one 1
two 2
three 3
(1 row)
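The difference between txt and raw can be shown with a short sketch, assuming the row semantics described above: txt yields one row per line under a `value` column, raw yields a single row holding the whole file under a `data` column. The function names are hypothetical and the sample content is assumed from the query output.

```python
def read_txt(text: str) -> list[dict[str, str]]:
    """One row per line, in a single column named 'value'."""
    return [{"value": line} for line in text.splitlines()]


def read_raw(text: str) -> list[dict[str, str]]:
    """Exactly one row containing the whole content, in a column named 'data'."""
    return [{"data": text}]


# Assumed content of numbers.tsv, inferred from the query output above.
sample = "one\t1\ntwo\t2\nthree\t3\n"

print(len(read_txt(sample)))  # 3
print(len(read_raw(sample)))  # 1
```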
The excel plugin currently reads only the first sheet.
SELECT * FROM
storage.excel."https://raw.githubusercontent.com/snowlift/trino-storage/master/src/test/resources/example-data/sample.xlsx";
data
--------
one 1
two 2
three 3
(1 row)
Run all the unit test classes.
./mvnw test
Build without running tests
./mvnw clean install -DskipTests
Note: the tests include integration tests that run Minio and HDFS as Docker containers. These need to pull their images, which can take a while. If the tests appear to be stuck, pull the images before starting the tests so that you can see the download progress. The image names and versions are defined in the TestingMinioServer and TestingHadoopServer test classes.
To use the storage connector in your Trino cluster, unarchive trino-storage-{version}.zip from the target directory and copy the jar files it contains to the cluster.
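The deployment step can be scripted roughly as follows. This is a sketch under assumptions: `install_plugin` is a hypothetical helper, and the destination directory is whatever plugin directory your Trino installation uses (Trino loads each plugin from its own subdirectory of the plugin directory).

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def install_plugin(zip_path: str, plugin_dir: str) -> list[str]:
    """Extract the archive and copy every jar it contains into plugin_dir."""
    target = Path(plugin_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    with tempfile.TemporaryDirectory() as tmp:
        # Unpack into a scratch directory so only jar files reach plugin_dir.
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        for jar in sorted(Path(tmp).rglob("*.jar")):
            shutil.copy2(jar, target / jar.name)
            copied.append(jar.name)
    return copied
```

A call might look like `install_plugin("target/trino-storage-{version}.zip", "/path/to/trino/plugin/storage")`, where both paths are placeholders to adapt; Trino typically needs a restart to pick up a new plugin.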