Add chsql_native 0.0.1 #240

Merged 1 commit into duckdb:main on Dec 22, 2024

lmangani (Contributor) commented Dec 22, 2024

Description

chsql_native is an extension for chsql adding optional functions for reading ClickHouse Native files.

This is an independent clean-room reader for the binary format, written without any ClickHouse code or libraries.

Status

  • Experimental Release
  • Potentially Unstable 🔥
  • Needs Testers & Contributors

Input

Generate some Native files with clickhouse-local or clickhouse-server:

--- simple w/ one row, two columns
SELECT version(), number FROM numbers(1) INTO OUTFILE '/tmp/numbers.clickhouse' FORMAT Native;
--- simple w/ one column, 100000 rows
SELECT number FROM numbers(100000) INTO OUTFILE '/tmp/100000.clickhouse' FORMAT Native;
--- complex w/ multiple types
SELECT * FROM system.functions INTO OUTFILE '/tmp/functions.clickhouse' FORMAT Native;

Usage

Read ClickHouse Native files with DuckDB. Reads are full-scans at this time.

D SELECT * FROM clickhouse_native('/tmp/numbers.clickhouse');
┌──────────────┬─────────┐
│  version()   │ number  │
│   varchar    │  int32  │
├──────────────┼─────────┤
│ 24.12.1.1273 │       0 │
└──────────────┴─────────┘
D SELECT count(*), max(number) FROM clickhouse_native('/tmp/100000.clickhouse');
┌──────────────┬─────────────┐
│ count_star() │ max(number) │
│    int64     │    int32    │
├──────────────┼─────────────┤
│       100000 │       99999 │
└──────────────┴─────────────┘
D SELECT * FROM clickhouse_native('/tmp/functions.clickhouse') WHERE alias_to != '' LIMIT 10;
┌────────────────────┬──────────────┬──────────────────┬──────────────────────┬──────────────┬─────────┬───┬─────────┬───────────┬────────────────┬──────────┬────────────┐
│        name        │ is_aggregate │ case_insensitive │       alias_to       │ create_query │ origin  │ … │ syntax  │ arguments │ returned_value │ examples │ categories │
│      varchar       │    int32     │      int32       │       varchar        │   varchar    │ varchar │   │ varchar │  varchar  │    varchar     │ varchar  │  varchar   │
├────────────────────┼──────────────┼──────────────────┼──────────────────────┼──────────────┼─────────┼───┼─────────┼───────────┼────────────────┼──────────┼────────────┤
│ connection_id      │            0 │                1 │ connectionID         │              │ System  │ … │         │           │                │          │            │
│ rand32             │            0 │                0 │ rand                 │              │ System  │ … │         │           │                │          │            │
│ INET6_ATON         │            0 │                1 │ IPv6StringToNum      │              │ System  │ … │         │           │                │          │            │
│ INET_ATON          │            0 │                1 │ IPv4StringToNum      │              │ System  │ … │         │           │                │          │            │
│ truncate           │            0 │                1 │ trunc                │              │ System  │ … │         │           │                │          │            │
│ ceiling            │            0 │                1 │ ceil                 │              │ System  │ … │         │           │                │          │            │
│ replace            │            0 │                1 │ replaceAll           │              │ System  │ … │         │           │                │          │            │
│ from_utc_timestamp │            0 │                1 │ fromUTCTimestamp     │              │ System  │ … │         │           │                │          │            │
│ mapFromString      │            0 │                0 │ extractKeyValuePairs │              │ System  │ … │         │           │                │          │            │
│ str_to_map         │            0 │                1 │ extractKeyValuePairs │              │ System  │ … │         │           │                │          │            │
├────────────────────┴──────────────┴──────────────────┴──────────────────────┴──────────────┴─────────┴───┴─────────┴───────────┴────────────────┴──────────┴────────────┤
│ 10 rows                                                                                                                                           12 columns (11 shown) │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
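
Because reads are full scans at this time, one option for repeated queries is to materialize a Native file into a regular DuckDB table once and query that instead. The following is a minimal sketch, not part of the PR description: it assumes the extension can be installed from the DuckDB community repository under the name chsql_native, and `functions_local` is a hypothetical table name chosen for illustration.

```sql
-- Assumption: chsql_native is available as a DuckDB community extension.
INSTALL chsql_native FROM community;
LOAD chsql_native;

-- Scan the Native file once and keep the rows in a DuckDB table.
-- functions_local is a hypothetical name used only in this sketch.
CREATE TABLE functions_local AS
SELECT * FROM clickhouse_native('/tmp/functions.clickhouse');

-- Later queries read the DuckDB table instead of re-scanning the file.
SELECT origin, count(*) AS n
FROM functions_local
GROUP BY origin
ORDER BY n DESC;
```

The trade-off is extra storage in exchange for avoiding a full file scan on every query; for one-off inspection, querying clickhouse_native() directly as shown above is simpler.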

chsql_native is an extension for chsql adding ClickHouse native file reading capabilities
lmangani marked this pull request as ready for review on December 22, 2024, 19:14
lmangani (Contributor, Author) commented Dec 22, 2024

🎄happy holidays team DuckDB Labs 🎄

carlopi (Collaborator) commented Dec 22, 2024

If sqlite_scanner reads sqlite files, this is a... chsql_scanner!

Looks good to me, thanks!

carlopi merged commit cf07643 into duckdb:main on Dec 22, 2024
24 checks passed
lmangani (Contributor, Author) commented

> If sqlite_scanner reads sqlite files, this is a... chsql_scanner!
>
> Looks good to me, thanks!

You're so right.... might rename this in the future so we can also test removing an extension LOL
