Skip to content

quackscience/duckdb-extension-clickhouse-sql

Repository files navigation

DuckDB ClickHouse SQL extension

The DuckDB chsql community extension implements popular ClickHouse SQL syntax macros and functions,
making it easier for users to transition between the two database systems ⭐ designed for Quackpipe


Installation

chsql is distributed as a DuckDB Community Extension and can be installed using SQL:

INSTALL chsql FROM community;
LOAD chsql;

If you previously installed the chsql extension, upgrade using the FORCE command

FORCE INSTALL chsql FROM community;
LOAD chsql;

Usage Examples

Once installed, the macro functions provided by the extension can be used just like built-in functions.

Here's a random example out of 100s using the IPv4StringToNum and IPv4NumToString functions:

--- Install and load chsql
D INSTALL chsql FROM community;
D LOAD chsql;

--- Use any of the 100+ ClickHouse Macros
D SELECT IPv4StringToNum('127.0.0.1'), IPv4NumToString(2130706433);
┌──────────────────────────────┬─────────────────────────────┐
│ ipv4stringtonum('127.0.0.1') │ ipv4numtostring(2130706433) │
│            int32             │           varchar           │
├──────────────────────────────┼─────────────────────────────┤
│                   2130706433127.0.0.1                   │
└──────────────────────────────┴─────────────────────────────┘

Remote Queries

The built-in ch_scan function can be used to query remote ClickHouse servers using the HTTP/s API

--- Set optional X-Header Authentication
D CREATE SECRET extra_http_headers (
      TYPE HTTP,
      EXTRA_HTTP_HEADERS MAP{
          'X-ClickHouse-User': 'user',
          'X-ClickHouse-Key': 'password'
      }
  );
--- Query using the HTTP API
D SELECT * FROM ch_scan("SELECT number * 2 FROM numbers(10)", "https://play.clickhouse.com");

Supported Functions

👉 The list of supported aliases is available on the dedicated extension page
👉 The combined list of supported functions can be obtained using an SQL Join


Motivation

Why is the DuckDB + chsql combo fun and useful

✔ DuckDB SQL is awesome and full of great functions.
✔ ClickHouse SQL is awesome and full of great functions.

✔ The DuckDB library is ~51M and modular. Can LOAD extensions.
❌ The ClickHouse monolith is ~551M and growing. No extensions.

✔ DuckDB is open source and protected by a no-profit foundation.
❌ ClickHouse is open core and controlled by for-profit corporation.

✔ DuckDB embedded is fast, mature and elegantly integrated in many languages.
❌ chdb is still experimental, unstable and currently only supports Python.



Functions

function fun_type description comment example
IPv4NumToString macro Cast IPv4 address from numeric to string format SELECT IPv4NumToString(2130706433);
IPv4StringToNum macro Cast IPv4 address from string to numeric format SELECT IPv4StringToNum('127.0.0.1');
arrayExists macro Check if any element of the array satisfies the condition SELECT arrayExists(x -> x = 1, [1, 2, 3]);
arrayJoin macro Unroll an array into multiple rows SELECT arrayJoin([1, 2, 3]);
arrayMap macro Applies a function to each element of an array SELECT arrayMap(x -> x + 1, [1, 2, 3]);
bitCount macro Counts the number of set bits in an integer SELECT bitCount(15);
ch_scan table_macro Query a remote ClickHouse server using HTTP/s API Returns the query results SELECT * FROM ch_scan('SELECT version()','https://play.clickhouse.com', format := 'parquet');
domain macro Extracts the domain from a URL SELECT domain('https://clickhouse.com/docs');
empty macro Check if a string is empty SELECT empty('');
extractAllGroups macro Extracts all matching groups from a string using a regular expression SELECT extractAllGroups('(\d+)', 'abc123');
formatDateTime macro Formats a DateTime value into a string SELECT formatDateTime(now(), '%Y-%m-%d');
generateUUIDv4 macro Generate a UUID v4 value SELECT generateUUIDv4();
ifNull macro Returns the first argument if not NULL, otherwise the second SELECT ifNull(NULL, 'default');
intDiv macro Performs integer division SELECT intDiv(10, 3);
intDivOZero macro Performs integer division but returns zero instead of throwing an error for division by zero SELECT intDivOZero(10, 0);
intDivOrNull macro Performs integer division but returns NULL instead of throwing an error for division by zero SELECT intDivOrNull(10, 0);
leftPad macro Pads a string on the left to a specified length SELECT leftPad('abc', 5, '*');
lengthUTF8 macro Returns the length of a string in UTF-8 characters SELECT lengthUTF8('Привет');
match macro Performs a regular expression match on a string SELECT match('abc123', '\d+');
minus macro Performs subtraction of two numbers SELECT minus(5, 3);
modulo macro Calculates the remainder of division (modulus) SELECT modulo(10, 3);
moduloOrZero macro Calculates modulus but returns zero instead of error on division by zero SELECT moduloOrZero(10, 0);
notEmpty macro Check if a string is not empty SELECT notEmpty('abc');
numbers table_macro Generates a sequence of numbers starting from 0 Returns a table with a single column (UInt64) SELECT * FROM numbers(10);
parseURL macro Extracts parts of a URL SELECT parseURL('https://clickhouse.com', 'host');
path macro Extracts the path from a URL SELECT path('https://clickhouse.com/docs');
plus macro Performs addition of two numbers SELECT plus(5, 3);
protocol macro Extracts the protocol from a URL SELECT protocol('https://clickhouse.com');
read_parquet_mergetree function Merge parquet files using a primary sorting key for fast range queries experimental COPY (SELECT * FROM read_parquet_mergetree(['/folder/*.parquet'], 'sortkey') TO 'sorted.parquet';
rightPad macro Pads a string on the right to a specified length SELECT rightPad('abc', 5, '*');
splitByChar macro Splits a string by a given character SELECT splitByChar(',', 'a,b,c');
toDayOfMonth macro Extracts the day of the month from a date SELECT toDayOfMonth('2023-09-10');
toFixedString macro Converts a value to a fixed-length string SELECT toFixedString('abc', 5);
toFloat macro Converts a value to a float SELECT toFloat('123.45');
toFloatOrNull macro Converts a value to float or returns NULL if the conversion fails SELECT toFloatOrNull('abc');
toFloatOrZero macro Converts a value to float or returns zero if the conversion fails SELECT toFloatOrZero('abc');
toHour macro Extracts the hour from a DateTime value SELECT toHour(now());
toInt128 macro Converts a value to a 128-bit integer SELECT toInt128('123456789012345678901234567890');
toInt128OrNull macro Converts to a 128-bit integer or returns NULL on failure SELECT toInt128OrNull('abc');
toInt128OrZero macro Converts to a 128-bit integer or returns zero on failure SELECT toInt128OrZero('abc');
toInt16 macro Converts a value to a 16-bit integer SELECT toInt16('123');
toInt16OrNull macro Converts to a 16-bit integer or returns NULL on failure SELECT toInt16OrNull('abc');
toInt16OrZero macro Converts to a 16-bit integer or returns zero on failure SELECT toInt16OrZero('abc');
toInt256 macro Converts a value to a 256-bit integer SELECT toInt256('12345678901234567890123456789012345678901234567890123456789012345678901234567890');
toInt256OrNull macro Converts to a 256-bit integer or returns NULL on failure SELECT toInt256OrNull('abc');
toInt256OrZero macro Converts to a 256-bit integer or returns zero on failure SELECT toInt256OrZero('abc');
toInt32 macro Converts a value to a 32-bit integer SELECT toInt32('123');
toInt32OrNull macro Converts to a 32-bit integer or returns NULL on failure SELECT toInt32OrNull('abc');
toInt32OrZero macro Converts to a 32-bit integer or returns zero on failure SELECT toInt32OrZero('abc');
toInt64 macro Converts a value to a 64-bit integer SELECT toInt64('123');
toInt64OrNull macro Converts to a 64-bit integer or returns NULL on failure SELECT toInt64OrNull('abc');
toInt64OrZero macro Converts to a 64-bit integer or returns zero on failure SELECT toInt64OrZero('abc');
toInt8 macro Converts a value to an 8-bit integer SELECT toInt8('123');
toInt8OrNull macro Converts to an 8-bit integer or returns NULL on failure SELECT toInt8OrNull('abc');
toInt8OrZero macro Converts to an 8-bit integer or returns zero on failure SELECT toInt8OrZero('abc');
toMinute macro Extracts the minute from a DateTime value SELECT toMinute(now());
toMonth macro Extracts the month from a Date value SELECT toMonth('2023-09-10');
toSecond macro Extracts the second from a DateTime value SELECT toSecond(now());
toString macro Converts a value to a string SELECT toString(123);
toUInt16 macro Converts a value to an unsigned 16-bit integer SELECT toUInt16('123');
toUInt16OrNull macro Converts to an unsigned 16-bit integer or returns NULL on failure SELECT toUInt16OrNull('abc');
toUInt16OrZero macro Converts to an unsigned 16-bit integer or returns zero on failure SELECT toUInt16OrZero('abc');
toUInt32 macro Converts a value to an unsigned 32-bit integer SELECT toUInt32('123');
toUInt32OrNull macro Converts to an unsigned 32-bit integer or returns NULL on failure SELECT toUInt32OrNull('abc');
toUInt32OrZero macro Converts to an unsigned 32-bit integer or returns zero on failure SELECT toUInt32OrZero('abc');
toUInt64 macro Converts a value to an unsigned 64-bit integer SELECT toUInt64('123');
toUInt64OrNull macro Converts to an unsigned 64-bit integer or returns NULL on failure SELECT toUInt64OrNull('abc');
toUInt64OrZero macro Converts to an unsigned 64-bit integer or returns zero on failure SELECT toUInt64OrZero('abc');
toUInt8 macro Converts a value to an unsigned 8-bit integer SELECT toUInt8('123');
toUInt8OrNull macro Converts to an unsigned 8-bit integer or returns NULL on failure SELECT toUInt8OrNull('abc');
toUInt8OrZero macro Converts to an unsigned 8-bit integer or returns zero on failure SELECT toUInt8OrZero('abc');
toYYYYMM macro Formats a Date to 'YYYYMM' string format SELECT toYYYYMM('2023-09-10');
toYYYYMMDD macro Formats a Date to 'YYYYMMDD' string format SELECT toYYYYMMDD('2023-09-10');
toYYYYMMDDhhmmss macro Formats a DateTime to 'YYYYMMDDhhmmss' string format SELECT toYYYYMMDDhhmmss(now());
toYear macro Extracts the year from a Date or DateTime value SELECT toYear('2023-09-10');
topLevelDomain macro Extracts the top-level domain (TLD) from a URL SELECT topLevelDomain('https://example.com');
tupleConcat macro Concatenates two tuples into one tuple SELECT tupleConcat((1, 'a'), (2, 'b'));
tupleDivide macro Performs element-wise division between two tuples SELECT tupleDivide((10, 20), (2, 5));
tupleDivideByNumber macro Divides each element of a tuple by a number SELECT tupleDivideByNumber((10, 20), 2);
tupleIntDiv macro Performs element-wise integer division between two tuples SELECT tupleIntDiv((10, 20), (3, 4));
tupleIntDivByNumber macro Performs integer division of each element of a tuple by a number SELECT tupleIntDivByNumber((10, 20), 3);
tupleMinus macro Performs element-wise subtraction between two tuples SELECT tupleMinus((10, 20), (5, 3));
tupleModulo macro Performs element-wise modulus between two tuples SELECT tupleModulo((10, 20), (3, 6));
tupleModuloByNumber macro Calculates the modulus of each element of a tuple by a number SELECT tupleModuloByNumber((10, 20), 3);
tupleMultiply macro Performs element-wise multiplication between two tuples SELECT tupleMultiply((10, 20), (2, 5));
tupleMultiplyByNumber macro Multiplies each element of a tuple by a number SELECT tupleMultiplyByNumber((10, 20), 3);
tuplePlus macro Performs element-wise addition between two tuples SELECT tuplePlus((1, 2), (3, 4));
url table_macro Performs queries against remote URLs using the specified format Supports JSON, CSV, PARQUET, TEXT, BLOB SELECT * FROM url('https://urleng.com/test','JSON');
JSONExtract macro Extracts JSON data based on key from a JSON object SELECT JSONExtract(json_column, 'user.name');
JSONExtractString macro Extracts JSON data as a VARCHAR from a JSON object SELECT JSONExtractString(json_column, 'user.email');
JSONExtractUInt macro Extracts JSON data as an unsigned integer from a JSON object SELECT JSONExtractUInt(json_column, 'user.age');
JSONExtractInt macro Extracts JSON data as a 32-bit integer from a JSON object SELECT JSONExtractInt(json_column, 'user.balance');
JSONExtractFloat macro Extracts JSON data as a double from a JSON object SELECT JSONExtractFloat(json_column, 'user.score');
JSONExtractRaw macro Extracts raw JSON data based on key from a JSON object SELECT JSONExtractRaw(json_column, 'user.address');
JSONHas macro Checks if a JSON key exists and is not null SELECT JSONHas(json_column, 'user.active');
JSONLength macro Returns the length of a JSON array SELECT JSONLength(json_column, 'items');
JSONType macro Determines the type of JSON element at the given path SELECT JSONType(json_column, 'user.data');
JSONExtractKeys macro Extracts keys from a JSON object SELECT JSONExtractKeys(json_column);
JSONExtractValues macro Extracts all values as text from a JSON object SELECT JSONExtractValues(json_column);
equals macro Checks if two values are equal SELECT equals(column_a, column_b);
notEquals macro Checks if two values are not equal SELECT notEquals(column_a, column_b);
less macro Checks if one value is less than another SELECT less(column_a, column_b);
greater macro Checks if one value is greater than another SELECT greater(column_a, column_b);
lessOrEquals macro Checks if one value is less than or equal to another SELECT lessOrEquals(column_a, column_b);
greaterOrEquals macro Checks if one value is greater than or equal to another SELECT greaterOrEquals(column_a, column_b);
dictGet macro Retrieves an attribute from a VARIABLE string or MAP SELECT dictGet('dictionary_name', 'attribute');
Disclaimer

DuckDB ® is a trademark of DuckDB Foundation. ClickHouse® is a trademark of ClickHouse Inc. All trademarks, service marks, and logos mentioned or depicted are the property of their respective owners. The use of any third-party trademarks, brand names, product names, and company names is purely informative or intended as parody and does not imply endorsement, affiliation, or association with the respective owners.