feat: clickhouse-buffer #2672
Conversation
Thank you for following the naming conventions for pull request titles! 🙏
Actionable comments posted: 14
🧹 Outside diff range and nitpick comments (21)
apps/engineering/content/docs/architecture/meta.json (1)

`6-6`: **Consider consolidating ClickHouse documentation pages.**

The pages array includes both "clickhouse" and "clickhouse-proxy", which might lead to fragmented or duplicated information. Consider:

- Consolidating these into a single page if they cover related topics
- Using more distinct names if they cover different aspects (e.g., "clickhouse-overview" and "clickhouse-proxy")

Also, while the separator "---Services---" works, consider using a more JSON-native approach for section organization.

```diff
- "pages": ["index", "---Services---", "vault", "clickhouse", "clickhouse-proxy"]
+ "pages": ["index", "services/vault", "services/clickhouse"]
```

apps/engineering/app/layout.tsx (1)
`14-14`: **Consider accessibility implications**

If the Banner contained important announcements or notifications, ensure this information is still accessible to users through alternative means.
Consider:
- Moving important announcements to a different location in the UI
- Using ARIA live regions for dynamic notifications
- Implementing a different notification system if needed
apps/dashboard/lib/hooks/useDelayLoader.tsx (1)
`8-10`: **Consider a more robust type handling solution.**

Instead of suppressing the type error, consider one of these alternatives:

- Use a specific type assertion
- Configure TypeScript to prefer one runtime's types

```diff
- // @ts-expect-error bun types are colliding with node
- timeout = setTimeout(() => {
+ timeout = (setTimeout as typeof global.setTimeout)(() => {
```

apps/engineering/package.json (1)
`14-14`: **Consider using fixed version for consistency**

The new dependency `fumadocs-typescript` uses a caret (^) version while other fumadocs packages use fixed versions. Consider using a fixed version for consistency:

```diff
- "fumadocs-typescript": "^3.0.2",
+ "fumadocs-typescript": "3.0.2",
```

apps/chproxy/Dockerfile (1)
`12-13`: **Optimize Docker context with .dockerignore**

Copying the entire context with `COPY . .` could include unnecessary files like node_modules, .git, etc., which increases build time and context size.

Consider adding a `.dockerignore` file with common exclusions:

```diff
+# New file: .dockerignore
+node_modules
+.git
+.env
+*.log
+dist
+coverage
```

apps/api/src/pkg/analytics.ts (1)
`9-18`: **LGTM! Good architectural improvement for write/read separation.**

The separation of insert and query URLs allows for better load distribution and follows ClickHouse best practices.

Consider adding URL format validation:

```diff
 constructor(opts: {
   clickhouseUrl: string;
   clickhouseInsertUrl?: string;
 }) {
+  const validateUrl = (url: string) => {
+    try {
+      new URL(url);
+    } catch (e) {
+      throw new Error(`Invalid URL: ${url}`);
+    }
+  };
+  validateUrl(opts.clickhouseUrl);
+  if (opts.clickhouseInsertUrl) {
+    validateUrl(opts.clickhouseInsertUrl);
+  }
   if (opts.clickhouseInsertUrl) {
     this.clickhouse = new ClickHouse({
```

apps/api/src/pkg/env.ts (1)
`34-34`: **Document the relationship between ClickHouse URLs.**

Consider adding a comment explaining the purpose of having separate URLs for queries and inserts, and whether both are required for the buffer functionality to work correctly.

```diff
 CLICKHOUSE_URL: z.string(),
+// CLICKHOUSE_INSERT_URL allows configuring a separate endpoint for write operations,
+// useful when using a ClickHouse proxy to buffer writes
 CLICKHOUSE_INSERT_URL: z.string().optional(),
```

apps/www/components/animated-list.tsx (3)
Line range hint `34-38`: **Consider alternative approaches to handle type conflicts**

The `@ts-expect-error` comment suggests underlying type system conflicts between Bun and Node. Consider these alternatives:

- Use `window.setTimeout` to avoid runtime-specific typing
- Create a typed wrapper for setTimeout that works with both runtimes

```diff
-// @ts-expect-error bun types are colliding with node
-timeoutRef.current = setTimeout(() => {
+timeoutRef.current = window.setTimeout(() => {
   setIndex(-1);
 }, 1000);
```
Line range hint `49-53`: **Potential race condition in timeout management**

Multiple timeouts could be created and stored in `timeoutRef.current`, potentially leading to race conditions. Consider clearing existing timeouts before setting new ones.

```diff
+if (timeoutRef.current) {
+  clearTimeout(timeoutRef.current);
+}
-// @ts-expect-error bun types are colliding with node
 timeoutRef.current = setTimeout(() => {
   setIndex(0);
 }, 1000);
```
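The clear-before-set pattern can be factored into a small helper so at most one callback is ever pending per ref; the names here are illustrative, not from the component:

```typescript
// Illustrative helper: always cancel the previous timeout held in the ref
// before scheduling a new one, so stale callbacks can never fire.
export type TimeoutRef = { current: ReturnType<typeof setTimeout> | null };

export function scheduleExclusive(ref: TimeoutRef, fn: () => void, ms: number): void {
  if (ref.current !== null) {
    clearTimeout(ref.current);
  }
  ref.current = setTimeout(fn, ms);
}
```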
Line range hint `1-89`: **Consider optimizing the animation cycle implementation**

The current implementation uses a combination of `setInterval`, `setTimeout`, and state management that could be simplified. Consider these improvements:

- Use a single animation cycle with `useReducer` instead of multiple timeouts
- Implement a more declarative state machine for the animation states
- Use CSS transitions for smoother animations

Would you like me to provide a complete example of a simplified implementation?
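As a rough, framework-free illustration of the reducer/state-machine idea: the animation cycle becomes a pure function of (state, event), and the timing code only dispatches events. All names are hypothetical; the real component would wire this into `useReducer` plus a single interval.

```typescript
// Hypothetical reducer for the animation cycle: pure, so it is trivially testable
// and the timer code shrinks to "dispatch a tick".
type AnimState = { phase: "cycling" | "resetting"; index: number };
type AnimEvent =
  | { type: "tick"; total: number } // advance to the next item
  | { type: "restart" };            // begin a new cycle at index 0

function reduce(state: AnimState, event: AnimEvent): AnimState {
  switch (event.type) {
    case "tick": {
      const next = state.index + 1;
      return next >= event.total
        ? { phase: "resetting", index: state.index } // cycle finished, pause
        : { phase: "cycling", index: next };
    }
    case "restart":
      return { phase: "cycling", index: 0 };
  }
}
```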
deployment/docker-compose.yaml (2)
`141-141`: **Consider improving configuration management**

The API service configuration could be enhanced:

- The CLICKHOUSE_INSERT_URL contains hardcoded credentials. Consider using Docker secrets or environment files.
- The command array is getting quite long. Consider moving these configurations to a dedicated environment file for better maintainability.

Example refactor using an env file:

```diff
 api:
   build:
     context: ..
     dockerfile: ./apps/api/Dockerfile.dev
   ports:
     - 8787:8787
+  env_file:
+    - .env.api
-  command:
-    [
-      "pnpm",
-      "wrangler",
-      "dev",
-      "--ip=0.0.0.0",
-      "--var=VERSION:1",
-      # ... other vars
-    ]
+  command: ["pnpm", "wrangler", "dev", "--ip=0.0.0.0"]
```

Also applies to: 149-149
`71-82`: **Document architectural decisions for ClickHouse proxy layer**

The introduction of `chproxy` as a separate service for handling ClickHouse insert operations adds complexity to the architecture. While this separation allows for specific handling of insert operations (potentially for rate limiting), it should be documented:

- Consider adding architecture decision records (ADR) explaining:
  - Why inserts need to be proxied separately
  - The rate limiting requirements
  - The security implications
- Consider adding a diagram in the documentation showing the new data flow
Would you like me to help create a template for the architectural documentation?
Also applies to: 141-141, 149-149
apps/chproxy/tsconfig.json (2)
`42-42`: **Enable JSON module resolution and isolated modules.**

Since this service likely needs to handle configuration files and needs to be safely transpiled, enable these options:

```diff
- // "resolveJsonModule": true, /* Enable importing .json files. */
- // "isolatedModules": true, /* Ensure that each file can be safely transpiled without relying on other imports. */
+ "resolveJsonModule": true, /* Enable importing .json files. */
+ "isolatedModules": true, /* Ensure that each file can be safely transpiled without relying on other imports. */
```

Also applies to: 75-75
`52-53`: **Consider generating declaration files.**

If this proxy service might be used as a dependency by other packages in the monorepo, enable declaration file generation:

```diff
- // "declaration": true, /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
- // "declarationMap": true, /* Create sourcemaps for d.ts files. */
+ "declaration": true, /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
+ "declarationMap": true, /* Create sourcemaps for d.ts files. */
```

apps/api/src/routes/v1_ratelimits_limit.ts (1)
`328-343`: **Consider enhancing error context for better debugging**

The error handling implementation is good, using `waitUntil()` for non-blocking analytics and appropriate error logging. However, consider adding more context to the error log for easier debugging.

Consider this enhancement:

```diff
 logger.error("inserting ratelimit event failed", {
   error: err.message,
+  context: {
+    workspace_id: rootKey.authorizedWorkspaceId,
+    namespace_id: namespace.id,
+    request_id: c.get("requestId"),
+    identifier: req.identifier,
+  },
+  errorCode: err.code, // if available
 });
```

internal/clickhouse/src/index.ts (2)
`25-35`: **Consider improving `ClickHouseConfig` type definition for better type safety**

Currently, `ClickHouseConfig` uses optional properties and `never` types to enforce mutually exclusive configurations. Consider using a discriminated union with a discriminator property to enhance type safety and clarity.

Apply this diff to adjust the type definition:

```diff
 export type ClickHouseConfig =
-  | {
-      url?: string;
-      insertUrl?: never;
-      queryUrl?: never;
-    }
-  | {
-      url?: never;
-      insertUrl: string;
-      queryUrl: string;
-    };
+  | { mode: 'single'; url: string }
+  | { mode: 'split'; insertUrl: string; queryUrl: string };
```

This change introduces a `mode` discriminator, making configurations explicit and improving type checking.
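To illustrate why the discriminator helps, here is a hedged sketch of how consuming code could narrow the union; `resolveUrls` is invented for the example and is not part of the package:

```typescript
// With a `mode` discriminator, the compiler narrows which fields exist in each
// branch, instead of relying on optional/never tricks.
type ClickHouseConfig =
  | { mode: "single"; url: string }
  | { mode: "split"; insertUrl: string; queryUrl: string };

function resolveUrls(config: ClickHouseConfig): { insertUrl: string; queryUrl: string } {
  switch (config.mode) {
    case "single":
      // `config.url` is known to exist here; `insertUrl` would be a type error.
      return { insertUrl: config.url, queryUrl: config.url };
    case "split":
      return { insertUrl: config.insertUrl, queryUrl: config.queryUrl };
  }
}
```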
`62-62`: **Duplicate methods `logs` and `latest` in `verifications` getter**

Both `logs` and `latest` in the `verifications` getter reference `getLatestVerifications(this.querier)`. If they serve the same purpose, consider removing one to avoid redundancy. If they are intended to be different, ensure they call distinct functions.

Apply this diff if they are duplicates:

```diff
 return {
   insert: insertVerification(this.inserter),
-  logs: getLatestVerifications(this.querier),
   perHour: getVerificationsPerHour(this.querier),
   perDay: getVerificationsPerDay(this.querier),
   perMonth: getVerificationsPerMonth(this.querier),
-  latest: getLatestVerifications(this.querier),
+  latest: getLatestVerifications(this.querier),
 };
```

Or update `logs` to reference the correct function if needed.

Also applies to: 66-66
apps/chproxy/src/main.ts (3)
`81-83`: **Use case-insensitive comparison for Authorization header**

HTTP headers are case-insensitive. Comparing the `Authorization` header using strict equality may fail if the header's casing differs.

Adjust the code to perform a case-insensitive comparison:

```diff
-if (req.headers.get("Authorization") !== requiredAuthorization) {
+if (req.headers.get("authorization") !== requiredAuthorization) {
```

Alternatively, normalize the header name:

```ts
const authorizationHeader =
  req.headers.get("Authorization") || req.headers.get("authorization");
if (authorizationHeader !== requiredAuthorization) {
  return new Response("unauthorized", { status: 401 });
}
```
`96-96`: **Handle empty request bodies gracefully**

When splitting the request body by newlines, an empty body will result in an array with a single empty string, which may not be the intended behavior.

Add a check to handle empty request bodies:

```diff
-const rows = (await req.text()).split("\n");
+const bodyText = await req.text();
+if (!bodyText) {
+  return new Response("empty body", { status: 400 });
+}
+const rows = bodyText.split("\n");
```
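As a variation on that suggestion, the parsing could also drop blank lines (including a trailing newline) rather than only rejecting a fully empty body. A small hypothetical helper, not from the proxy itself:

```typescript
// Illustrative helper: split the body into rows while ignoring empty lines,
// so "" and "row\n" don't produce phantom empty rows in the batch.
export function parseRows(body: string): string[] {
  return body.split("\n").filter((line) => line.trim().length > 0);
}
```

A caller could then return a 400 when `parseRows` yields zero rows, covering both the empty-body and whitespace-only cases.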
`121-127`: **Ensure all pending operations complete before exiting**

Calling `process.exit(0);` immediately after `await flush(true);` may not guarantee that all asynchronous operations have completed, potentially leading to data loss.

Consider removing `process.exit(0);` and allowing the process to exit gracefully, or ensure all pending promises are resolved before exiting.

packages/api/src/openapi.d.ts (1)
`79-90`: **Ensure consistent naming conventions for operation IDs**

In the added endpoints, there's an inconsistency between the path names and the operation IDs regarding singular and plural forms of "ratelimit". For example, the path `"/v1/ratelimits.setOverride"` uses "ratelimits" (plural), while the operation ID `operations["ratelimit.setOverride"]` uses "ratelimit" (singular). For clarity and consistency, consider standardizing the naming to either singular or plural in both paths and operation IDs.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)

- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (27)

- apps/api/src/pkg/analytics.ts (1 hunks)
- apps/api/src/pkg/env.ts (1 hunks)
- apps/api/src/pkg/middleware/init.ts (1 hunks)
- apps/api/src/routes/v1_ratelimits_limit.ts (2 hunks)
- apps/chproxy/Dockerfile (1 hunks)
- apps/chproxy/package.json (1 hunks)
- apps/chproxy/src/main.ts (1 hunks)
- apps/chproxy/tsconfig.json (1 hunks)
- apps/dashboard/lib/hooks/useDelayLoader.tsx (1 hunks)
- apps/engineering/app/(home)/layout.tsx (1 hunks)
- apps/engineering/app/docs/layout.tsx (1 hunks)
- apps/engineering/app/layout.config.tsx (1 hunks)
- apps/engineering/app/layout.tsx (1 hunks)
- apps/engineering/content/docs/architecture/clickhouse-proxy.mdx (1 hunks)
- apps/engineering/content/docs/architecture/clickhouse.mdx (1 hunks)
- apps/engineering/content/docs/architecture/index.mdx (1 hunks)
- apps/engineering/content/docs/architecture/meta.json (1 hunks)
- apps/engineering/content/docs/meta.json (1 hunks)
- apps/engineering/content/docs/services/meta.json (0 hunks)
- apps/engineering/content/docs/test.mdx (0 hunks)
- apps/engineering/package.json (1 hunks)
- apps/engineering/tsconfig.json (1 hunks)
- apps/www/components/animated-list.tsx (2 hunks)
- apps/www/package.json (1 hunks)
- deployment/docker-compose.yaml (2 hunks)
- internal/clickhouse/src/index.ts (1 hunks)
- packages/api/src/openapi.d.ts (2 hunks)
💤 Files with no reviewable changes (2)
- apps/engineering/content/docs/services/meta.json
- apps/engineering/content/docs/test.mdx
✅ Files skipped from review due to trivial changes (5)
- apps/chproxy/package.json
- apps/engineering/app/(home)/layout.tsx
- apps/engineering/app/docs/layout.tsx
- apps/engineering/app/layout.config.tsx
- apps/engineering/content/docs/architecture/index.mdx
🧰 Additional context used
🪛 LanguageTool
apps/engineering/content/docs/architecture/clickhouse-proxy.mdx
[style] ~8-~8: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ew seconds or when the buffer is full. It's available at clickhouse.unkey.cloud
...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~16-~16: To make your text as clear as possible to all readers, do not use this foreign term. Possible alternatives are “below” or “further on” (in a document).
Context: ...ouse proxy is fully managed in unkeyed/infra. #...
(INFRA)
apps/engineering/content/docs/architecture/clickhouse.mdx
[grammar] ~8-~8: It seems that “to” is missing before the verb.
Context: ... perfect fit for us, for it's very high ingest capabilities and easy of querying throu...
(MISSING_TO_BETWEEN_BE_AND_VB)
[uncategorized] ~14-~14: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ickHouse inside our docker-compose setup and it will be autostarted and migrated for...
(COMMA_COMPOUND_SENTENCE)
[grammar] ~16-~16: Probable usage error. Use “and” after ‘both’.
Context: ...house) package contains both the schema as well as a typescript client for inserting and q...
(BOTH_AS_WELL_AS)
[uncategorized] ~30-~30: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ns` Raw verifications are ingested here and we run hourly, daily and monthly aggreg...
(COMMA_COMPOUND_SENTENCE)
[uncategorized] ~30-~30: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...rly, daily and monthly aggregations for user facing analytics. ### ratelimits
Raw ratel...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[uncategorized] ~34-~34: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...imits` Raw ratelimits are ingested here and we run hourly, daily and monthly aggreg...
(COMMA_COMPOUND_SENTENCE)
[uncategorized] ~34-~34: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...rly, daily and monthly aggregations for user facing analytics. ### billing
Aggregates d...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[style] ~51-~51: To make your text as clear as possible to all readers, do not use this foreign term. Possible alternatives are “below” or “further on” (in a document).
Context: ... clusters are fully managed in unkeyed/infra.
(INFRA)
🔇 Additional comments (17)
apps/engineering/content/docs/architecture/meta.json (1)
`2-5`: LGTM! Clear and well-structured metadata.

The title, description, icon, and root configuration provide clear context for the architecture documentation.
apps/engineering/app/layout.tsx (1)
`14-14`: Verify the Banner removal impact

The removal of the Banner component might affect user experience if it contained important announcements or notifications. Let's check if the Banner component is used elsewhere or if its functionality has been moved:
apps/dashboard/lib/hooks/useDelayLoader.tsx (1)
`8-8`: LGTM as a temporary fix.

The `@ts-expect-error` comment addresses the type collision between Bun and Node.js types for the `setTimeout` function.
apps/engineering/tsconfig.json (2)
`13-13`: Verify TypeScript version compatibility

The change to `"moduleResolution": "bundler"` requires TypeScript >= 5.0. This is a more modern and stricter resolution strategy that better aligns with bundler expectations.
✅ Verification successful
TypeScript version compatibility confirmed for moduleResolution: "bundler"
All TypeScript dependencies in the project are using version ^5.5.3 or higher, which fully supports the "bundler" moduleResolution option introduced in TypeScript 5.0.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check TypeScript version in package.json files
# Expect: Version should be >= 5.0
# Find all package.json files and check typescript version
fd package.json --type f --exec sh -c 'echo "=== $1 ==="; cat "$1" | jq -r ".dependencies.typescript, .devDependencies.typescript"' sh {}
Length of output: 2208
`13-13`: Verify import statement compatibility

The bundler moduleResolution strategy is stricter than node and may affect how imports are resolved. Please ensure all import statements in the codebase are compatible with this change.
✅ Verification successful
Based on the analysis of the engineering app's imports and dependencies, I can now provide a final response:
The moduleResolution change to "bundler" is safe to proceed with
The codebase follows Next.js and TypeScript module resolution best practices:
- All imports are using standard patterns supported by the bundler strategy
- No problematic implicit index imports or deep imports from node_modules found
- Dependencies are modern packages (fumadocs, next.js, react) that support bundler resolution
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Look for potential import issues that might be affected by the moduleResolution change
# Common patterns that might need attention:
# - Implicit index imports
# - Deep imports from node_modules
# - Imports without extensions
# Search for import statements
rg --type typescript "^import .* from ['\"](?![@\.]).+['\"]" --no-filename
# Search for require statements
rg --type typescript "require\(['\"](?![@\.]).+['\"]\)" --no-filename
Length of output: 662
Script:
#!/bin/bash
# Let's try a different approach to find import statements
# First, let's look for all TypeScript files
fd -e ts -e tsx | while read -r file; do
echo "=== $file ==="
# Show import statements from node_modules (non-relative)
rg "^import .* from ['\"](?!\.|@)" "$file" || true
done
# Also check for package.json to understand the project's dependencies
if [ -f "apps/engineering/package.json" ]; then
echo -e "\n=== Dependencies from apps/engineering/package.json ==="
cat "apps/engineering/package.json" | jq '.dependencies, .devDependencies'
fi
Length of output: 157209
Script:
#!/bin/bash
# Let's check for TypeScript import patterns that might be affected by the bundler moduleResolution
# Specifically looking for:
# 1. Imports without extensions
# 2. Index imports
# 3. Package.json imports field usage
# First, let's look at all TypeScript imports in the engineering app
echo "=== TypeScript imports in engineering app ==="
cd apps/engineering && find . -type f -name "*.ts" -o -name "*.tsx" | while read -r file; do
echo "=== $file ==="
grep -E "^import .* from .*" "$file" || true
done
# Check if there's a package.json "imports" field
echo -e "\n=== Checking for package.json imports field ==="
if [ -f "apps/engineering/package.json" ]; then
cat "apps/engineering/package.json" | jq '.imports'
fi
Length of output: 2968
apps/engineering/package.json (1)
`12-15`: Verify breaking changes in fumadocs major version upgrades
The updates include major version bumps for multiple fumadocs packages which might introduce breaking changes:
- fumadocs-core: 13.4.0 → 14.4.0
- fumadocs-mdx: 10.0.2 → 11.1.1
- fumadocs-ui: 13.4.10 → 14.4.0
apps/chproxy/Dockerfile (1)
`16-29`: Well-structured build stage implementation
The installer stage follows Docker best practices:
- Separates dependency installation from build
- Leverages build caching effectively
- Uses proper workspace filtering with turbo
apps/api/src/pkg/analytics.ts (1)
Line range hint `22-29`: Verify ClickHouse client version compatibility.

The change from `client.insert` to `inserter.insert` suggests a breaking change in the ClickHouse client API.
Also, consider adding JSDoc comments to document the schema fields:
```diff
 public get insertSdkTelemetry() {
   return this.clickhouse.inserter.insert({
     table: "telemetry.raw_sdks_v1",
     schema: z.object({
+      /** Unique identifier for the request */
       request_id: z.string(),
+      /** Unix timestamp in milliseconds */
       time: z.number().int(),
+      /** Runtime environment (e.g., 'node', 'deno') */
       runtime: z.string(),
+      /** Platform information */
       platform: z.string(),
+      /** Array of version strings */
       versions: z.array(z.string()),
     }),
   });
 }
```
apps/www/package.json (2)
`38-38`: Verify relevance of this dependency update to the PR
This change appears unrelated to the PR's main objective of implementing ClickHouse buffer functionality. Consider moving this dependency update to a separate PR focused on documentation updates.
`38-38`: Verify breaking changes in fumadocs-core major version update
The update from v13 to v14 is a major version bump which may include breaking changes.
Run the following to check for potential breaking changes:
apps/api/src/pkg/middleware/init.ts (2)
`101-101`: LGTM! Good separation of insert and query endpoints.

The addition of `clickhouseInsertUrl` follows the best practice of separating write and read operations in distributed systems, which can improve performance and scalability.
`101-101`: Verify environment variable configuration.

Since this change relies on the new `CLICKHOUSE_INSERT_URL` environment variable, let's verify its configuration across the codebase.
✅ Verification successful

Environment variable `CLICKHOUSE_INSERT_URL` is properly configured

The environment variable is well-configured across the codebase:

- Properly defined in the environment schema at `apps/api/src/pkg/env.ts` as an optional string
- Documented in the architecture docs at `apps/engineering/content/docs/architecture/clickhouse-proxy.mdx`
- Configured in Docker deployment at `deployment/docker-compose.yaml`
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the environment variable configuration and usage
# Check if the environment variable is properly defined in environment schemas
rg -A 2 "CLICKHOUSE_INSERT_URL"
# Check Docker/deployment configurations for this variable
fd -e yaml -e yml -e env | xargs rg "CLICKHOUSE_INSERT_URL"
Length of output: 1092
deployment/docker-compose.yaml (1)
71-82
: Consider enhancing security configuration for chproxy service
A few security and configuration considerations:
- Basic auth credentials are hardcoded in the environment variables. Consider using Docker secrets or environment files for sensitive data.
- The service exposes port 7123 to the host machine. If this proxy is only needed by internal services, consider removing the port mapping and use Docker's internal networking.
- The purpose and configuration of this proxy service should be documented.
Let's verify if this proxy is only used internally:
✅ Verification successful
Security configuration for chproxy service is appropriate
The current configuration is suitable for the intended use case:
- Port exposure (7123) is necessary as it's used internally by the API service for batch processing of ClickHouse inserts, as evidenced by the documentation and code.
- Basic auth credentials are appropriate for local development environment:
- In production, the service runs on AWS AppRunner with proper security controls
- The proxy is only accessible internally through Docker networking
- The service's purpose is well documented in
apps/engineering/content/docs/architecture/clickhouse-proxy.mdx
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any references to port 7123 or chproxy in the codebase
rg -g '!deployment/*' "7123|chproxy"
Length of output: 517
Script:
#!/bin/bash
# Let's check the documentation to understand the purpose of chproxy
cat apps/engineering/content/docs/architecture/clickhouse-proxy.mdx
# Also check the main.ts file to understand the internal usage
cat apps/chproxy/src/main.ts
# Let's also check if there are any external references to this proxy
rg "CLICKHOUSE_INSERT_URL" -A 2 -B 2
Length of output: 5543
apps/chproxy/tsconfig.json (1)
`14-14`: LGTM: Basic compilation settings are appropriate.
The ES2016 target and CommonJS module system are suitable choices for a Node.js service.
Also applies to: 28-28
apps/api/src/routes/v1_ratelimits_limit.ts (1)
`151-151`: LGTM: Good addition of logger service
Adding the logger service improves observability for error handling.
internal/clickhouse/src/index.ts (2)
`38-39`: LGTM!

The introduction of separate `querier` and `inserter` properties enhances clarity and allows for distinct configurations for querying and inserting.

`78-84`: LGTM!

The updates in the `ratelimits` getter correctly utilize `this.inserter` and `this.querier`, ensuring that insertions and queries are handled by the appropriate clients.
```diff
@@ -31,6 +31,7 @@ export const zEnv = z.object({
   AGENT_TOKEN: z.string(),

   CLICKHOUSE_URL: z.string(),
+  CLICKHOUSE_INSERT_URL: z.string().optional(),
```

🛠️ Refactor suggestion

**Add URL validation for ClickHouse configuration.**

Both `CLICKHOUSE_URL` and `CLICKHOUSE_INSERT_URL` should validate that they are proper URLs for consistency and to catch configuration errors early.

```diff
- CLICKHOUSE_URL: z.string(),
+ CLICKHOUSE_URL: z.string().url(),
- CLICKHOUSE_INSERT_URL: z.string().optional(),
+ CLICKHOUSE_INSERT_URL: z.string().url().optional(),
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
CLICKHOUSE_URL: z.string().url(),
CLICKHOUSE_INSERT_URL: z.string().url().optional(),
```
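For intuition, zod's `.url()` amounts to parsing the string eagerly with the WHATWG `URL` constructor so a bad value fails at startup rather than on the first insert. A dependency-free sketch of the same idea (the `parseEnv` shape is illustrative, not the actual `zEnv` schema):

```typescript
// Illustrative stand-in for z.string().url(): reject non-URL strings eagerly.
function requireUrl(name: string, value: string): string {
  try {
    new URL(value); // throws TypeError for strings without a valid scheme/host
    return value;
  } catch {
    throw new Error(`${name} is not a valid URL: ${value}`);
  }
}

export function parseEnv(raw: { CLICKHOUSE_URL: string; CLICKHOUSE_INSERT_URL?: string }) {
  return {
    CLICKHOUSE_URL: requireUrl("CLICKHOUSE_URL", raw.CLICKHOUSE_URL),
    CLICKHOUSE_INSERT_URL:
      raw.CLICKHOUSE_INSERT_URL === undefined
        ? undefined
        : requireUrl("CLICKHOUSE_INSERT_URL", raw.CLICKHOUSE_INSERT_URL),
  };
}
```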
apps/chproxy/src/main.ts (outdated diff excerpt)

```ts
const buffer = new Map<string, Batch>();

async function flush(force?: boolean): Promise<void> {
  const now = Date.now();

  for (const [key, batch] of buffer.entries()) {
    if (force || now >= batch.createdAt + MAX_BATCH_TIME) {
      await persist(key);
    }
  }
}

// persist inserts the data into clickhouse and removes it from the buffer
async function persist(key: string): Promise<void> {
  const batch = buffer.get(key);
  if (!batch) {
    return;
  }

  const url = new URL(env.CLICKHOUSE_URL);
  batch.params.forEach((v, k) => {
    url.searchParams.set(k, v);
  });

  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "text/plain",
      Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
    },
    body: batch.rows.join("\n"),
  });
  if (res.ok) {
    buffer.delete(key);
    console.info(`persisted ${batch.rows.length} rows`);
  } else {
    console.error("unable to persist", await res.text(), JSON.stringify(batch));
  }
}

setInterval(flush, 1000);

const server = Bun.serve({
  port: env.PORT,
  fetch: async (req: Request): Promise<Response> => {
    const url = new URL(req.url);

    if (url.pathname === "/v1/liveness") {
      return new Response("I'm alive");
    }

    if (req.headers.get("Authorization") !== requiredAuthorization) {
      return new Response("unauthorized", { status: 401 });
    }

    const query = url.searchParams.get("query");
    if (!query || !query.startsWith("INSERT INTO")) {
      return new Response("wrong query", { status: 400 });
    }

    const params = url.searchParams;
    params.delete("query_id");
    params.sort();

    const key = params.toString();

    const rows = (await req.text()).split("\n");

    const existing = buffer.get(key);
    if (existing) {
      const size = existing.rows.push(...rows);
      if (size >= MAX_BATCH_SIZE) {
        await persist(key);
      }
    } else {
      buffer.set(key, {
        createdAt: Date.now(),
        params,
        rows,
      });
    }
```
🛠️ Refactor suggestion
**Synchronize access to the buffer to prevent race conditions**

Since both the `fetch` handler and the `flush` function access and modify the `buffer`, concurrent access may lead to race conditions, especially under high load or with multiple CPU cores.

Consider using a mutex or lock mechanism to synchronize access to the `buffer`. Alternatively, use concurrent data structures or design patterns that handle asynchronous access safely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javascript runs in a single thread, this is not parallel
@chronark, thank you for the clarification. You're correct—since JavaScript runs in a single thread, race conditions due to concurrent access to `buffer` are not a concern here. I'll keep this in mind for future reviews.
✏️ Learnings added
Learnt from: chronark
PR: unkeyed/unkey#2672
File: apps/chproxy/src/main.ts:30-110
Timestamp: 2024-11-21T20:12:33.365Z
Learning: In `apps/chproxy/src/main.ts`, JavaScript runs in a single thread, so race conditions due to concurrent access to shared variables like `buffer` are not a concern.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
"application/json": components["schemas"]["ErrUnauthorized"]; | ||
}; | ||
}; | ||
/** @description The client does not have access rights to the content; that is, it is unauthorized, so the server is refusing to give the requested resource. Unlike 401 Unauthorized, the client's identity is known to the server. */ | ||
403: { | ||
content: { | ||
"application/json": components["schemas"]["ErrForbidden"]; | ||
}; | ||
}; | ||
/** @description The server cannot find the requested resource. In the browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client. This response code is probably the most well known due to its frequent occurrence on the web. */ | ||
404: { | ||
content: { | ||
"application/json": components["schemas"]["ErrNotFound"]; | ||
}; | ||
}; | ||
/** @description This response is sent when a request conflicts with the current state of the server. */ | ||
409: { | ||
content: { | ||
"application/json": components["schemas"]["ErrConflict"]; | ||
}; | ||
}; | ||
/** @description The user has sent too many requests in a given amount of time ("rate limiting") */ | ||
429: { | ||
content: { | ||
"application/json": components["schemas"]["ErrTooManyRequests"]; | ||
}; | ||
}; | ||
/** @description The server has encountered a situation it does not know how to handle. */ | ||
500: { | ||
content: { | ||
"application/json": components["schemas"]["ErrInternalServerError"]; | ||
}; | ||
}; | ||
}; | ||
}; | ||
getOverride: { | ||
parameters: { | ||
query: { | ||
namespaceId?: string; | ||
namespaceName?: string; | ||
identifier: string; | ||
}; | ||
}; | ||
responses: { | ||
/** @description Details of the override for the given identifier */ | ||
200: { | ||
content: { | ||
"application/json": { | ||
id: string; | ||
identifier: string; | ||
limit: number; | ||
duration: number; | ||
async?: boolean | null; | ||
}; | ||
}; | ||
}; | ||
/** @description The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing). */ | ||
400: { | ||
content: { | ||
"application/json": components["schemas"]["ErrBadRequest"]; | ||
}; | ||
}; | ||
/** @description Although the HTTP standard specifies "unauthorized", semantically this response means "unauthenticated". That is, the client must authenticate itself to get the requested response. */ | ||
401: { | ||
content: { | ||
"application/json": components["schemas"]["ErrUnauthorized"]; | ||
}; | ||
}; | ||
/** @description The client does not have access rights to the content; that is, it is unauthorized, so the server is refusing to give the requested resource. Unlike 401 Unauthorized, the client's identity is known to the server. */ | ||
403: { | ||
content: { | ||
"application/json": components["schemas"]["ErrForbidden"]; | ||
}; | ||
}; | ||
/** @description The server cannot find the requested resource. In the browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client. This response code is probably the most well known due to its frequent occurrence on the web. */ | ||
404: { | ||
content: { | ||
"application/json": components["schemas"]["ErrNotFound"]; | ||
}; | ||
}; | ||
/** @description This response is sent when a request conflicts with the current state of the server. */ | ||
409: { | ||
content: { | ||
"application/json": components["schemas"]["ErrConflict"]; | ||
}; | ||
}; | ||
/** @description The user has sent too many requests in a given amount of time ("rate limiting") */ | ||
429: { | ||
content: { | ||
"application/json": components["schemas"]["ErrTooManyRequests"]; | ||
}; | ||
}; | ||
/** @description The server has encountered a situation it does not know how to handle. */ | ||
500: { | ||
content: { | ||
"application/json": components["schemas"]["ErrInternalServerError"]; | ||
}; | ||
}; | ||
}; | ||
}; |
Enforce mutual exclusivity of 'namespaceId' and 'namespaceName' in type definitions
The descriptions indicate that either `namespaceId` or `namespaceName` must be provided, but not both. Currently, both are optional, allowing for the possibility that neither is provided or both are provided, which could lead to runtime errors.

Consider updating the type definitions to enforce that exactly one of `namespaceId` or `namespaceName` is provided by using a union type.
Apply this diff to enforce mutual exclusivity:
 requestBody: {
   content: {
     "application/json": {
-      namespaceId?: string;
-      namespaceName?: string;
+      XOR<{ namespaceId: string }, { namespaceName: string }>;
       identifier: string;
       // rest of the fields
     };
   };
 };
Similarly, update the query parameters for the `getOverride` operation:
 parameters: {
   query: {
-    namespaceId?: string;
-    namespaceName?: string;
+    XOR<{ namespaceId: string }, { namespaceName: string }>;
     identifier: string;
   };
 };
This change enforces that exactly one of the two properties is required, aligning with the intended usage.
Committable suggestion skipped: line range outside the PR's diff.
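For context, `XOR` is not a built-in TypeScript utility; the suggestion above presumes a helper type. A common formulation (a sketch — the type names and the runtime guard are hypothetical, not part of this PR):

```typescript
// Type-level XOR: each branch allows its own key and forbids the other's.
type Without<T, U> = { [K in Exclude<keyof T, keyof U>]?: never };
type XOR<T, U> = (T & Without<U, T>) | (U & Without<T, U>);

// Exactly one of namespaceId / namespaceName is accepted at compile time.
type NamespaceRef = XOR<{ namespaceId: string }, { namespaceName: string }>;
const byId: NamespaceRef = { namespaceId: "ns_123" };
const byName: NamespaceRef = { namespaceName: "email.outbound" };
// const both: NamespaceRef = { namespaceId: "x", namespaceName: "y" }; // type error

// Runtime counterpart for untyped input (e.g. parsed query params).
function hasExactlyOneNamespaceRef(o: {
  namespaceId?: string;
  namespaceName?: string;
}): boolean {
  return (o.namespaceId !== undefined) !== (o.namespaceName !== undefined);
}
console.log(hasExactlyOneNamespaceRef(byId), hasExactlyOneNamespaceRef(byName));
```

The type enforces the rule at compile time only; generated OpenAPI clients would still need the runtime check for untyped callers.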
Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (7)

apps/engineering/content/docs/api-design/meta.json (1)

`3-3`: Consider enhancing the description to be more informative.

The current description "Intuitive and helpful" is quite generic. Consider expanding it to better describe what developers will learn from this section, such as "Learn best practices for designing intuitive and consistent APIs" or "Guidelines for building developer-friendly REST and RPC APIs".

- "description": "Intuitive and helpful",
+ "description": "Learn best practices for designing intuitive and consistent APIs",

apps/engineering/app/(home)/page.tsx (1)

`3-11`: Consider improving layout structure and semantics

The current implementation has several areas for improvement:

- The negative margin (`-mt-16`) could cause layout issues
- `w-screen` might cause horizontal scrolling
- Multiple spaces in className strings
- Could use more semantic HTML structure

Consider applying these improvements:

- <div className="min-h-screen border text-center -mt-16 pt-16 flex items-center w-screen justify-center ">
-   <div>
-     <h1 className="text-7xl md:text-8xl font-bold leading-none uppercase tracking-tight">
+ <main className="min-h-screen border text-center pt-16 flex items-center w-full justify-center">
+   <section>
+     <h1 className="text-7xl md:text-8xl font-bold leading-none uppercase tracking-tight">
        BUILD BETTER
        <br />
        APIS FASTER
      </h1>
-     <p className="text-xl mt-8 font-light ">How we work</p>
-   </div>
- </div>
+     <h2 className="text-xl mt-8 font-light">How we work</h2>
+   </section>
+ </main>

Changes explained:

- Replaced outer `div` with semantic `main` tag
- Replaced inner `div` with semantic `section`
- Changed `w-screen` to `w-full` to prevent horizontal scrolling
- Removed negative margin that could cause layout issues
- Cleaned up multiple spaces in className strings
- Changed the subheading `p` to `h2` for better document outline

apps/engineering/content/docs/contributing/index.mdx (5)

`11-14`: Enhance the Clerk requirement note

The callout about making Clerk optional would be more helpful with additional context about the timeline or alternative authentication methods being considered.

<Callout>
- We're currently working on making this optional.
+ We're currently working on making Clerk optional. Alternative authentication methods will be supported in a future release.
</Callout>

`19-21`: Add version requirements for prerequisites

Consider specifying minimum version requirements for the tools to prevent compatibility issues.

-[Node](https://nodejs.org)
-[pnpm](https://pnpm.io/installation)
-[Docker](https://www.docker.com/products/docker-desktop)
+[Node](https://nodejs.org) (v18 or higher)
+[pnpm](https://pnpm.io/installation) (v8 or higher)
+[Docker](https://www.docker.com/products/docker-desktop) (required for local databases)

`53-55`: Enhance Docker requirement warning

The warning could be more helpful by specifying which services will be started.

<Callout type="warn">
- Docker must be running for us to spin up the databases and supporting services.
+ Docker must be running for us to spin up required services:
+ - PostgreSQL database
+ - Redis cache
+ - ClickHouse for analytics
</Callout>

`79-81`: Enhance command options documentation

The options documentation could be more detailed and include default values.

- - `--service=<service>`: Specifies which part of the application to develop. The values are `dashboard`, `api`, or `www`
- - `--skip-env`: Skips the environment setup prompt if specified.
+ - `--service=<service>`: Specifies which part of the application to develop:
+   - `dashboard`: Run the admin dashboard (default)
+   - `api`: Run the API server
+   - `www`: Run the public website
+ - `--skip-env`: Skip the environment setup prompt. Use this when your .env files are already configured.
+ - `--help`: Display all available options and their descriptions

🧰 Tools
🪛 LanguageTool
- [uncategorized] ~79: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)
- [uncategorized] ~81: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)

`87-87`: Add a troubleshooting section

Consider adding a troubleshooting section to help developers resolve common setup issues. This should include:

- Common Docker-related issues and solutions
- Environment configuration problems
- Known platform-specific challenges

Would you like me to help draft a troubleshooting section for the documentation?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)

- apps/engineering/app/(home)/page.tsx (1 hunks)
- apps/engineering/app/docs/layout.tsx (1 hunks)
- apps/engineering/app/layout.config.tsx (2 hunks)
- apps/engineering/content/docs/api-design/meta.json (1 hunks)
- apps/engineering/content/docs/contributing/index.mdx (1 hunks)
- apps/engineering/content/docs/contributing/meta.json (1 hunks)
- apps/engineering/content/docs/contributing/sdk-development.mdx (1 hunks)
- apps/engineering/content/docs/contributing/testing.mdx (1 hunks)
- apps/engineering/content/docs/contributing/versions.mdx (1 hunks)
✅ Files skipped from review due to trivial changes (2)
- apps/engineering/content/docs/contributing/meta.json
- apps/engineering/content/docs/contributing/sdk-development.mdx
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/engineering/app/docs/layout.tsx
🧰 Additional context used
🪛 LanguageTool

apps/engineering/content/docs/contributing/index.mdx
- [uncategorized] ~59: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)
- [uncategorized] ~60: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)
- [uncategorized] ~79: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)
- [uncategorized] ~81: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)

apps/engineering/content/docs/contributing/testing.mdx
- [grammar] ~20: Did you mean the adjective "end-to-end" (spelled with hyphens)? (END_TO_END_HYPHEN)

apps/engineering/content/docs/contributing/versions.mdx
- [style] ~13: Consider removing "of" to be more concise (ALL_OF_THE)
🔇 Additional comments (5)
apps/engineering/content/docs/api-design/meta.json (1)

`2-2`: LGTM! Well-structured metadata properties.

The title is clear, the icon choice is appropriate for API documentation, and the root property correctly indicates this as a top-level documentation page.

Also applies to: 4-5

apps/engineering/app/(home)/page.tsx (1)

`1-1`: LGTM! Component name follows Next.js conventions

The rename from `HomePage` to `Page` aligns with Next.js 13+ app directory naming conventions.
apps/engineering/app/layout.config.tsx (3)

`12-12`: LGTM: Title change aligns with rebranding

The simplified title "Unkey" aligns with the broader rebranding effort and provides a more focused brand presentation.

`1-1`: Verify the new fumadocs-ui import path

The import path has been updated to match the new package structure. Let's verify this change is documented and stable.

✅ Verification successful

Import path change is consistently applied

The new import path `fumadocs-ui/layouts/home` is used consistently across the codebase in both type imports and component imports, with no instances of the old path remaining. The change appears to be properly implemented.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for other usages of the old and new import paths
# to ensure consistent updates across the codebase
echo "Checking for any remaining old import paths..."
rg "fumadocs-ui/home-layout"
echo "Verifying consistent usage of new import path..."
rg "fumadocs-ui/layouts/home"
Length of output: 466
`17-23`: Verify new documentation paths and consider navigation structure

The new documentation structure looks more organized, but let's verify:

- The existence of the new documentation paths
- The consistency of using "nested-url" for active state

✅ Verification successful

Based on the search results, I can now generate the final response:

Documentation paths exist and navigation structure is consistent

The new documentation structure is properly organized:

- Contributing docs exist at `apps/engineering/content/docs/contributing/index.mdx`
- Architecture docs exist at `apps/engineering/content/docs/architecture/index.mdx`
- The use of "nested-url" for active state is consistent with other navigation items in `apps/engineering/app/layout.config.tsx`
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the existence of new documentation paths
# and check for consistent navigation patterns
echo "Checking for contributing documentation..."
fd "contributing.mdx" docs
echo "Checking for architecture documentation..."
fd "architecture.mdx" docs
echo "Verifying consistent usage of active state..."
rg "active: \"nested-url\"" --type tsx
Length of output: 595
Script:
#!/bin/bash
# Let's try a different approach to find documentation files and navigation patterns
# Find all documentation files
echo "Searching for documentation files..."
fd -e md -e mdx
# Check for navigation patterns in TSX files
echo "Checking navigation patterns..."
fd -e tsx | xargs rg "active:"
# Look for documentation-related directories
echo "Looking for documentation directories..."
fd -t d
Length of output: 32738
You will be prompted to enter the following values:

- `CLERK_SECRET_KEY`: Your Clerk secret key.
- `NEXT_PUBLIC_CLERK_PUBLIC_KEY`: Your Clerk public key.
🛠️ Refactor suggestion
Add instructions for obtaining Clerk keys
The documentation should guide users on how to obtain the required Clerk keys.
You will be prompted to enter the following values:
- - `CLERK_SECRET_KEY`: Your Clerk secret key.
- - `NEXT_PUBLIC_CLERK_PUBLIC_KEY`: Your Clerk public key.
+ - `CLERK_SECRET_KEY`: Your Clerk secret key (obtain from [Clerk Dashboard](https://dashboard.clerk.com) → API Keys)
+ - `NEXT_PUBLIC_CLERK_PUBLIC_KEY`: Your Clerk public key (obtain from [Clerk Dashboard](https://dashboard.clerk.com) → API Keys)
+
+ For detailed instructions on setting up your Clerk account and obtaining these keys, visit our [Authentication Setup Guide](link-to-auth-guide).
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 LanguageTool
- [uncategorized] ~59: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)
- [uncategorized] ~60: Loose punctuation mark. (UNLIKELY_OPENING_PUNCTUATION)
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (5)
apps/chproxy/src/main.ts (5)

`3-4`: Consider making batch parameters configurable via environment variables

The hardcoded batch parameters might not be optimal for all use cases. Consider making these configurable:

- MAX_BATCH_SIZE: Different workloads might require different batch sizes
- MAX_BATCH_TIME: 3 seconds might be too aggressive for low-traffic scenarios

-const MAX_BATCH_SIZE = 10000;
-const MAX_BATCH_TIME = 3000; // milliseconds
+const MAX_BATCH_SIZE = Number(process.env.MAX_BATCH_SIZE ?? "10000");
+const MAX_BATCH_TIME = Number(process.env.MAX_BATCH_TIME ?? "3000"); // milliseconds

`6-10`: Add JSDoc comments to document the Batch type

Adding documentation will improve code maintainability and help other developers understand the purpose of each field.

+/**
+ * Represents a batch of rows to be inserted into ClickHouse
+ * @property createdAt - Timestamp when the batch was created
+ * @property rows - Array of strings, each representing a row to insert
+ * @property params - URL parameters for the ClickHouse query
+ */
 type Batch = {
   createdAt: number;
   rows: string[];
   params: URLSearchParams;
 };

`23-24`: Improve documentation for BASIC_AUTH format

The comment could be more explicit about the expected format and encoding process:

- // user:pass
- // We receive a base64 of this in the authorization header ie: `Basic ZGVmYXVsdDo=`
+ // BASIC_AUTH should be in format "username:password"
+ // This will be base64 encoded and used in the Authorization header
+ // Example: If BASIC_AUTH="user:pass", the header will be "Basic dXNlcjpwYXNz"

`70-70`: Make flush interval configurable

The flush interval should be configurable and aligned with MAX_BATCH_TIME.

-setInterval(flush, 1000);
+const FLUSH_INTERVAL = Number(process.env.FLUSH_INTERVAL ?? "1000");
+setInterval(flush, FLUSH_INTERVAL);

`121-127`: Enhance signal handling for graceful shutdown

Consider handling additional signals and adding a shutdown timeout:

-process.on("SIGTERM", async (s) => {
+async function shutdown(signal: string) {
   console.warn("Received signal", s);
+  console.info("Starting graceful shutdown...");
+
+  const timeoutPromise = new Promise((_, reject) =>
+    setTimeout(() => reject(new Error("Shutdown timeout")), 10000)
+  );
   server.stop();
-  await flush(true);
+  try {
+    await Promise.race([flush(true), timeoutPromise]);
+    console.info("Graceful shutdown completed");
+  } catch (error) {
+    console.error("Error during shutdown:", error);
+  }
   process.exit(0);
-});
+}
+
+process.on("SIGTERM", () => shutdown("SIGTERM"));
+process.on("SIGINT", () => shutdown("SIGINT"));
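The shutdown-timeout idea in the last nitpick boils down to racing the final flush against a timer. A minimal sketch of that pattern, isolated from the proxy (`flushWithDeadline` and the flush stand-ins are hypothetical names, not code from this PR):

```typescript
// Race a flush against a deadline so SIGTERM handling cannot hang forever.
async function flushWithDeadline(
  flush: () => Promise<void>,
  ms: number,
): Promise<string> {
  const deadline = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error("Shutdown timeout")), ms),
  );
  try {
    await Promise.race([flush(), deadline]);
    return "flushed";
  } catch (err) {
    return (err as Error).message;
  }
}

// A fast flush wins the race; a slow one loses to the deadline.
const fastFlush = () => new Promise<void>((r) => setTimeout(r, 10));
const slowFlush = () => new Promise<void>((r) => setTimeout(r, 500));
flushWithDeadline(fastFlush, 100).then(console.log); // "flushed"
flushWithDeadline(slowFlush, 100).then(console.log); // "Shutdown timeout"
```

Note that `Promise.race` does not cancel the losing flush; it only bounds how long shutdown waits for it.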
const query = url.searchParams.get("query");
if (!query || !query.toLowerCase().startsWith("insert into")) {
  return new Response("wrong query", { status: 400 });
}
🛠️ Refactor suggestion
Enhance query validation
The current query validation is basic and case-sensitive. Consider using a more robust approach:
- if (!query || !query.toLowerCase().startsWith("insert into")) {
- return new Response("wrong query", { status: 400 });
+ if (!query) {
+ return new Response("missing query parameter", { status: 400 });
+ }
+ const normalizedQuery = query.toLowerCase().trim();
+ if (!normalizedQuery.startsWith("insert into")) {
+ return new Response("only INSERT queries are allowed", { status: 400 });
+ }
+ // Optional: Add more validation like table name whitelist
+ const allowedTables = ["events", "metrics"]; // configure via env
+ const tableMatch = normalizedQuery.match(/insert into\s+([^\s(]+)/i);
+ if (!tableMatch || !allowedTables.includes(tableMatch[1])) {
+ return new Response("invalid table name", { status: 400 });
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const query = url.searchParams.get("query");
if (!query) {
  return new Response("missing query parameter", { status: 400 });
}
const normalizedQuery = query.toLowerCase().trim();
if (!normalizedQuery.startsWith("insert into")) {
  return new Response("only INSERT queries are allowed", { status: 400 });
}
// Optional: Add more validation like table name whitelist
const allowedTables = ["events", "metrics"]; // configure via env
const tableMatch = normalizedQuery.match(/insert into\s+([^\s(]+)/i);
if (!tableMatch || !allowedTables.includes(tableMatch[1])) {
  return new Response("invalid table name", { status: 400 });
}
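One thing worth checking about the whitelist suggestion above: the capture group returns whatever token follows `insert into`, so qualified names come back as a single string. A quick sketch of the regex's behavior (`tableOf` is a hypothetical wrapper, not code from this PR):

```typescript
// Extracts the table token using the regex from the suggestion above.
const tableOf = (query: string): string | null => {
  const m = query.toLowerCase().match(/insert into\s+([^\s(]+)/i);
  return m ? m[1] : null;
};

console.log(tableOf("INSERT INTO metrics (ts, value) FORMAT JSONEachRow")); // "metrics"
// Qualified names are captured whole, so a whitelist must list them as such:
console.log(tableOf("INSERT INTO default.metrics FORMAT JSONEachRow")); // "default.metrics"
console.log(tableOf("SELECT 1")); // null
```

If inserts can target `database.table` forms, the `allowedTables` list would need the qualified names, or the capture would need to be split on `.`.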
const existing = buffer.get(key);
if (existing) {
  const size = existing.rows.push(...rows);
  if (size >= MAX_BATCH_SIZE) {
    await persist(key);
  }
} else {
  buffer.set(key, {
    createdAt: Date.now(),
    params,
    rows,
  });
}
🛠️ Refactor suggestion
Monitor and limit total buffer size
The buffer size should be monitored and limited to prevent memory issues:
+ const currentBufferSize = Array.from(buffer.values()).reduce((sum, batch) => sum + batch.rows.length, 0);
+ const MAX_TOTAL_ROWS = Number(process.env.MAX_TOTAL_ROWS ?? "1000000");
+
+ if (currentBufferSize + rows.length > MAX_TOTAL_ROWS) {
+ return new Response("buffer full", { status: 503 });
+ }
+
const existing = buffer.get(key);
if (existing) {
const size = existing.rows.push(...rows);
📝 Committable suggestion
const currentBufferSize = Array.from(buffer.values()).reduce((sum, batch) => sum + batch.rows.length, 0);
const MAX_TOTAL_ROWS = Number(process.env.MAX_TOTAL_ROWS ?? "1000000");

if (currentBufferSize + rows.length > MAX_TOTAL_ROWS) {
  return new Response("buffer full", { status: 503 });
}

const existing = buffer.get(key);
if (existing) {
  const size = existing.rows.push(...rows);
  if (size >= MAX_BATCH_SIZE) {
    await persist(key);
  }
} else {
  buffer.set(key, {
    createdAt: Date.now(),
    params,
    rows,
  });
}
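The size accounting in the suggestion above is just a reduce over the buffered batches; sketched here against the same `Map` shape the proxy uses (field names taken from the diff, sample data invented):

```typescript
// Minimal stand-in for the proxy's buffer to show the accounting.
type Batch = { createdAt: number; rows: string[]; params: URLSearchParams };

const buffer = new Map<string, Batch>();
buffer.set("a", { createdAt: Date.now(), rows: ["r1", "r2"], params: new URLSearchParams() });
buffer.set("b", { createdAt: Date.now(), rows: ["r3"], params: new URLSearchParams() });

// Total rows currently held across all pending batches.
const currentBufferSize = Array.from(buffer.values()).reduce(
  (sum, batch) => sum + batch.rows.length,
  0,
);
console.log(currentBufferSize); // 3
```

This scan is O(number of batches) per request; a running counter updated on insert/flush would avoid recomputing it if the batch count grows large.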
const res = await fetch(url, {
  method: "POST",
  headers: {
    "Content-Type": "text/plain",
    Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
  },
  body: batch.rows.join("\n"),
});
🛠️ Refactor suggestion
Enhance error handling in fetch request
The fetch request should handle different types of errors (network, timeout, etc.) and provide more detailed error information.
- const res = await fetch(url, {
- method: "POST",
- headers: {
- "Content-Type": "text/plain",
- Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
- },
- body: batch.rows.join("\n"),
- });
+ try {
+ const res = await fetch(url, {
+ method: "POST",
+ headers: {
+ "Content-Type": "text/plain",
+ Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
+ },
+ body: batch.rows.join("\n"),
+ // Add timeout to prevent hanging requests
+ signal: AbortSignal.timeout(5000),
+ });
+ if (res.ok) {
+ buffer.delete(key);
+ console.info(`persisted ${batch.rows.length} rows`);
+ } else {
+ console.error(
+ `Failed to persist: ${res.status} ${res.statusText}`,
+ await res.text(),
+ JSON.stringify(batch)
+ );
+ }
+ } catch (error) {
+ if (error.name === 'TimeoutError') {
+ console.error('Request timed out while persisting batch');
+ } else {
+ console.error('Network error while persisting batch:', error);
+ }
+ }
📝 Committable suggestion
try {
  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "text/plain",
      Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
    },
    body: batch.rows.join("\n"),
    // Add timeout to prevent hanging requests
    signal: AbortSignal.timeout(5000),
  });
  if (res.ok) {
    buffer.delete(key);
    console.info(`persisted ${batch.rows.length} rows`);
  } else {
    console.error(
      `Failed to persist: ${res.status} ${res.statusText}`,
      await res.text(),
      JSON.stringify(batch)
    );
  }
} catch (error) {
  if (error.name === 'TimeoutError') {
    console.error('Request timed out while persisting batch');
  } else {
    console.error('Network error while persisting batch:', error);
  }
}
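On the `error.name === 'TimeoutError'` branch in that suggestion: in Node 18+ (and Bun), a signal created with `AbortSignal.timeout` aborts with a `DOMException` named `TimeoutError`, which is what makes the check work. A small sketch, assuming such a runtime:

```typescript
// Shows the error name produced when an AbortSignal.timeout deadline passes.
async function timeoutErrorName(): Promise<string> {
  try {
    const signal = AbortSignal.timeout(1); // 1ms deadline
    await new Promise((resolve) => setTimeout(resolve, 20)); // outlive it
    signal.throwIfAborted(); // throws the signal's abort reason
    return "not aborted";
  } catch (err) {
    return (err as DOMException).name;
  }
}

timeoutErrorName().then(console.log); // "TimeoutError"
```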
Summary by CodeRabbit

Release Notes

New Features

- New service, `chproxy`, for managing ClickHouse insert operations.
- `CLICKHOUSE_INSERT_URL` support for enhanced configuration.
- New operations: `setOverride`, `listOverrides`, `deleteOverride`, and `getOverride`.

Bug Fixes

Documentation

Dependency Updates