
feat: clickhouse-buffer #2672

Merged
merged 3 commits on Nov 21, 2024

Conversation

chronark
Collaborator

@chronark chronark commented Nov 21, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new service, chproxy, for managing ClickHouse insert operations.
    • Added optional CLICKHOUSE_INSERT_URL support for enhanced configuration.
    • New API endpoints for rate limit management: setOverride, listOverrides, deleteOverride, and getOverride.
    • New documentation for ClickHouse Proxy and ClickHouse usage within Unkey infrastructure.
    • Comprehensive guide for getting started with the Unkey project and SDK development documentation.
  • Bug Fixes

    • Improved error handling in analytics insertion and rate limit namespace creation.
  • Documentation

    • Added detailed documentation for the ClickHouse Proxy and its functionalities.
    • Updated architecture documentation to reflect new services and configurations.
    • Introduced documentation on testing framework and versioning processes.
  • Dependency Updates

    • Updated several dependencies to their latest versions for improved performance and features.

vercel bot commented Nov 21, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Updated (UTC)
engineering ✅ Ready Nov 21, 2024 8:16pm
play ✅ Ready Nov 21, 2024 8:16pm
www ✅ Ready Nov 21, 2024 8:16pm

2 Skipped Deployments

Name Status Updated (UTC)
dashboard ⬜️ Ignored Nov 21, 2024 8:16pm
workflows ⬜️ Ignored Nov 21, 2024 8:16pm

changeset-bot bot commented Nov 21, 2024

⚠️ No Changeset found

Latest commit: 6a26302

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


Contributor

coderabbitai bot commented Nov 21, 2024

📝 Walkthrough

The changes introduced in this pull request primarily enhance the configuration and functionality of the Analytics class and its interaction with the ClickHouse database. A new optional parameter, clickhouseInsertUrl, has been added to the Analytics constructor, allowing for more flexible database configurations. Additionally, modifications to the environment configuration, middleware initialization, and error handling in various components are included. New API endpoints related to rate limiting have also been added, alongside updates to documentation and Docker configurations for the new chproxy service.
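The dual-URL behavior described above can be sketched as follows. This is an illustrative reduction only: AnalyticsSketch, insertUrl, and queryUrl are hypothetical names rather than the actual Unkey implementation; the point is that queries always target the main ClickHouse endpoint, while inserts prefer the dedicated insert URL when one is configured.

```typescript
// Hypothetical sketch of split insert/query endpoint selection.
// Names (AnalyticsSketch, insertUrl, queryUrl) are illustrative only.
type AnalyticsOpts = {
  clickhouseUrl: string;
  clickhouseInsertUrl?: string; // optional: route writes through a proxy
};

class AnalyticsSketch {
  public readonly insertUrl: string;
  public readonly queryUrl: string;

  constructor(opts: AnalyticsOpts) {
    // Queries always go to the main ClickHouse endpoint; inserts fall
    // back to it only when no dedicated insert URL is configured.
    this.queryUrl = opts.clickhouseUrl;
    this.insertUrl = opts.clickhouseInsertUrl ?? opts.clickhouseUrl;
  }
}
```

This mirrors why the PR keeps CLICKHOUSE_INSERT_URL optional: existing deployments that set only CLICKHOUSE_URL keep working unchanged.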

Changes

File Path Change Summary
apps/api/src/pkg/analytics.ts Updated Analytics class constructor to include clickhouseInsertUrl. Changed telemetry data insertion method from this.clickhouse.client.insert to this.clickhouse.inserter.insert.
apps/api/src/pkg/env.ts Added optional property CLICKHOUSE_INSERT_URL to zEnv schema for environment variable validation.
apps/api/src/pkg/middleware/init.ts Modified init function to include clickhouseInsertUrl when instantiating Analytics.
apps/api/src/routes/v1_ratelimits_limit.ts Enhanced error handling and permission checks for rate limit namespace creation. Added a logger service for better observability.
apps/chproxy/Dockerfile Introduced a multi-stage Docker build process for the chproxy application.
apps/chproxy/package.json Created new package.json for chproxy with dependencies and metadata.
apps/chproxy/src/main.ts Implemented a server application for handling ClickHouse interactions, including data buffering and persistence logic.
apps/chproxy/tsconfig.json Added TypeScript configuration file with specified compiler options.
apps/dashboard/lib/hooks/useDelayLoader.tsx Added TypeScript directive to suppress type errors related to setTimeout.
apps/engineering/app/(home)/layout.tsx Updated import path for HomeLayout component.
apps/engineering/app/docs/layout.tsx Changed import path for DocsLayout component.
apps/engineering/app/layout.config.tsx Updated import path for HomeLayoutProps type.
apps/engineering/app/layout.tsx Removed Banner component from Layout.
apps/engineering/content/docs/architecture/clickhouse-proxy.mdx Added documentation for the ClickHouse Proxy application.
apps/engineering/content/docs/architecture/clickhouse.mdx Introduced documentation regarding ClickHouse usage within Unkey infrastructure.
apps/engineering/content/docs/architecture/index.mdx Added description metadata field to the document header.
apps/engineering/content/docs/architecture/meta.json Enhanced metadata for architecture documentation with title, description, icon, and updated pages.
apps/engineering/content/docs/meta.json Updated pages array to reflect new focus areas.
apps/engineering/content/docs/services/meta.json Removed metadata configuration for "Fumadocs Core" service.
apps/engineering/content/docs/test.mdx Deleted documentation file related to components.
apps/engineering/package.json Updated dependency versions and added a new dependency for fumadocs-typescript.
apps/engineering/tsconfig.json Changed moduleResolution option to bundler.
apps/www/components/animated-list.tsx Added error handling comments and updated logic for managing index state in AnimatedList component.
apps/www/package.json Updated fumadocs-core dependency version.
deployment/docker-compose.yaml Added new chproxy service and updated api service to include CLICKHOUSE_INSERT_URL.
internal/clickhouse/src/index.ts Modified ClickHouseConfig type and updated ClickHouse class to support new client configurations.
packages/api/src/openapi.d.ts Added new API endpoints for managing rate limits: ratelimit.setOverride, listOverrides, deleteOverride, and getOverride.
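
The buffering and persistence logic attributed to apps/chproxy/src/main.ts above can be approximated by a small sketch: rows accumulate per table in memory and are flushed to ClickHouse once a size threshold is reached. Everything here is an assumption for illustration (the InsertBuffer name, the threshold, and the persist callback), not the service's actual code.

```typescript
// Illustrative buffer: collects rows per destination table and flushes
// a table's rows once the buffer reaches maxRows. `persist` stands in
// for the real HTTP call to ClickHouse.
type Persist = (table: string, rows: string[]) => Promise<void>;

class InsertBuffer {
  private buffers = new Map<string, string[]>();

  constructor(
    private persist: Persist,
    private maxRows = 1000, // flush threshold (assumed value)
  ) {}

  async add(table: string, rows: string[]): Promise<void> {
    const buf = this.buffers.get(table) ?? [];
    buf.push(...rows);
    this.buffers.set(table, buf);
    if (buf.length >= this.maxRows) {
      await this.flush(table);
    }
  }

  async flush(table?: string): Promise<void> {
    const tables = table ? [table] : [...this.buffers.keys()];
    for (const t of tables) {
      const rows = this.buffers.get(t);
      if (rows && rows.length > 0) {
        // Reset first so rows arriving during the await are not re-sent.
        this.buffers.set(t, []);
        await this.persist(t, rows);
      }
    }
  }
}
```

A real service would also flush on an interval and on shutdown; this sketch shows only the size-triggered path.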

Possibly related PRs

  • feat: add more cloudflare ratelimits #2636: The changes in this PR involve modifications to the zEnv object schema in apps/api/src/pkg/env.ts, which is directly related to the addition of the CLICKHOUSE_INSERT_URL property in the same file in the main PR. Both PRs enhance the configuration options for ClickHouse, indicating a connection in their objectives.

Suggested reviewers

  • mcstepp
  • perkinsjr
  • MichaelUnkey

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

github-actions bot commented Nov 21, 2024

Thank you for following the naming conventions for pull request titles! 🙏

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 14

🧹 Outside diff range and nitpick comments (21)
apps/engineering/content/docs/architecture/meta.json (1)

6-6: Consider consolidating ClickHouse documentation pages.

The pages array includes both "clickhouse" and "clickhouse-proxy" which might lead to fragmented or duplicated information. Consider:

  1. Consolidating these into a single page if they cover related topics
  2. Using more distinct names if they cover different aspects (e.g., "clickhouse-overview" and "clickhouse-proxy")

Also, while the separator "---Services---" works, consider using a more JSON-native approach for section organization.

-  "pages": ["index", "---Services---", "vault", "clickhouse", "clickhouse-proxy"]
+  "pages": ["index", "services/vault", "services/clickhouse"]
apps/engineering/app/layout.tsx (1)

14-14: Consider accessibility implications

If the Banner contained important announcements or notifications, ensure this information is still accessible to users through alternative means.

Consider:

  1. Moving important announcements to a different location in the UI
  2. Using ARIA live regions for dynamic notifications
  3. Implementing a different notification system if needed
apps/dashboard/lib/hooks/useDelayLoader.tsx (1)

8-10: Consider a more robust type handling solution.

Instead of suppressing the type error, consider one of these alternatives:

  1. Use a specific type assertion
  2. Configure TypeScript to prefer one runtime's types
-      // @ts-expect-error bun types are colliding with node
-      timeout = setTimeout(() => {
+      timeout = (setTimeout as typeof global.setTimeout)(() => {
apps/engineering/package.json (1)

14-14: Consider using fixed version for consistency

The new dependency fumadocs-typescript uses a caret (^) version while other fumadocs packages use fixed versions. Consider using a fixed version for consistency:

-    "fumadocs-typescript": "^3.0.2",
+    "fumadocs-typescript": "3.0.2",
apps/chproxy/Dockerfile (1)

12-13: Optimize Docker context with .dockerignore

Copying the entire context with COPY . . could include unnecessary files like node_modules, .git, etc., which increases build time and context size.

Consider adding a .dockerignore file with common exclusions:

+# New file: .dockerignore
+node_modules
+.git
+.env
+*.log
+dist
+coverage
apps/api/src/pkg/analytics.ts (1)

9-18: LGTM! Good architectural improvement for write/read separation.

The separation of insert and query URLs allows for better load distribution and follows ClickHouse best practices.

Consider adding URL format validation:

 constructor(opts: {
   clickhouseUrl: string;
   clickhouseInsertUrl?: string;
 }) {
+  const validateUrl = (url: string) => {
+    try {
+      new URL(url);
+    } catch (e) {
+      throw new Error(`Invalid URL: ${url}`);
+    }
+  };
+  validateUrl(opts.clickhouseUrl);
+  if (opts.clickhouseInsertUrl) {
+    validateUrl(opts.clickhouseInsertUrl);
+  }
   if (opts.clickhouseInsertUrl) {
     this.clickhouse = new ClickHouse({
apps/api/src/pkg/env.ts (1)

34-34: Document the relationship between ClickHouse URLs.

Consider adding a comment explaining the purpose of having separate URLs for queries and inserts, and whether both are required for the buffer functionality to work correctly.

   CLICKHOUSE_URL: z.string(),
+  // CLICKHOUSE_INSERT_URL allows configuring a separate endpoint for write operations,
+  // useful when using a ClickHouse proxy to buffer writes
   CLICKHOUSE_INSERT_URL: z.string().optional(),
apps/www/components/animated-list.tsx (3)

Line range hint 34-38: Consider alternative approaches to handle type conflicts

The @ts-expect-error comment suggests underlying type system conflicts between Bun and Node. Consider these alternatives:

  1. Use window.setTimeout to avoid runtime-specific typing
  2. Create a typed wrapper for setTimeout that works with both runtimes
-// @ts-expect-error bun types are colliding with node
-timeoutRef.current = setTimeout(() => {
+timeoutRef.current = window.setTimeout(() => {
  setIndex(-1);
}, 1000);

Line range hint 49-53: Potential race condition in timeout management

Multiple timeouts could be created and stored in timeoutRef.current, potentially leading to race conditions. Consider clearing existing timeouts before setting new ones.

+if (timeoutRef.current) {
+  clearTimeout(timeoutRef.current);
+}
-// @ts-expect-error bun types are colliding with node
timeoutRef.current = setTimeout(() => {
  setIndex(0);
}, 1000);

Line range hint 1-89: Consider optimizing the animation cycle implementation

The current implementation uses a combination of setInterval, setTimeout, and state management that could be simplified. Consider these improvements:

  1. Use a single animation cycle with useReducer instead of multiple timeouts
  2. Implement a more declarative state machine for the animation states
  3. Use CSS transitions for smoother animations

Would you like me to provide a complete example of a simplified implementation?
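
As an illustration of the reducer approach suggested above, the animation cycle could be modeled as a pure state machine. This is a sketch only: the phase names, actions, and transitions are assumptions, not the component's actual behavior, but a pure reducer like this is testable without rendering.

```typescript
// Pure reducer for a two-phase animation cycle: advancing through
// items until the end, then resetting (index -1 hides items) before
// restarting from the first item on the next tick.
type AnimState = { phase: "advancing" | "resetting"; index: number };
type AnimAction = { type: "tick"; itemCount: number } | { type: "reset" };

function animReducer(state: AnimState, action: AnimAction): AnimState {
  switch (action.type) {
    case "tick":
      if (state.phase === "resetting") {
        // After a reset, the next tick restarts from the first item.
        return { phase: "advancing", index: 0 };
      }
      if (state.index + 1 >= action.itemCount) {
        // Reached the last item: enter the reset phase.
        return { phase: "resetting", index: -1 };
      }
      return { phase: "advancing", index: state.index + 1 };
    case "reset":
      return { phase: "resetting", index: -1 };
  }
}
```

In the component this would replace the interval/timeout juggling with a single `useReducer` plus one timer dispatching `tick`.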

deployment/docker-compose.yaml (2)

141-141: Consider improving configuration management

The API service configuration could be enhanced:

  1. The CLICKHOUSE_INSERT_URL contains hardcoded credentials. Consider using Docker secrets or environment files.
  2. The command array is getting quite long. Consider moving these configurations to a dedicated environment file for better maintainability.

Example refactor using an env file:

  api:
    build:
      context: ..
      dockerfile: ./apps/api/Dockerfile.dev
    ports:
      - 8787:8787
+   env_file:
+     - .env.api
-   command:
-     [
-       "pnpm",
-       "wrangler",
-       "dev",
-       "--ip=0.0.0.0",
-       "--var=VERSION:1",
-       # ... other vars
-     ]
+   command: ["pnpm", "wrangler", "dev", "--ip=0.0.0.0"]

Also applies to: 149-149


71-82: Document architectural decisions for ClickHouse proxy layer

The introduction of chproxy as a separate service for handling ClickHouse insert operations adds complexity to the architecture. While this separation allows for specific handling of insert operations (potentially for rate limiting), it should be documented:

  1. Consider adding architecture decision records (ADR) explaining:
    • Why inserts need to be proxied separately
    • The rate limiting requirements
    • The security implications
  2. Consider adding a diagram in the documentation showing the new data flow

Would you like me to help create a template for the architectural documentation?

Also applies to: 141-141, 149-149

apps/chproxy/tsconfig.json (2)

42-42: Enable JSON module resolution and isolated modules.

Since this service likely needs to handle configuration files and needs to be safely transpiled:

Enable these options:

-    // "resolveJsonModule": true,                        /* Enable importing .json files. */
-    // "isolatedModules": true,                          /* Ensure that each file can be safely transpiled without relying on other imports. */
+    "resolveJsonModule": true,                          /* Enable importing .json files. */
+    "isolatedModules": true,                           /* Ensure that each file can be safely transpiled without relying on other imports. */

Also applies to: 75-75


52-53: Consider generating declaration files.

If this proxy service might be used as a dependency by other packages in the monorepo:

Enable declaration file generation:

-    // "declaration": true,                              /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
-    // "declarationMap": true,                           /* Create sourcemaps for d.ts files. */
+    "declaration": true,                                /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
+    "declarationMap": true,                            /* Create sourcemaps for d.ts files. */
apps/api/src/routes/v1_ratelimits_limit.ts (1)

328-343: Consider enhancing error context for better debugging

The error handling implementation is good, using waitUntil() for non-blocking analytics and appropriate error logging. However, consider adding more context to the error log for easier debugging.

Consider this enhancement:

 logger.error("inserting ratelimit event failed", {
   error: err.message,
+  context: {
+    workspace_id: rootKey.authorizedWorkspaceId,
+    namespace_id: namespace.id,
+    request_id: c.get("requestId"),
+    identifier: req.identifier,
+  },
+  errorCode: err.code, // if available
 });
internal/clickhouse/src/index.ts (2)

25-35: Consider improving 'ClickHouseConfig' type definition for better type safety

Currently, ClickHouseConfig uses optional properties and never types to enforce mutually exclusive configurations. Consider using a discriminated union with a discriminator property to enhance type safety and clarity.

Apply this diff to adjust the type definition:

 export type ClickHouseConfig =
-  | {
-      url?: string;
-      insertUrl?: never;
-      queryUrl?: never;
-    }
-  | {
-      url?: never;
-      insertUrl: string;
-      queryUrl: string;
-    };
+  | { mode: 'single'; url: string }
+  | { mode: 'split'; insertUrl: string; queryUrl: string };

This change introduces a mode discriminator, making configurations explicit and improving type checking.
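
With such a discriminator in place, consumers can narrow the union exhaustively. A hypothetical call site might look like this (resolveEndpoints is illustrative, not part of the PR):

```typescript
// Mirrors the suggested discriminated union; resolveEndpoints switches
// on `mode`, so the compiler guarantees each branch only sees the
// fields that exist for that configuration.
type ClickHouseConfigSketch =
  | { mode: "single"; url: string }
  | { mode: "split"; insertUrl: string; queryUrl: string };

function resolveEndpoints(config: ClickHouseConfigSketch): {
  insertUrl: string;
  queryUrl: string;
} {
  switch (config.mode) {
    case "single":
      // One endpoint serves both reads and writes.
      return { insertUrl: config.url, queryUrl: config.url };
    case "split":
      // Writes go through the proxy, reads hit ClickHouse directly.
      return { insertUrl: config.insertUrl, queryUrl: config.queryUrl };
  }
}
```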


62-62: Duplicate methods logs and latest in verifications getter

Both logs and latest in the verifications getter reference getLatestVerifications(this.querier). If they serve the same purpose, consider removing one to avoid redundancy. If they are intended to be different, ensure they call distinct functions.

Apply this diff if they are duplicates:

 return {
   insert: insertVerification(this.inserter),
-  logs: getLatestVerifications(this.querier),
   perHour: getVerificationsPerHour(this.querier),
   perDay: getVerificationsPerDay(this.querier),
   perMonth: getVerificationsPerMonth(this.querier),
  latest: getLatestVerifications(this.querier),
 };

Or update logs to reference the correct function if needed.

Also applies to: 66-66

apps/chproxy/src/main.ts (3)

81-83: Use case-insensitive comparison for Authorization header

HTTP headers are case-insensitive. Comparing the Authorization header using strict equality may fail if the header's casing differs.

Adjust the code to perform a case-insensitive comparison:

-if (req.headers.get("Authorization") !== requiredAuthorization) {
+if (req.headers.get("authorization") !== requiredAuthorization) {

Alternatively, normalize the header name:

const authorizationHeader = req.headers.get("Authorization") || req.headers.get("authorization");
if (authorizationHeader !== requiredAuthorization) {
  return new Response("unauthorized", { status: 401 });
}

96-96: Handle empty request bodies gracefully

When splitting the request body by newlines, an empty body will result in an array with a single empty string, which may not be the intended behavior.

Add a check to handle empty request bodies:

-const rows = (await req.text()).split("\n");
+const bodyText = await req.text();
+if (!bodyText) {
+  return new Response("empty body", { status: 400 });
+}
+const rows = bodyText.split("\n");

121-127: Ensure all pending operations complete before exiting

Calling process.exit(0); immediately after await flush(true); may not guarantee that all asynchronous operations have completed, potentially leading to data loss.

Consider removing process.exit(0); and allowing the process to exit gracefully, or ensure all pending promises are resolved before exiting.
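
One way to approach this (a sketch under the assumption that all persistence happens inside a flush routine; flushAll, the buffer shape, and the server interface are hypothetical stand-ins) is to stop the server, await the final flush, and let the event loop drain instead of calling process.exit():

```typescript
// Sketch: drain buffered rows on shutdown signals without process.exit(),
// so the process ends only after pending work completes.
async function flushAll(buffer: string[]): Promise<string[]> {
  // splice empties the shared buffer and returns what was drained
  const drained = buffer.splice(0, buffer.length);
  // ...persist `drained` to ClickHouse here...
  return drained;
}

function installShutdownHooks(buffer: string[], server: { stop(): void }): void {
  const shutdown = async () => {
    server.stop();          // stop accepting new requests first
    await flushAll(buffer); // persist everything still buffered
    // no process.exit(): the runtime exits once the event loop is empty
  };
  process.once("SIGTERM", shutdown);
  process.once("SIGINT", shutdown);
}
```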

packages/api/src/openapi.d.ts (1)

79-90: Ensure consistent naming conventions for operation IDs

In the added endpoints, there's an inconsistency between the path names and the operation IDs regarding singular and plural forms of "ratelimit". For example, the path "/v1/ratelimits.setOverride" uses "ratelimits" (plural), while the operation ID operations["ratelimit.setOverride"] uses "ratelimit" (singular). For clarity and consistency, consider standardizing the naming to either singular or plural in both paths and operation IDs.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 039cf73 and 500e8c1.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (27)
  • apps/api/src/pkg/analytics.ts (1 hunks)
  • apps/api/src/pkg/env.ts (1 hunks)
  • apps/api/src/pkg/middleware/init.ts (1 hunks)
  • apps/api/src/routes/v1_ratelimits_limit.ts (2 hunks)
  • apps/chproxy/Dockerfile (1 hunks)
  • apps/chproxy/package.json (1 hunks)
  • apps/chproxy/src/main.ts (1 hunks)
  • apps/chproxy/tsconfig.json (1 hunks)
  • apps/dashboard/lib/hooks/useDelayLoader.tsx (1 hunks)
  • apps/engineering/app/(home)/layout.tsx (1 hunks)
  • apps/engineering/app/docs/layout.tsx (1 hunks)
  • apps/engineering/app/layout.config.tsx (1 hunks)
  • apps/engineering/app/layout.tsx (1 hunks)
  • apps/engineering/content/docs/architecture/clickhouse-proxy.mdx (1 hunks)
  • apps/engineering/content/docs/architecture/clickhouse.mdx (1 hunks)
  • apps/engineering/content/docs/architecture/index.mdx (1 hunks)
  • apps/engineering/content/docs/architecture/meta.json (1 hunks)
  • apps/engineering/content/docs/meta.json (1 hunks)
  • apps/engineering/content/docs/services/meta.json (0 hunks)
  • apps/engineering/content/docs/test.mdx (0 hunks)
  • apps/engineering/package.json (1 hunks)
  • apps/engineering/tsconfig.json (1 hunks)
  • apps/www/components/animated-list.tsx (2 hunks)
  • apps/www/package.json (1 hunks)
  • deployment/docker-compose.yaml (2 hunks)
  • internal/clickhouse/src/index.ts (1 hunks)
  • packages/api/src/openapi.d.ts (2 hunks)
💤 Files with no reviewable changes (2)
  • apps/engineering/content/docs/services/meta.json
  • apps/engineering/content/docs/test.mdx
✅ Files skipped from review due to trivial changes (5)
  • apps/chproxy/package.json
  • apps/engineering/app/(home)/layout.tsx
  • apps/engineering/app/docs/layout.tsx
  • apps/engineering/app/layout.config.tsx
  • apps/engineering/content/docs/architecture/index.mdx
🧰 Additional context used
🪛 LanguageTool
apps/engineering/content/docs/architecture/clickhouse-proxy.mdx

[style] ~8-~8: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ew seconds or when the buffer is full. It's available at clickhouse.unkey.cloud...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~16-~16: To make your text as clear as possible to all readers, do not use this foreign term. Possible alternatives are “below” or “further on” (in a document).
Context: ...ouse proxy is fully managed in unkeyed/infra. #...

(INFRA)

apps/engineering/content/docs/architecture/clickhouse.mdx

[grammar] ~8-~8: It seems that “to” is missing before the verb.
Context: ... perfect fit for us, for it's very high ingest capabilities and easy of querying throu...

(MISSING_TO_BETWEEN_BE_AND_VB)


[uncategorized] ~14-~14: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ickHouse inside our docker-compose setup and it will be autostarted and migrated for...

(COMMA_COMPOUND_SENTENCE)


[grammar] ~16-~16: Probable usage error. Use “and” after ‘both’.
Context: ...house) package contains both the schema as well as a typescript client for inserting and q...

(BOTH_AS_WELL_AS)


[uncategorized] ~30-~30: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ns` Raw verifications are ingested here and we run hourly, daily and monthly aggreg...

(COMMA_COMPOUND_SENTENCE)


[uncategorized] ~30-~30: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...rly, daily and monthly aggregations for user facing analytics. ### ratelimits Raw ratel...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~34-~34: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...imits` Raw ratelimits are ingested here and we run hourly, daily and monthly aggreg...

(COMMA_COMPOUND_SENTENCE)


[uncategorized] ~34-~34: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...rly, daily and monthly aggregations for user facing analytics. ### billing Aggregates d...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[style] ~51-~51: To make your text as clear as possible to all readers, do not use this foreign term. Possible alternatives are “below” or “further on” (in a document).
Context: ... clusters are fully managed in unkeyed/infra.

(INFRA)

🔇 Additional comments (17)
apps/engineering/content/docs/architecture/meta.json (1)

2-5: LGTM! Clear and well-structured metadata.

The title, description, icon, and root configuration provide clear context for the architecture documentation.

apps/engineering/app/layout.tsx (1)

14-14: Verify the Banner removal impact

The removal of the Banner component might affect user experience if it contained important announcements or notifications.

Let's check if the Banner component is used elsewhere or if its functionality has been moved:

apps/dashboard/lib/hooks/useDelayLoader.tsx (1)

8-8: LGTM as a temporary fix.

The @ts-expect-error comment addresses the type collision between Bun and Node.js types for the setTimeout function.

apps/engineering/tsconfig.json (2)

13-13: Verify TypeScript version compatibility

The change to "moduleResolution": "bundler" requires TypeScript >= 5.0. This is a more modern and stricter resolution strategy that better aligns with bundler expectations.

✅ Verification successful

TypeScript version compatibility confirmed for moduleResolution: "bundler"

All TypeScript dependencies in the project are using version ^5.5.3 or higher, which fully supports the "bundler" moduleResolution option introduced in TypeScript 5.0.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check TypeScript version in package.json files
# Expect: Version should be >= 5.0

# Find all package.json files and check typescript version
fd package.json --type f --exec sh -c 'echo "=== $1 ==="; cat "$1" | jq -r ".dependencies.typescript, .devDependencies.typescript"' sh {}

Length of output: 2208


13-13: Verify import statement compatibility

The bundler moduleResolution strategy is stricter than node and may affect how imports are resolved. Please ensure all import statements in the codebase are compatible with this change.

✅ Verification successful

Based on the analysis of the engineering app's imports and dependencies, I can now provide a final response:

The moduleResolution change to "bundler" is safe to proceed with

The codebase follows Next.js and TypeScript module resolution best practices:

  • All imports are using standard patterns supported by the bundler strategy
  • No problematic implicit index imports or deep imports from node_modules found
  • Dependencies are modern packages (fumadocs, next.js, react) that support bundler resolution
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for potential import issues that might be affected by the moduleResolution change
# Common patterns that might need attention:
# - Implicit index imports
# - Deep imports from node_modules
# - Imports without extensions

# Search for import statements
rg --type typescript "^import .* from ['\"](?![@\.]).+['\"]" --no-filename

# Search for require statements
rg --type typescript "require\(['\"](?![@\.]).+['\"]\)" --no-filename

Length of output: 662


Script:

#!/bin/bash
# Let's try a different approach to find import statements
# First, let's look for all TypeScript files
fd -e ts -e tsx | while read -r file; do
  echo "=== $file ==="
  # Show import statements from node_modules (non-relative)
  rg "^import .* from ['\"](?!\.|@)" "$file" || true
done

# Also check for package.json to understand the project's dependencies
if [ -f "apps/engineering/package.json" ]; then
  echo -e "\n=== Dependencies from apps/engineering/package.json ==="
  cat "apps/engineering/package.json" | jq '.dependencies, .devDependencies'
fi

Length of output: 157209


Script:

#!/bin/bash
# Let's check for TypeScript import patterns that might be affected by the bundler moduleResolution
# Specifically looking for:
# 1. Imports without extensions
# 2. Index imports
# 3. Package.json imports field usage

# First, let's look at all TypeScript imports in the engineering app
echo "=== TypeScript imports in engineering app ==="
cd apps/engineering && find . -type f -name "*.ts" -o -name "*.tsx" | while read -r file; do
  echo "=== $file ==="
  grep -E "^import .* from .*" "$file" || true
done

# Check if there's a package.json "imports" field
echo -e "\n=== Checking for package.json imports field ==="
if [ -f "apps/engineering/package.json" ]; then
  cat "apps/engineering/package.json" | jq '.imports'
fi

Length of output: 2968

apps/engineering/package.json (1)

12-15: Verify breaking changes in fumadocs major version upgrades

The updates include major version bumps for multiple fumadocs packages which might introduce breaking changes:

  • fumadocs-core: 13.4.0 → 14.4.0
  • fumadocs-mdx: 10.0.2 → 11.1.1
  • fumadocs-ui: 13.4.10 → 14.4.0
apps/chproxy/Dockerfile (1)

16-29: Well-structured build stage implementation

The installer stage follows Docker best practices:

  • Separates dependency installation from build
  • Leverages build caching effectively
  • Uses proper workspace filtering with turbo
apps/api/src/pkg/analytics.ts (1)

Line range hint 22-29: Verify ClickHouse client version compatibility.

The change from client.insert to inserter.insert suggests a breaking change in the ClickHouse client API.

Also, consider adding JSDoc comments to document the schema fields:

 public get insertSdkTelemetry() {
   return this.clickhouse.inserter.insert({
     table: "telemetry.raw_sdks_v1",
     schema: z.object({
+      /** Unique identifier for the request */
       request_id: z.string(),
+      /** Unix timestamp in milliseconds */
       time: z.number().int(),
+      /** Runtime environment (e.g., 'node', 'deno') */
       runtime: z.string(),
+      /** Platform information */
       platform: z.string(),
+      /** Array of version strings */
       versions: z.array(z.string()),
     }),
   });
 }
apps/www/package.json (2)

38-38: Verify relevance of this dependency update to the PR

This change appears unrelated to the PR's main objective of implementing ClickHouse buffer functionality. Consider moving this dependency update to a separate PR focused on documentation updates.


38-38: Verify breaking changes in fumadocs-core major version update

The update from v13 to v14 is a major version bump which may include breaking changes.

Run the following to check for potential breaking changes:

apps/api/src/pkg/middleware/init.ts (2)

101-101: LGTM! Good separation of insert and query endpoints.

The addition of clickhouseInsertUrl follows the best practice of separating write and read operations in distributed systems, which can improve performance and scalability.


101-101: Verify environment variable configuration.

Since this change relies on the new CLICKHOUSE_INSERT_URL environment variable, let's verify its configuration across the codebase.

✅ Verification successful

Environment variable CLICKHOUSE_INSERT_URL is properly configured

The environment variable is well-configured across the codebase:

  • Properly defined in the environment schema at apps/api/src/pkg/env.ts as an optional string
  • Documented in the architecture docs at apps/engineering/content/docs/architecture/clickhouse-proxy.mdx
  • Configured in Docker deployment at deployment/docker-compose.yaml
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the environment variable configuration and usage

# Check if the environment variable is properly defined in environment schemas
rg -A 2 "CLICKHOUSE_INSERT_URL" 

# Check Docker/deployment configurations for this variable
fd -e yaml -e yml -e env | xargs rg "CLICKHOUSE_INSERT_URL"

Length of output: 1092
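The fallback implied by the optional schema (use CLICKHOUSE_INSERT_URL when set, otherwise CLICKHOUSE_URL) can be sketched as follows; the type and helper names are illustrative, not the actual implementation in internal/clickhouse.

```typescript
// Illustrative config shape: separate query and insert endpoints, with the
// insert endpoint optional.
type ClickHouseConfig = {
  url: string; // CLICKHOUSE_URL (queries)
  insertUrl?: string; // CLICKHOUSE_INSERT_URL (inserts, optional)
};

// Pick the dedicated insert endpoint when configured, else fall back.
function resolveInsertUrl(cfg: ClickHouseConfig): string {
  return cfg.insertUrl ?? cfg.url;
}

console.log(resolveInsertUrl({ url: "http://clickhouse:8123" }));
// -> http://clickhouse:8123
console.log(
  resolveInsertUrl({ url: "http://clickhouse:8123", insertUrl: "http://chproxy:7123" }),
);
// -> http://chproxy:7123
```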

deployment/docker-compose.yaml (1)

71-82: Consider enhancing security configuration for chproxy service

A few security and configuration considerations:

  1. Basic auth credentials are hardcoded in the environment variables. Consider using Docker secrets or environment files for sensitive data.
  2. The service exposes port 7123 to the host machine. If this proxy is only needed by internal services, consider removing the port mapping and use Docker's internal networking.
  3. The purpose and configuration of this proxy service should be documented.

Let's verify if this proxy is only used internally:

✅ Verification successful

Security configuration for chproxy service is appropriate

The current configuration is suitable for the intended use case:

  1. Port exposure (7123) is necessary as it's used internally by the API service for batch processing of ClickHouse inserts, as evidenced by the documentation and code.
  2. Basic auth credentials are appropriate for local development environment:
    • In production, the service runs on AWS AppRunner with proper security controls
    • The proxy is only accessible internally through Docker networking
  3. The service's purpose is well documented in apps/engineering/content/docs/architecture/clickhouse-proxy.mdx
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any references to port 7123 or chproxy in the codebase
rg -g '!deployment/*' "7123|chproxy"

Length of output: 517


Script:

#!/bin/bash
# Let's check the documentation to understand the purpose of chproxy
cat apps/engineering/content/docs/architecture/clickhouse-proxy.mdx

# Also check the main.ts file to understand the internal usage
cat apps/chproxy/src/main.ts

# Let's also check if there are any external references to this proxy
rg "CLICKHOUSE_INSERT_URL" -A 2 -B 2

Length of output: 5543

apps/chproxy/tsconfig.json (1)

14-14: LGTM: Basic compilation settings are appropriate.

The ES2016 target and CommonJS module system are suitable choices for a Node.js service.

Also applies to: 28-28

apps/api/src/routes/v1_ratelimits_limit.ts (1)

151-151: LGTM: Good addition of logger service

Adding the logger service improves observability for error handling.

internal/clickhouse/src/index.ts (2)

38-39: LGTM!

The introduction of separate querier and inserter properties enhances clarity and allows for distinct configurations for querying and inserting.


78-84: LGTM!

The updates in the ratelimits getter correctly utilize this.inserter and this.querier, ensuring that insertions and queries are handled by the appropriate clients.

Resolved review threads: apps/engineering/content/docs/meta.json; apps/chproxy/Dockerfile (×3)
@@ -31,6 +31,7 @@ export const zEnv = z.object({
AGENT_TOKEN: z.string(),

CLICKHOUSE_URL: z.string(),
CLICKHOUSE_INSERT_URL: z.string().optional(),

🛠️ Refactor suggestion

Add URL validation for ClickHouse configuration.

Both CLICKHOUSE_URL and CLICKHOUSE_INSERT_URL should validate that they are proper URLs for consistency and to catch configuration errors early.

-  CLICKHOUSE_URL: z.string(),
+  CLICKHOUSE_URL: z.string().url(),
-  CLICKHOUSE_INSERT_URL: z.string().optional(),
+  CLICKHOUSE_INSERT_URL: z.string().url().optional(),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  CLICKHOUSE_INSERT_URL: z.string().optional(),
+  CLICKHOUSE_URL: z.string().url(),
+  CLICKHOUSE_INSERT_URL: z.string().url().optional(),
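For context on what the `.url()` refinement buys: a malformed connection string then fails at startup instead of at the first insert. The same fail-fast check can be expressed with the standard `URL` constructor; the loader below is a hypothetical sketch, not the actual env-parsing code.

```typescript
// Hypothetical config loader: reject malformed URLs at startup, mirroring
// what z.string().url() would enforce in the env schema.
function requireUrl(name: string, value: string | undefined): string {
  if (value === undefined) {
    throw new Error(`${name} is required`);
  }
  try {
    new URL(value); // throws TypeError on malformed input
  } catch {
    throw new Error(`${name} must be a valid URL, got: ${value}`);
  }
  return value;
}

console.log(requireUrl("CLICKHOUSE_URL", "http://default:password@localhost:8123"));
// -> http://default:password@localhost:8123
```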

Comment on lines 30 to 110
const buffer = new Map<string, Batch>();

async function flush(force?: boolean): Promise<void> {
  const now = Date.now();

  for (const [key, batch] of buffer.entries()) {
    if (force || now >= batch.createdAt + MAX_BATCH_TIME) {
      await persist(key);
    }
  }
}

// persist inserts the data into clickhouse and removes it from the buffer
async function persist(key: string): Promise<void> {
  const batch = buffer.get(key);
  if (!batch) {
    return;
  }

  const url = new URL(env.CLICKHOUSE_URL);
  batch.params.forEach((v, k) => {
    url.searchParams.set(k, v);
  });

  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "text/plain",
      Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
    },
    body: batch.rows.join("\n"),
  });
  if (res.ok) {
    buffer.delete(key);
    console.info(`persisted ${batch.rows.length} rows`);
  } else {
    console.error("unable to persist", await res.text(), JSON.stringify(batch));
  }
}

setInterval(flush, 1000);

const server = Bun.serve({
  port: env.PORT,
  fetch: async (req: Request): Promise<Response> => {
    const url = new URL(req.url);

    if (url.pathname === "/v1/liveness") {
      return new Response("I'm alive");
    }

    if (req.headers.get("Authorization") !== requiredAuthorization) {
      return new Response("unauthorized", { status: 401 });
    }

    const query = url.searchParams.get("query");
    if (!query || !query.startsWith("INSERT INTO")) {
      return new Response("wrong query", { status: 400 });
    }

    const params = url.searchParams;
    params.delete("query_id");
    params.sort();

    const key = params.toString();

    const rows = (await req.text()).split("\n");

    const existing = buffer.get(key);
    if (existing) {
      const size = existing.rows.push(...rows);
      if (size >= MAX_BATCH_SIZE) {
        await persist(key);
      }
    } else {
      buffer.set(key, {
        createdAt: Date.now(),
        params,
        rows,
      });
    }
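One detail worth calling out from the handler above: the batch key is the sorted query string with `query_id` removed, so retried requests and out-of-order parameters coalesce into the same batch. The derivation in isolation:

```typescript
// Derive a buffer key the same way the proxy does: drop the per-request
// query_id, sort the remaining parameters, and serialize them.
function batchKey(search: string): string {
  const params = new URLSearchParams(search);
  params.delete("query_id");
  params.sort();
  return params.toString();
}

// Two requests for the same INSERT with different query_id values and
// different parameter order map to the same batch.
const a = batchKey("query=INSERT INTO t&query_id=abc&database=unkey");
const b = batchKey("database=unkey&query_id=xyz&query=INSERT INTO t");
console.log(a === b); // true
```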
coderabbitai bot (Contributor) commented on Nov 21, 2024:

🛠️ Refactor suggestion

Synchronize access to the buffer to prevent race conditions

Since both the fetch handler and the flush function access and modify the buffer, concurrent access may lead to race conditions, especially under high load or with multiple CPU cores.

Consider using a mutex or lock mechanism to synchronize access to the buffer. Alternatively, use concurrent data structures or design patterns that handle asynchronous access safely.

chronark (Collaborator, Author) replied:

javascript runs in a single thread, this is not parallel

coderabbitai bot (Contributor) replied:

@chronark, thank you for the clarification. You're correct—since JavaScript runs in a single thread, race conditions due to concurrent access to buffer are not a concern here. I'll keep this in mind for future reviews.


✏️ Learnings added
Learnt from: chronark
PR: unkeyed/unkey#2672
File: apps/chproxy/src/main.ts:30-110
Timestamp: 2024-11-21T20:12:33.365Z
Learning: In `apps/chproxy/src/main.ts`, JavaScript runs in a single thread, so race conditions due to concurrent access to shared variables like `buffer` are not a concern.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
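chronark's point can be demonstrated directly: each callback body runs to completion on the single JS thread, so a synchronous read-modify-write on a shared `Map` is never interleaved; only `await` points yield control. A minimal illustration:

```typescript
const buffer = new Map<string, number[]>();

// Synchronous read-modify-write: nothing can interleave between the get and
// the push/set, because the whole function body runs before any other callback.
function append(key: string, row: number) {
  const existing = buffer.get(key);
  if (existing) {
    existing.push(row);
  } else {
    buffer.set(key, [row]);
  }
}

// Even when scheduled as separate microtasks, the calls run one after another.
Promise.resolve().then(() => append("k", 1));
Promise.resolve().then(() => append("k", 2));
queueMicrotask(() => console.log(buffer.get("k"))); // -> [ 1, 2 ]
```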

Resolved review threads: apps/chproxy/src/main.ts (×3, one outdated)
Comment on lines +2697 to +3003
"application/json": components["schemas"]["ErrUnauthorized"];
};
};
/** @description The client does not have access rights to the content; that is, it is unauthorized, so the server is refusing to give the requested resource. Unlike 401 Unauthorized, the client's identity is known to the server. */
403: {
content: {
"application/json": components["schemas"]["ErrForbidden"];
};
};
/** @description The server cannot find the requested resource. In the browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client. This response code is probably the most well known due to its frequent occurrence on the web. */
404: {
content: {
"application/json": components["schemas"]["ErrNotFound"];
};
};
/** @description This response is sent when a request conflicts with the current state of the server. */
409: {
content: {
"application/json": components["schemas"]["ErrConflict"];
};
};
/** @description The user has sent too many requests in a given amount of time ("rate limiting") */
429: {
content: {
"application/json": components["schemas"]["ErrTooManyRequests"];
};
};
/** @description The server has encountered a situation it does not know how to handle. */
500: {
content: {
"application/json": components["schemas"]["ErrInternalServerError"];
};
};
};
};
getOverride: {
parameters: {
query: {
namespaceId?: string;
namespaceName?: string;
identifier: string;
};
};
responses: {
/** @description Details of the override for the given identifier */
200: {
content: {
"application/json": {
id: string;
identifier: string;
limit: number;
duration: number;
async?: boolean | null;
};
};
};
/** @description The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing). */
400: {
content: {
"application/json": components["schemas"]["ErrBadRequest"];
};
};
/** @description Although the HTTP standard specifies "unauthorized", semantically this response means "unauthenticated". That is, the client must authenticate itself to get the requested response. */
401: {
content: {
"application/json": components["schemas"]["ErrUnauthorized"];
};
};
/** @description The client does not have access rights to the content; that is, it is unauthorized, so the server is refusing to give the requested resource. Unlike 401 Unauthorized, the client's identity is known to the server. */
403: {
content: {
"application/json": components["schemas"]["ErrForbidden"];
};
};
/** @description The server cannot find the requested resource. In the browser, this means the URL is not recognized. In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client. This response code is probably the most well known due to its frequent occurrence on the web. */
404: {
content: {
"application/json": components["schemas"]["ErrNotFound"];
};
};
/** @description This response is sent when a request conflicts with the current state of the server. */
409: {
content: {
"application/json": components["schemas"]["ErrConflict"];
};
};
/** @description The user has sent too many requests in a given amount of time ("rate limiting") */
429: {
content: {
"application/json": components["schemas"]["ErrTooManyRequests"];
};
};
/** @description The server has encountered a situation it does not know how to handle. */
500: {
content: {
"application/json": components["schemas"]["ErrInternalServerError"];
};
};
};
};

⚠️ Potential issue

Enforce mutual exclusivity of 'namespaceId' and 'namespaceName' in type definitions

The descriptions indicate that either namespaceId or namespaceName must be provided, but not both. Currently, both are optional, allowing for the possibility that neither is provided or both are provided, which could lead to runtime errors.

Consider updating the type definitions to enforce that exactly one of namespaceId or namespaceName is provided, for example via a union of object types that marks the excluded key as `never`.

Apply this diff to enforce mutual exclusivity:

        requestBody: {
          content: {
-           "application/json": {
-             namespaceId?: string;
-             namespaceName?: string;
+           "application/json": (
+             | { namespaceId: string; namespaceName?: never }
+             | { namespaceName: string; namespaceId?: never }
+           ) & {
              identifier: string;
              // rest of the fields
            };
          };
        };

Similarly, update the query parameters for the getOverride operation:

        parameters: {
-         query: {
-           namespaceId?: string;
-           namespaceName?: string;
-           identifier: string;
-         };
+         query: (
+           | { namespaceId: string; namespaceName?: never }
+           | { namespaceName: string; namespaceId?: never }
+         ) & {
+           identifier: string;
+         };
        };

This change enforces that exactly one of the two properties is required, aligning with the intended usage.

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (7)
apps/engineering/content/docs/api-design/meta.json (1)

3-3: Consider enhancing the description to be more informative.

The current description "Intuitive and helpful" is quite generic. Consider expanding it to better describe what developers will learn from this section, such as "Learn best practices for designing intuitive and consistent APIs" or "Guidelines for building developer-friendly REST and RPC APIs".

-  "description": "Intuitive and helpful",
+  "description": "Learn best practices for designing intuitive and consistent APIs",
apps/engineering/app/(home)/page.tsx (1)

3-11: Consider improving layout structure and semantics

The current implementation has several areas for improvement:

  1. The negative margin (-mt-16) could cause layout issues
  2. w-screen might cause horizontal scrolling
  3. Multiple spaces in className strings
  4. Could use more semantic HTML structure

Consider applying these improvements:

-    <div className="min-h-screen   border  text-center -mt-16 pt-16 flex items-center w-screen justify-center ">
-      <div>
-        <h1 className="text-7xl md:text-8xl font-bold  leading-none  uppercase tracking-tight">
+    <main className="min-h-screen border text-center pt-16 flex items-center w-full justify-center">
+      <section>
+        <h1 className="text-7xl md:text-8xl font-bold leading-none uppercase tracking-tight">
           BUILD BETTER
           <br /> APIS FASTER
         </h1>
-        <p className="text-xl mt-8 font-light ">How we work</p>
-      </div>
-    </div>
+        <h2 className="text-xl mt-8 font-light">How we work</h2>
+      </section>
+    </main>

Changes explained:

  • Replaced outer div with semantic main tag
  • Replaced inner div with semantic section
  • Changed w-screen to w-full to prevent horizontal scrolling
  • Removed negative margin that could cause layout issues
  • Cleaned up multiple spaces in className strings
  • Changed the subheading p to h2 for better document outline
apps/engineering/content/docs/contributing/index.mdx (5)

11-14: Enhance the Clerk requirement note

The callout about making Clerk optional would be more helpful with additional context about the timeline or alternative authentication methods being considered.

 <Callout>
-  We're currently working on making this optional.
+  We're currently working on making Clerk optional. Alternative authentication methods will be supported in a future release.
 </Callout>

19-21: Add version requirements for prerequisites

Consider specifying minimum version requirements for the tools to prevent compatibility issues.

-[Node](https://nodejs.org)
-[pnpm](https://pnpm.io/installation)
-[Docker](https://www.docker.com/products/docker-desktop)
+[Node](https://nodejs.org) (v18 or higher)
+[pnpm](https://pnpm.io/installation) (v8 or higher)
+[Docker](https://www.docker.com/products/docker-desktop) (required for local databases)

53-55: Enhance Docker requirement warning

The warning could be more helpful by specifying which services will be started.

 <Callout type="warn">
-    Docker must be running for us to spin up the databases and supporting services.
+    Docker must be running for us to spin up required services:
+    - PostgreSQL database
+    - Redis cache
+    - ClickHouse for analytics
 </Callout>

79-81: Enhance command options documentation

The options documentation could be more detailed and include default values.

-  - `--service=<service>`: Specifies which part of the application to develop. The values are `dashboard`, `api`, or `www`
-  - `--skip-env`: Skips the environment setup prompt if specified.
+  - `--service=<service>`: Specifies which part of the application to develop:
+    - `dashboard`: Run the admin dashboard (default)
+    - `api`: Run the API server
+    - `www`: Run the public website
+  - `--skip-env`: Skip the environment setup prompt. Use this when your .env files are already configured.
+  - `--help`: Display all available options and their descriptions
🧰 Tools
🪛 LanguageTool

[uncategorized] ~79-~79: Loose punctuation mark.
Context: ...ilable options: - --service=<service>: Specifies which part of the application...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~81-~81: Loose punctuation mark.
Context: ...hboard, api, or www ---skip-env`: Skips the environment setup prompt if s...

(UNLIKELY_OPENING_PUNCTUATION)


87-87: Add a troubleshooting section

Consider adding a troubleshooting section to help developers resolve common setup issues. This should include:

  • Common Docker-related issues and solutions
  • Environment configuration problems
  • Known platform-specific challenges

Would you like me to help draft a troubleshooting section for the documentation?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 500e8c1 and e62f300.

📒 Files selected for processing (9)
  • apps/engineering/app/(home)/page.tsx (1 hunks)
  • apps/engineering/app/docs/layout.tsx (1 hunks)
  • apps/engineering/app/layout.config.tsx (2 hunks)
  • apps/engineering/content/docs/api-design/meta.json (1 hunks)
  • apps/engineering/content/docs/contributing/index.mdx (1 hunks)
  • apps/engineering/content/docs/contributing/meta.json (1 hunks)
  • apps/engineering/content/docs/contributing/sdk-development.mdx (1 hunks)
  • apps/engineering/content/docs/contributing/testing.mdx (1 hunks)
  • apps/engineering/content/docs/contributing/versions.mdx (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • apps/engineering/content/docs/contributing/meta.json
  • apps/engineering/content/docs/contributing/sdk-development.mdx
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/engineering/app/docs/layout.tsx
🧰 Additional context used
🪛 LanguageTool
apps/engineering/content/docs/contributing/index.mdx

[uncategorized] ~59-~59: Loose punctuation mark.
Context: ...ollowing values: - CLERK_SECRET_KEY: Your Clerk secret key. - `NEXT_PUBLIC...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~60-~60: Loose punctuation mark.
Context: ... key. - NEXT_PUBLIC_CLERK_PUBLIC_KEY: Your Clerk public key. We will cre...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~79-~79: Loose punctuation mark.
Context: ...ilable options: - --service=<service>: Specifies which part of the application...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~81-~81: Loose punctuation mark.
Context: ...hboard, api, or www ---skip-env`: Skips the environment setup prompt if s...

(UNLIKELY_OPENING_PUNCTUATION)

apps/engineering/content/docs/contributing/testing.mdx

[grammar] ~20-~20: Did you mean the adjective “end-to-end” (spelled with hyphens)?
Context: ...gration tests: ### integration Full end to end tests require a running API either loca...

(END_TO_END_HYPHEN)

apps/engineering/content/docs/contributing/versions.mdx

[style] ~13-~13: Consider removing “of” to be more concise
Context: ...ent versioning. ## Releasing packages All of the npm packages are versioned via changese...

(ALL_OF_THE)

🔇 Additional comments (5)
apps/engineering/content/docs/api-design/meta.json (1)

2-2: LGTM! Well-structured metadata properties.

The title is clear, the icon choice is appropriate for API documentation, and the root property correctly indicates this as a top-level documentation page.

Also applies to: 4-5

apps/engineering/app/(home)/page.tsx (1)

1-1: LGTM! Component name follows Next.js conventions

The rename from HomePage to Page aligns with Next.js 13+ app directory naming conventions.

apps/engineering/app/layout.config.tsx (3)

12-12: LGTM: Title change aligns with rebranding

The simplified title "Unkey" aligns with the broader rebranding effort and provides a more focused brand presentation.


1-1: Verify the new fumadocs-ui import path

The import path has been updated to match the new package structure. Let's verify this change is documented and stable.

✅ Verification successful

Import path change is consistently applied

The new import path fumadocs-ui/layouts/home is used consistently across the codebase in both type imports and component imports, with no instances of the old path remaining. The change appears to be properly implemented.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for other usages of the old and new import paths
# to ensure consistent updates across the codebase

echo "Checking for any remaining old import paths..."
rg "fumadocs-ui/home-layout"

echo "Verifying consistent usage of new import path..."
rg "fumadocs-ui/layouts/home"

Length of output: 466


17-23: Verify new documentation paths and consider navigation structure

The new documentation structure looks more organized, but let's verify:

  1. The existence of the new documentation paths
  2. The consistency of using "nested-url" for active state
✅ Verification successful

Based on the search results, I can now generate the final response:

Documentation paths exist and navigation structure is consistent

The new documentation structure is properly organized:

  • Contributing docs exist at apps/engineering/content/docs/contributing/index.mdx
  • Architecture docs exist at apps/engineering/content/docs/architecture/index.mdx
  • The use of "nested-url" for active state is consistent with other navigation items in apps/engineering/app/layout.config.tsx
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence of new documentation paths
# and check for consistent navigation patterns

echo "Checking for contributing documentation..."
fd "contributing.mdx" docs

echo "Checking for architecture documentation..."
fd "architecture.mdx" docs

echo "Verifying consistent usage of active state..."
rg "active: \"nested-url\"" --type tsx

Length of output: 595


Script:

#!/bin/bash
# Let's try a different approach to find documentation files and navigation patterns

# Find all documentation files
echo "Searching for documentation files..."
fd -e md -e mdx

# Check for navigation patterns in TSX files
echo "Checking navigation patterns..."
fd -e tsx | xargs rg "active:"

# Look for documentation-related directories
echo "Looking for documentation directories..."
fd -t d

Length of output: 32738

Comment on lines +57 to +61
You will be prompted to enter the following values:

- `CLERK_SECRET_KEY`: Your Clerk secret key.
- `NEXT_PUBLIC_CLERK_PUBLIC_KEY`: Your Clerk public key.


🛠️ Refactor suggestion

Add instructions for obtaining Clerk keys

The documentation should guide users on how to obtain the required Clerk keys.

 You will be prompted to enter the following values:

-  - `CLERK_SECRET_KEY`: Your Clerk secret key.
-  - `NEXT_PUBLIC_CLERK_PUBLIC_KEY`: Your Clerk public key.
+  - `CLERK_SECRET_KEY`: Your Clerk secret key (obtain from [Clerk Dashboard](https://dashboard.clerk.com) → API Keys)
+  - `NEXT_PUBLIC_CLERK_PUBLIC_KEY`: Your Clerk public key (obtain from [Clerk Dashboard](https://dashboard.clerk.com) → API Keys)
+
+  For detailed instructions on setting up your Clerk account and obtaining these keys, visit our [Authentication Setup Guide](link-to-auth-guide).

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~59-~59: Loose punctuation mark.
Context: ...ollowing values: - CLERK_SECRET_KEY: Your Clerk secret key. - `NEXT_PUBLIC...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~60-~60: Loose punctuation mark.
Context: ... key. - NEXT_PUBLIC_CLERK_PUBLIC_KEY: Your Clerk public key. We will cre...

(UNLIKELY_OPENING_PUNCTUATION)

coderabbitai bot (Contributor) left a comment:

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (5)
apps/chproxy/src/main.ts (5)

3-4: Consider making batch parameters configurable via environment variables

The hardcoded batch parameters might not be optimal for all use cases. Consider making these configurable:

  • MAX_BATCH_SIZE: Different workloads might require different batch sizes
  • MAX_BATCH_TIME: 3 seconds might be too aggressive for low-traffic scenarios
-const MAX_BATCH_SIZE = 10000;
-const MAX_BATCH_TIME = 3000; // milliseconds
+const MAX_BATCH_SIZE = Number(process.env.MAX_BATCH_SIZE ?? "10000");
+const MAX_BATCH_TIME = Number(process.env.MAX_BATCH_TIME ?? "3000"); // milliseconds

6-10: Add JSDoc comments to document the Batch type

Adding documentation will improve code maintainability and help other developers understand the purpose of each field.

+/**
+ * Represents a batch of rows to be inserted into ClickHouse
+ * @property createdAt - Timestamp when the batch was created
+ * @property rows - Array of strings, each representing a row to insert
+ * @property params - URL parameters for the ClickHouse query
+ */
 type Batch = {
   createdAt: number;
   rows: string[];
   params: URLSearchParams;
 };

23-24: Improve documentation for BASIC_AUTH format

The comment could be more explicit about the expected format and encoding process:

-    // user:pass
-    // We receive a base64 of this in the authorization header ie: `Basic ZGVmYXVsdDo=`
+    // BASIC_AUTH should be in format "username:password"
+    // This will be base64 encoded and used in the Authorization header
+    // Example: If BASIC_AUTH="user:pass", the header will be "Basic dXNlcjpwYXNz"

70-70: Make flush interval configurable

The flush interval should be configurable and aligned with MAX_BATCH_TIME.

-setInterval(flush, 1000);
+const FLUSH_INTERVAL = Number(process.env.FLUSH_INTERVAL ?? "1000");
+setInterval(flush, FLUSH_INTERVAL);

121-127: Enhance signal handling for graceful shutdown

Consider handling additional signals and adding a shutdown timeout:

-process.on("SIGTERM", async (s) => {
+async function shutdown(signal: string) {
-  console.warn("Received signal", s);
+  console.warn("Received signal", signal);
+  console.info("Starting graceful shutdown...");
+
+  const timeoutPromise = new Promise((_, reject) =>
+    setTimeout(() => reject(new Error("Shutdown timeout")), 10000)
+  );
 
   server.stop();
-  await flush(true);
+  try {
+    await Promise.race([flush(true), timeoutPromise]);
+    console.info("Graceful shutdown completed");
+  } catch (error) {
+    console.error("Error during shutdown:", error);
+  }
   process.exit(0);
-});
+}
+
+process.on("SIGTERM", () => shutdown("SIGTERM"));
+process.on("SIGINT", () => shutdown("SIGINT"));
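The `Promise.race` timeout guard used in the sketch above can be exercised in isolation; durations are shortened and names are illustrative, not the proxy's actual code.

```typescript
// Race an in-flight flush against a shutdown deadline, as in the suggested
// shutdown handler. Durations are shortened for illustration.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  const deadline = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error("Shutdown timeout")), ms),
  );
  return Promise.race([work, deadline]);
}

const fastFlush = new Promise<string>((resolve) =>
  setTimeout(() => resolve("flushed"), 10),
);

// The flush wins because it settles before the deadline fires.
withTimeout(fastFlush, 1000).then((result) => console.log(result)); // -> flushed
```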
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e62f300 and 6a26302.

📒 Files selected for processing (1)
  • apps/chproxy/src/main.ts (1 hunks)

Comment on lines +85 to +88
const query = url.searchParams.get("query");
if (!query || !query.toLowerCase().startsWith("insert into")) {
return new Response("wrong query", { status: 400 });
}

🛠️ Refactor suggestion

Enhance query validation

The current query validation is basic and case-sensitive. Consider using a more robust approach:

-    if (!query || !query.toLowerCase().startsWith("insert into")) {
-      return new Response("wrong query", { status: 400 });
+    if (!query) {
+      return new Response("missing query parameter", { status: 400 });
+    }
+    const normalizedQuery = query.toLowerCase().trim();
+    if (!normalizedQuery.startsWith("insert into")) {
+      return new Response("only INSERT queries are allowed", { status: 400 });
+    }
+    // Optional: Add more validation like table name whitelist
+    const allowedTables = ["events", "metrics"]; // configure via env
+    const tableMatch = normalizedQuery.match(/insert into\s+([^\s(]+)/i);
+    if (!tableMatch || !allowedTables.includes(tableMatch[1])) {
+      return new Response("invalid table name", { status: 400 });
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const query = url.searchParams.get("query");
if (!query || !query.toLowerCase().startsWith("insert into")) {
return new Response("wrong query", { status: 400 });
}
const query = url.searchParams.get("query");
if (!query) {
return new Response("missing query parameter", { status: 400 });
}
const normalizedQuery = query.toLowerCase().trim();
if (!normalizedQuery.startsWith("insert into")) {
return new Response("only INSERT queries are allowed", { status: 400 });
}
// Optional: Add more validation like table name whitelist
const allowedTables = ["events", "metrics"]; // configure via env
const tableMatch = normalizedQuery.match(/insert into\s+([^\s(]+)/i);
if (!tableMatch || !allowedTables.includes(tableMatch[1])) {
return new Response("invalid table name", { status: 400 });
}
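The table-name extraction in the suggestion can be exercised in isolation. A sketch under the suggestion's own assumptions — the `events`/`metrics` whitelist is a placeholder, and `validateInsertTarget` is an illustrative helper, not part of the codebase:

```typescript
// Extract the target table from an INSERT query and check it against a
// whitelist, mirroring the validation suggested above. Returns the table
// name on success, or null when the query should be rejected.
function validateInsertTarget(query: string, allowedTables: string[]): string | null {
  const normalizedQuery = query.toLowerCase().trim();
  if (!normalizedQuery.startsWith("insert into")) {
    return null;
  }
  const tableMatch = normalizedQuery.match(/insert into\s+([^\s(]+)/);
  if (!tableMatch || !allowedTables.includes(tableMatch[1])) {
    return null;
  }
  return tableMatch[1];
}

const allowed = ["events", "metrics"]; // placeholder whitelist
console.log(validateInsertTarget("INSERT INTO events (a, b) FORMAT JSONEachRow", allowed)); // events
console.log(validateInsertTarget("SELECT * FROM events", allowed)); // null
```

Note that `[^\s(]+` stops at the opening parenthesis, so `insert into events(a, b)` also extracts `events`.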

Comment on lines +98 to +110
const existing = buffer.get(key);
if (existing) {
const size = existing.rows.push(...rows);
if (size >= MAX_BATCH_SIZE) {
await persist(key);
}
} else {
buffer.set(key, {
createdAt: Date.now(),
params,
rows,
});
}

🛠️ Refactor suggestion

Monitor and limit total buffer size

The buffer size should be monitored and limited to prevent memory issues:

+    const currentBufferSize = Array.from(buffer.values()).reduce((sum, batch) => sum + batch.rows.length, 0);
+    const MAX_TOTAL_ROWS = Number(process.env.MAX_TOTAL_ROWS ?? "1000000");
+    
+    if (currentBufferSize + rows.length > MAX_TOTAL_ROWS) {
+      return new Response("buffer full", { status: 503 });
+    }
+
     const existing = buffer.get(key);
     if (existing) {
       const size = existing.rows.push(...rows);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const existing = buffer.get(key);
if (existing) {
const size = existing.rows.push(...rows);
if (size >= MAX_BATCH_SIZE) {
await persist(key);
}
} else {
buffer.set(key, {
createdAt: Date.now(),
params,
rows,
});
}
const currentBufferSize = Array.from(buffer.values()).reduce((sum, batch) => sum + batch.rows.length, 0);
const MAX_TOTAL_ROWS = Number(process.env.MAX_TOTAL_ROWS ?? "1000000");
if (currentBufferSize + rows.length > MAX_TOTAL_ROWS) {
return new Response("buffer full", { status: 503 });
}
const existing = buffer.get(key);
if (existing) {
const size = existing.rows.push(...rows);
if (size >= MAX_BATCH_SIZE) {
await persist(key);
}
} else {
buffer.set(key, {
createdAt: Date.now(),
params,
rows,
});
}
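One cost worth noting in the suggestion above: the `reduce` over `buffer.values()` rescans every batch on every request. A running counter keeps the check O(1); this is a sketch with illustrative names (`tryBuffer`, `drain`), not the actual proxy code:

```typescript
// Keep a running row count in sync with the buffer instead of recomputing
// it on every insert.
type Batch = { createdAt: number; rows: string[] };

const buffer = new Map<string, Batch>();
let totalRows = 0; // invariant: sum of rows.length across all batches
const MAX_TOTAL_ROWS = 1_000_000;

function tryBuffer(key: string, rows: string[]): boolean {
  if (totalRows + rows.length > MAX_TOTAL_ROWS) {
    return false; // caller should respond 503 "buffer full"
  }
  const existing = buffer.get(key);
  if (existing) {
    existing.rows.push(...rows);
  } else {
    buffer.set(key, { createdAt: Date.now(), rows });
  }
  totalRows += rows.length;
  return true;
}

function drain(key: string): string[] {
  const batch = buffer.get(key);
  if (!batch) return [];
  buffer.delete(key);
  totalRows -= batch.rows.length;
  return batch.rows;
}
```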

Comment on lines +54 to +61
const res = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "text/plain",
Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
},
body: batch.rows.join("\n"),
});

🛠️ Refactor suggestion

Enhance error handling in fetch request

The fetch request should handle different types of errors (network, timeout, etc.) and provide more detailed error information.

-  const res = await fetch(url, {
-    method: "POST",
-    headers: {
-      "Content-Type": "text/plain",
-      Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
-    },
-    body: batch.rows.join("\n"),
-  });
+  try {
+    const res = await fetch(url, {
+      method: "POST",
+      headers: {
+        "Content-Type": "text/plain",
+        Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
+      },
+      body: batch.rows.join("\n"),
+      // Add timeout to prevent hanging requests
+      signal: AbortSignal.timeout(5000),
+    });
+    if (res.ok) {
+      buffer.delete(key);
+      console.info(`persisted ${batch.rows.length} rows`);
+    } else {
+      console.error(
+        `Failed to persist: ${res.status} ${res.statusText}`,
+        await res.text(),
+        JSON.stringify(batch)
+      );
+    }
+  } catch (error) {
+    if (error instanceof Error && error.name === 'TimeoutError') {
+      console.error('Request timed out while persisting batch');
+    } else {
+      console.error('Network error while persisting batch:', error);
+    }
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const res = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "text/plain",
Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
},
body: batch.rows.join("\n"),
});
try {
const res = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "text/plain",
Authorization: `Basic ${btoa([url.username, url.password].join(":"))}`,
},
body: batch.rows.join("\n"),
// Add timeout to prevent hanging requests
signal: AbortSignal.timeout(5000),
});
if (res.ok) {
buffer.delete(key);
console.info(`persisted ${batch.rows.length} rows`);
} else {
console.error(
`Failed to persist: ${res.status} ${res.statusText}`,
await res.text(),
JSON.stringify(batch)
);
}
} catch (error) {
if (error instanceof Error && error.name === 'TimeoutError') {
console.error('Request timed out while persisting batch');
} else {
console.error('Network error while persisting batch:', error);
}
}
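`AbortSignal.timeout()` (Node 17.3+) is the simplest way to bound the fetch, as the suggestion shows. Where it is unavailable, the same `TimeoutError` shape can be produced with a promise race — a sketch, with `withTimeout` as an illustrative helper rather than anything from the codebase:

```typescript
// Reject a promise with a TimeoutError after `ms` milliseconds, mirroring
// the error.name === "TimeoutError" check in the catch branch above.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(Object.assign(new Error(`timed out after ${ms}ms`), { name: "TimeoutError" })),
      ms,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}
```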

@chronark chronark merged commit 2b6fa3c into main Nov 21, 2024
27 checks passed
@chronark chronark deleted the clickhouse-buffer branch November 21, 2024 21:38