Add timeout to channel builder #710

Open
wants to merge 3 commits into
base: development

Conversation

dmitrii-ubskii
Member

@dmitrii-ubskii dmitrii-ubskii commented Nov 8, 2024

Usage and product changes

We introduce a timeout in the gRPC channel so that the driver correctly fails when the server is available but unresponsive.

Resolves #672.
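
As a rough illustration of the behaviour described above (fail when the server is reachable but never answers), here is a minimal standard-library sketch. It is not the driver's implementation: `call_with_timeout` and `CallError` are hypothetical names, and the closure stands in for a blocking RPC. In the driver the limit would presumably be set on the gRPC channel itself rather than by a worker thread.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical error type: why a call produced no response.
#[derive(Debug, PartialEq)]
enum CallError {
    /// The deadline passed before a response arrived.
    TimedOut,
    /// The worker ended (for example by panicking) without responding.
    ServerGone,
}

/// Runs `request` on a worker thread and waits at most `timeout` for its result.
/// `request` is a stand-in for a blocking RPC, not real driver code.
fn call_with_timeout<T, F>(timeout: Duration, request: F) -> Result<T, CallError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the caller has already given up; ignore that.
        let _ = tx.send(request());
    });
    rx.recv_timeout(timeout).map_err(|err| match err {
        mpsc::RecvTimeoutError::Timeout => CallError::TimedOut,
        mpsc::RecvTimeoutError::Disconnected => CallError::ServerGone,
    })
}
```

Calling `call_with_timeout(Duration::from_secs(60), || some_blocking_call())` would return `Err(CallError::TimedOut)` if `some_blocking_call` (a placeholder) took longer than a minute, instead of blocking forever.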

Implementation

@typedb-bot
Member

PR Review Checklist

Do not edit the content of this comment. The PR reviewer should simply update this comment by ticking each review item below, as they get completed.


Trivial Change

  • This change is trivial and does not require a code or architecture review.

Code

  • Packages, classes, and methods have a single domain of responsibility.
  • Packages, classes, and methods are grouped into a cohesive and consistent domain model.
  • The code is canonical and the minimum required to achieve the goal.
  • Modules, libraries, and APIs are easy to use, robust (foolproof and not error-prone), and tested.
  • Logic and naming have a clear narrative that communicates the accurate intent and responsibility of each module (e.g. method, class, etc.).
  • The code is algorithmically efficient and scalable for the whole application.

Architecture

  • Any required refactoring is completed, and the architecture does not introduce technical debt incidentally.
  • Any required build and release automations are updated and/or implemented.
  • Any new components follow a consistent style with respect to the pre-existing codebase.
  • The architecture intuitively reflects the application domain, and is easy to understand.
  • The architecture has a well-defined hierarchy of encapsulated components.
  • The architecture is extensible and scalable.

@dmitrii-ubskii dmitrii-ubskii marked this pull request as ready for review November 8, 2024 15:33
@@ -49,8 +52,13 @@ impl GRPCChannel for PlainTextChannel {}

impl GRPCChannel for CallCredChannel {}

const TIMEOUT: Duration = Duration::from_secs(60);
Member Author

Chosen arbitrarily. Anything less than a second or so is too strict for slow connections, but going significantly over a minute would probably be too long.

Member

Let's verify that this timeout doesn't drop query requests with complicated, long-running queries. For example, write a simple read query, put a sleep on the server side for 2 minutes, and let it run. This should not fail. But if you put your server sleep on connection, it should fail after 1 minute.

If this works, everything should be fine.
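
The expectation in this comment (a stalled connection fails after one minute, a stalled query does not) implies separate limits for the two phases. The sketch below models that idea only; `Phase` and `TimeoutPolicy` are hypothetical names and none of this is taken from the driver. A single timeout applied to every request would correspond to `query: Some(...)`.

```rust
use std::time::Duration;

/// The two phases the comment distinguishes.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Phase {
    Connect,
    Query,
}

/// Hypothetical policy: a hard limit on connecting and an optional
/// limit on waiting for a query response (`None` means wait indefinitely).
#[derive(Clone, Copy, Debug)]
struct TimeoutPolicy {
    connect: Duration,
    query: Option<Duration>,
}

impl TimeoutPolicy {
    /// Returns true if a server that stalls for `stall` during `phase`
    /// would make the client give up under this policy.
    fn gives_up(&self, phase: Phase, stall: Duration) -> bool {
        match phase {
            Phase::Connect => stall > self.connect,
            Phase::Query => self.query.map_or(false, |limit| stall > limit),
        }
    }
}
```

With `connect` set to 60 seconds and `query` set to `None`, a two-minute stall fails during `Phase::Connect` but not during `Phase::Query`, which is the outcome the proposed test looks for.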

Member Author

The query does time out if the server does not respond within this time (tested by adding a sleep(60000) in TransactionService::respond() on the server side).

The default transaction timeout is 5 minutes. I'd extend this timeout to 10 minutes or even an hour.

Ideally we'd have a connection builder that would let the user specify the desired request timeout, but that would take a lot longer.
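
For reference, a connection builder of the kind mentioned above might look roughly like this. It is a sketch under assumptions: `ConnectionBuilder`, `ConnectionSettings`, and `request_timeout` are hypothetical names, not existing driver API, and only the 60-second default comes from the diff.

```rust
use std::time::Duration;

/// Default taken from the `TIMEOUT` constant in this PR's diff.
const DEFAULT_REQUEST_TIMEOUT: Duration = Duration::from_secs(60);

/// Hypothetical result of building: what a connection would be opened with.
#[derive(Debug, Clone, PartialEq)]
struct ConnectionSettings {
    address: String,
    request_timeout: Duration,
}

/// Hypothetical builder that lets the caller override the request timeout.
#[derive(Debug, Clone)]
struct ConnectionBuilder {
    address: String,
    request_timeout: Duration,
}

impl ConnectionBuilder {
    /// Starts from the default timeout so existing callers keep today's behaviour.
    fn new(address: impl Into<String>) -> Self {
        Self { address: address.into(), request_timeout: DEFAULT_REQUEST_TIMEOUT }
    }

    /// Overrides the per-request timeout, e.g. for long-running queries.
    fn request_timeout(mut self, timeout: Duration) -> Self {
        self.request_timeout = timeout;
        self
    }

    /// Finalises the settings; a real builder would open the channel here.
    fn build(self) -> ConnectionSettings {
        ConnectionSettings { address: self.address, request_timeout: self.request_timeout }
    }
}
```

A caller with long-running queries could then write `ConnectionBuilder::new("localhost:1729").request_timeout(Duration::from_secs(2 * 60 * 60)).build()`, where the address is only a placeholder.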

Member

I'm afraid this update will simply break some analytics query use cases. I'd suggest giving the query at least 2 hours to execute. And, ideally, highlight this behavior somewhere in the docs.

Later, this timeout should ideally be configurable. Let's create a proper task for it if we're going to close the original bug.

Member

@farost farost left a comment

Let's test queries

Member

@farost farost left a comment

Waiting for the timeout value update

3 participants