Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query manager #72

Merged
merged 15 commits into from
Jul 17, 2024
Merged

Query manager #72

merged 15 commits into from
Jul 17, 2024

Conversation

eriktaubeneck
Copy link
Member

@eriktaubeneck eriktaubeneck commented Jul 13, 2024

This refactors a bunch of the implicit query management into a QueryManager class, which can look up queries and run them. With this, we now also reject queries if there is more than 1 running (hard coded right now, but built as a variable to be extended later if we so choose.)

It also adds a bunch more tests! We're now up to ~60% of the sidecar with test coverage.

Summary by CodeRabbit

  • New Features

    • Introduced new endpoints for capacity availability and running queries.
    • Added capacity checks before starting processes.
    • Improved error handling for role validation.
  • Refactor

    • Enhanced logging and status handling to improve query management.
    • Updated endpoints to utilize FastAPI's status module and handle requests more effectively.
  • Bug Fixes

    • Implemented exception handling for missing queries to prevent crashes.
  • Chores

    • Added helper functions to manage queries and capacity checks within the FastAPI framework.

@eriktaubeneck eriktaubeneck requested a review from akoshelev July 13, 2024 06:13
Copy link

vercel bot commented Jul 13, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
draft ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 17, 2024 8:20pm

Copy link
Contributor

coderabbitai bot commented Jul 13, 2024

Walkthrough

The updates enhance the sidecar app by adding new endpoints, refactoring query management, improving error handling, and organizing settings. They introduce capacity checks and improve logging mechanisms, ensuring better resource management and more robust application behavior.

Changes

Files/Modules Change Summary
sidecar/app/routes/start.py Added new endpoints, modified existing ones, introduced IncorrectRoleError, used status module, called get_settings, added request parameter, implemented capacity checks.
sidecar/app/routes/stop.py Imported HTTPException and Request, refactored query retrieval and logging, added request parameter to functions.
sidecar/app/routes/websockets.py Imported HTTPException, refactored query retrieval and logging, added exception handling, improved logging messages.
sidecar/app/query/base.py Added functions for file paths, modified Query class, introduced QueryManager class, refactored logging and initialization, removed get_from_query_id method.
sidecar/app/settings.py Imported loguru and logger, modified Settings class for logging configurations, replaced global settings with memoized get_settings function.
sidecar/app/routes/http_helpers.py (new file) Added functions get_query_from_query_id and check_capacity to enhance query management and capacity checks.

Poem

In the land of code where rabbits hop,
New endpoints spring with every stop.
Queries managed, logs in place,
Capacity checked with saving grace.
Settings now are clear and bright,
A coder’s dream in FastAPI light. 🚀


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
sidecar/app/main.py (1)

4-4: Consider adding a TODO comment for future adjustments.

The QueryManager is initialized with a hardcoded limit of 1 parallel query. Since the PR summary mentions that this is designed to be adjustable in the future, adding a TODO comment can help track this task.

# TODO: Make the max_parallel_queries value configurable in the future.
app.state.QUERY_MANAGER = QueryManager(max_parallel_queries=1)
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 771b88e and 5f28cca.

Files selected for processing (15)
  • .pre-commit-config.yaml (1 hunks)
  • server/app/alert.tsx (2 hunks)
  • server/app/query/create/page.tsx (4 hunks)
  • server/app/query/layout.tsx (1 hunks)
  • sidecar/app/logger.py (1 hunks)
  • sidecar/app/main.py (1 hunks)
  • sidecar/app/query/base.py (5 hunks)
  • sidecar/app/query/ipa.py (4 hunks)
  • sidecar/app/query/status.py (2 hunks)
  • sidecar/app/routes/start.py (6 hunks)
  • sidecar/app/routes/stop.py (3 hunks)
  • sidecar/app/routes/websockets.py (3 hunks)
  • sidecar/app/settings.py (2 hunks)
  • sidecar/tests/app/query/test_base.py (1 hunks)
  • sidecar/tests/app/routes/test_start.py (1 hunks)
Files skipped from review due to trivial changes (2)
  • .pre-commit-config.yaml
  • server/app/query/layout.tsx
Additional context used
Ruff
sidecar/tests/app/routes/test_start.py

57-60: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)


78-79: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)


96-99: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)


117-118: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)

Additional comments not posted (54)
sidecar/app/main.py (1)

8-8: LGTM!

The inclusion of routers and middleware is correct.

sidecar/app/logger.py (2)

6-6: LGTM!

The import of get_settings aligns with the refactoring and is correct.


9-26: LGTM!

The get_logger function is well-structured and correctly configures the logger.

sidecar/app/settings.py (2)

34-41: LGTM!

The properties for path calculations are correctly defined and improve the readability and maintainability of the code.


43-48: LGTM!

The get_settings function is well-structured and correctly memoizes the settings.

sidecar/app/routes/stop.py (3)

1-1: LGTM!

The import of QueryManager aligns with the refactoring and is correct.


17-26: LGTM!

The finish endpoint is well-structured and correctly uses query_manager.


35-42: LGTM!

The kill endpoint is well-structured and correctly uses query_manager.

server/app/alert.tsx (2)

Line range hint 7-28:
LGTM! Named export is appropriate.

The change to a named export for QueryStartedAlert is appropriate and does not affect the functionality.


30-44: LGTM! New alert function is well-implemented.

The QueryFailedToStartAlert function is well-implemented and correctly displays an alert when a query fails to start.

sidecar/app/query/status.py (1)

54-56: LGTM! Optional timestamp parameter improves flexibility.

The change to make the timestamp parameter optional in the add method is straightforward and improves the method's flexibility.

sidecar/app/routes/websockets.py (3)

30-46: LGTM! Using query_manager for query retrieval and logging is appropriate.

The changes to use query_manager for query retrieval and logging in the status_websocket handler are appropriate and correct.


Line range hint 50-71:
LGTM! Using query_manager for query retrieval and logging is appropriate.

The changes to use query_manager for query retrieval and logging in the logs_websocket handler are appropriate and correct.


78-91: LGTM! Using query_manager for query retrieval and logging is appropriate.

The changes to use query_manager for query retrieval and logging in the stats_websocket handler are appropriate and correct.

sidecar/tests/app/query/test_base.py (9)

25-31: LGTM! Test case is correct.

The test_query_files test case is straightforward and correctly verifies the existence of status and log files for a Query instance.


33-40: LGTM! Test case is correct.

The test_query_started test case is straightforward and correctly verifies the started property for different statuses of a Query instance.


43-50: LGTM! Test case is correct.

The test_query_finished test case is straightforward and correctly verifies the finished property for different statuses of a Query instance.


53-61: LGTM! Test case is correct.

The test_query_status_event_json test case is straightforward and correctly verifies the status_event_json property for a Query instance. The use of mock_time ensures consistent timestamps.


64-71: LGTM! Test case is correct.

The test_query_running test case is straightforward and correctly verifies the running property for different statuses of a Query instance.


74-79: LGTM! Test case is correct.

The test_query_manager test case is straightforward and correctly verifies the get_from_query_id method of QueryManager.


82-89: LGTM! Test case is correct.

The test_query_manager_capacity_available test case is straightforward and correctly verifies the capacity_available property of QueryManager.


92-105: LGTM! Test case is correct.

The test_query_manager_run_query test case is straightforward and correctly verifies the run_query method of QueryManager. The use of mock_start ensures the method is called correctly.


108-121: LGTM! Test case is correct.

The test_query_manager_run_query_at_capacity test case is straightforward and correctly verifies the run_query method of QueryManager when at capacity. The use of pytest.raises ensures the method raises the correct exception.

sidecar/app/query/base.py (7)

21-23: LGTM!

The status_file_path function correctly retrieves the status file path for a given query ID.


26-28: LGTM!

The log_file_path function correctly retrieves the log file path for a given query ID.


33-53: LGTM!

The changes to the Query class improve initialization and logging. The use of get_settings and get_logger is consistent with the new design.


165-166: LGTM!

The MaxQueriesRunningError class is correctly defined to handle the case when the maximum number of running queries is exceeded.


170-196: LGTM!

The QueryManager class is well-structured and effectively manages running queries, ensuring that the number of concurrent queries does not exceed the specified limit.


183-192: LGTM!

The run_query method correctly handles the execution and removal of queries, ensuring capacity constraints are respected.


195-196: LGTM!

The capacity_available property correctly checks if the number of running queries is below the maximum allowed.

sidecar/tests/app/routes/test_start.py (7)

16-23: LGTM!

The mock_role fixture is correctly implemented and will be useful for testing role-based functionality.


26-33: LGTM!

The running_query fixture is correctly implemented and will be useful for testing scenarios with running queries.


36-39: LGTM!

The test_capacity_available function is correctly implemented and verifies the capacity availability.


42-46: LGTM!

The test_not_capacity_available function is correctly implemented and verifies the capacity availability when a query is running.


49-52: LGTM!

The test_running_queries function is correctly implemented and verifies the retrieval of running queries.


55-73: LGTM!

The test_start_ipa_helper function is correctly implemented and verifies the starting of an IPA helper query.

Tools
Ruff

57-60: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)


94-112: LGTM!

The test_start_ipa_query function is correctly implemented and verifies the starting of an IPA coordinator query.

Tools
Ruff

96-99: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)

sidecar/app/routes/start.py (7)

27-32: LGTM!

The capacity_available endpoint is correctly implemented and uses the QueryManager to check capacity.


35-40: LGTM!

The running_queries endpoint is correctly implemented and uses the QueryManager to retrieve running queries.


43-59: LGTM!

The demo_logger endpoint is correctly implemented and uses the QueryManager to run the query.


Line range hint 62-106:
LGTM!

The start_ipa_helper endpoint is correctly implemented and uses the QueryManager to run the query. The role check and capacity check are correctly implemented.


113-118: LGTM!

The get_ipa_helper_status endpoint is correctly implemented and uses the QueryManager to retrieve the query status.


125-132: LGTM!

The get_ipa_helper_log_file endpoint is correctly implemented and uses the QueryManager to retrieve the query log file. The log file streaming is correctly implemented.


Line range hint 159-198:
LGTM!

The start_ipa_query endpoint is correctly implemented and uses the QueryManager to run the query. The role check and capacity check are correctly implemented.

sidecar/app/query/ipa.py (6)

Line range hint 15-33:
LGTM!

The changes to the IPAQuery class improve initialization and handling of kill signals. The use of get_settings is consistent with the new design.


Line range hint 41-58:
LGTM!

The IPACloneStep class is correctly implemented and includes methods for building the command and handling pre-run conditions.


Line range hint 61-72:
LGTM!

The IPAUpdateRemoteOriginStep class is correctly implemented and includes methods for building the command.


Line range hint 151-174:
LGTM!

The IPACoordinatorGenerateTestDataStep class is correctly implemented and includes methods for building the command and handling pre-run conditions.


Line range hint 244-266:
LGTM!

The IPACoordinatorWaitForHelpersStep class is correctly implemented and includes methods for running the step and handling termination and kill signals.


Line range hint 269-292:
LGTM!

The IPACoordinatorStartStep class is correctly implemented and includes methods for building the command.

server/app/query/create/page.tsx (4)

12-12: Import changes look good.

The import statement has been correctly updated to include QueryFailedToStartAlert and change QueryStartedAlert to a named export.


26-28: State variable addition looks good.

The new state variable querySubmitSuccess is correctly initialized to handle form submission status.


49-56: Form submission logic changes look good.

The form submission logic correctly sets querySubmitSuccess based on the response status.


78-83: Alert rendering logic looks good.

The alert rendering logic correctly displays QueryStartedAlert or QueryFailedToStartAlert based on the value of querySubmitSuccess.

@eriktaubeneck eriktaubeneck requested a review from cberkhoff July 16, 2024 20:20
server/app/alert.tsx Outdated Show resolved Hide resolved
)
def get_logger():
settings = get_settings()
logger.remove()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how expensive is to reconfigure the logger each time this function is called? is it a mistake to call it more than once and is it also ok to call it from more than one thread?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unclear, and easy enough to fix. I can put the logger on settings, so it's only called once. commit coming.

sidecar/app/query/base.py Show resolved Hide resolved
sidecar/app/query/base.py Outdated Show resolved Hide resolved
sidecar/app/query/base.py Show resolved Hide resolved
):
query_manager = request.app.state.QUERY_MANAGER
if not query_manager.capacity_available:
raise HTTPException(status_code=503, detail="Capacity unavailable")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be nicer if we can return something like Not enough capacity to run another query: {m}/{n} queries are already running.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return it, or have the detail in the 503 say that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea in the detail I think - I assumed that detail is returned in HTTP body, so we can read it on the client side

):
# pylint: disable=too-many-arguments
query_manager = request.app.state.QUERY_MANAGER
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this maybe get too repetitive and prone for errors if we need to manually check for capacity before trying to do something with query_manager.

it maybe better to add another abstraction like QueryRunner that QueryManager methods return to run a query. Methods will throw if there is no enough capacity (or return optional). At HTTP layer we could catch the exception and wrap it into a 503 error.

    query_manager = QUERY_MANAGER
    query_runner = query_manager.get_runner # this throws if there is no enough capacity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's only twice, and it's because the report collector has a different path than the helpers. if anything, we may just want to put those in the same route.

for now, I'm not too worried about a 2x duplication, but if it we're more than that, I'd agree we should handle it elsewhere.

role = settings.role
if not role or role == role.COORDINATOR:
raise Exception("Cannot start helper without helper role.")
raise IncorrectRoleError("Cannot start helper without helper role.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this message could be made less ambiguous by showing the value of role variable on this helper.

role = settings.role
if role != role.COORDINATOR:
raise Exception(f"Sidecar {role}: Cannot start query without coordinator role.")
raise IncorrectRoleError(
f"Sidecar {role}: Cannot start query without coordinator role."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, we should clearly state what went wrong - starting a query on H1 instead of Coordinator, or something like that

if query is None:
return {"message": "Query not found", "query_id": query_id}
logger.info(f"{query=}")
raise HTTPException(status_code=404, detail="Query not found")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these null checks and rethrowing 404 are really repetitive. I think we should wrap query manager into a class in the HTTP layer that will do it in one place, so every handler does not need to do it on its own

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated it. I just used a wrapper function - we can wrap query manager if there end up being more things like this.

Co-authored-by: Alex Koshelev <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 5f28cca and 94a7d2e.

Files selected for processing (1)
  • server/app/alert.tsx (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • server/app/alert.tsx

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 94a7d2e and 875e519.

Files selected for processing (1)
  • sidecar/app/query/base.py (5 hunks)
Files skipped from review as they are similar to previous changes (1)
  • sidecar/app/query/base.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 875e519 and 73c9ad1.

Files selected for processing (2)
  • sidecar/app/query/base.py (5 hunks)
  • sidecar/app/settings.py (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • sidecar/app/query/base.py
  • sidecar/app/settings.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Outside diff range, codebase verification and nitpick comments (2)
sidecar/app/routes/stop.py (1)

22-24: Ensure consistent logging levels.

The logging level used (info) before checking the query status and after might not be consistent with the severity of the operations being logged. Consider using different logging levels based on the operation's outcome or severity.

- query.logger.info(f"{query=}")
+ query.logger.debug(f"{query=}")  # Use debug for less critical information
- query.logger.info("calling query finish")
+ query.logger.warning("calling query finish")  # Use warning if the operation is significant
sidecar/app/routes/websockets.py (1)

31-47: Well-implemented WebSocket integration with query management.

The integration of QueryManager into the WebSocket for status updates is well done. The use of an asynchronous context manager to handle WebSocket connections is a good practice. However, consider adding more detailed error handling for the WebSocket operations.

Consider implementing more granular error handling for WebSocket operations to cover scenarios like network failures or data format issues.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 73c9ad1 and 2fd072a.

Files selected for processing (4)
  • sidecar/app/routes/http_helpers.py (1 hunks)
  • sidecar/app/routes/start.py (6 hunks)
  • sidecar/app/routes/stop.py (3 hunks)
  • sidecar/app/routes/websockets.py (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • sidecar/app/routes/start.py
Additional comments not posted (1)
sidecar/app/routes/stop.py (1)

18-24: Good use of centralized query retrieval.

The use of get_query_from_query_id centralizes the retrieval of queries, which is a good practice as it reduces code duplication and centralizes error handling. This change aligns with the previous feedback provided by akoshelev.

sidecar/app/routes/http_helpers.py Show resolved Hide resolved
Comment on lines +51 to +64
async def logs_websocket(
websocket: WebSocket,
query_id: str,
):
query = get_query_from_query_id(websocket.app.state.QUERY_MANAGER, Query, query_id)

async with use_websocket(websocket) as websocket:
if query is None:
logger.warning(f"{query_id=} does not exist.")
return

with open(query.log_file_path, "r", encoding="utf8") as log_file:
if query.finished:
logger.info(f"{query_id=} complete. sending all logs.")
query.logger.info(f"{query_id=} complete. sending all logs.")
for line in log_file:
await websocket.send_text(line)
else:
logger.info(f"{query_id=} running. tailing log file.")
query.logger.info(f"{query_id=} running. tailing log file.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor suggestion: Optimize log file handling in WebSocket.

The current implementation reads the log file line by line in a loop. This could be optimized by using asynchronous file reading to prevent blocking the event loop.

import aiofiles

async with aiofiles.open(query.log_file_path, mode='r', encoding='utf8') as log_file:
    async for line in log_file:
        await websocket.send_text(line)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 2fd072a and ba349b9.

Files selected for processing (1)
  • sidecar/app/query/base.py (5 hunks)
Files skipped from review as they are similar to previous changes (1)
  • sidecar/app/query/base.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (1)
sidecar/app/settings.py (1)

14-16: Consider removing TYPE_CHECKING block if not used.

The TYPE_CHECKING import is used to conditionally import the Logger class. However, if this import is not utilized elsewhere in a way that affects typing checks, consider removing it to clean up the code.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between ba349b9 and 51aad4c.

Files selected for processing (1)
  • sidecar/app/settings.py (2 hunks)
Additional comments not posted (6)
sidecar/app/settings.py (6)

1-8: Review of imports and initial setup.

The imports are correctly organized and relevant to the functionality of the settings module. The use of __future__ annotations ensures forward compatibility with future Python versions.


Line range hint 18-29: Function gen_path and class Settings initialization.

The gen_path function is simple and correctly implemented. The Settings class initialization includes type annotations and validators which are correctly applied. The use of _logger as a private attribute is appropriate to prevent it from being treated as a model field by Pydantic.


50-52: Property logger correctly implemented.

The logger property provides encapsulated access to the _logger instance, adhering to best practices in object-oriented programming.


55-59: Properties helper and helpers.

Both properties provide access to helper instances based on roles. The implementation is straightforward and uses the dictionary access pattern correctly.


71-76: Memoized get_settings function.

The use of lru_cache to memoize the get_settings function is a good practice to ensure that the settings are instantiated only once. The function also ensures that necessary directories are created if they do not exist, which is a robust approach to handle file system dependencies.


62-68: Properties status_dir_path and log_dir_path.

These properties dynamically generate paths based on the root_path. This is a flexible design allowing easy modification of directory paths. However, ensure that root_path is always a valid directory path to prevent runtime errors.

Verification successful

Properties status_dir_path and log_dir_path.

The root_path is assigned and used in various parts of the code, including sidecar/cli/cli.py, sidecar/app/settings.py, and sidecar/app/routes/start.py. The assignments and usages indicate that root_path is often set to default values like Path("."), and there are attempts to ensure it is a valid directory using root_path.mkdir(parents=True, exist_ok=True). This ensures that root_path is always a valid directory path, preventing runtime errors.

  • sidecar/cli/cli.py: Various assignments and usages ensuring root_path is a valid directory.
  • sidecar/app/settings.py: Usage in properties status_dir_path and log_dir_path.
  • sidecar/app/routes/start.py: Usage in constructing repository paths.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that `root_path` is always a valid directory path.

# Test: Search for assignments to `root_path`. Expect: Valid directory paths.
rg --type python -A 5 $'root_path'

Length of output: 64


Script:

#!/bin/bash
# Description: Verify that `root_path` is always a valid directory path.

# Test: Search for assignments to `root_path`. Expect: Valid directory paths.
rg -t py 'root_path'

Length of output: 2827

Comment on lines 33 to 48
self._logger = logger
self._logger.remove()
max_role_str_len = max(len(role.name) for role in Role)
role_str = f"{self.role.name.replace('_', ' ').title():>{max_role_str_len}}"

logger_format = (
"<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
"<level>{level: <8}</level> | "
"<blue>{extra[role]}</blue> - <level>{message}</level>"
)
self._logger.configure(extra={"role": role_str})
self._logger.add(
sys.stderr,
level="INFO",
format=logger_format,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor logging setup in model_post_init.

The logging setup in model_post_init is extensive but could be refactored for clarity and reusability. Consider extracting the logging configuration into a separate method to improve readability and maintainability.

class Settings(BaseSettings):
    ...
    def model_post_init(self, __context) -> None:
        ...
+       self.configure_logger()

+   def configure_logger(self):
+       max_role_str_len = max(len(role.name) for role in Role)
+       role_str = f"{self.role.name.replace('_', ' ').title():>{max_role_str_len}}"
+       logger_format = (
+           "<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
+           "<level>{level: <8}</level> | "
+           "<blue>{extra[role]}</blue> - <level>{message}</level>"
+       )
+       self._logger.configure(extra={"role": role_str})
+       self._logger.add(
+           sys.stderr,
+           level="INFO",
+           format=logger_format,
+       )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
self._logger = logger
self._logger.remove()
max_role_str_len = max(len(role.name) for role in Role)
role_str = f"{self.role.name.replace('_', ' ').title():>{max_role_str_len}}"
logger_format = (
"<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
"<level>{level: <8}</level> | "
"<blue>{extra[role]}</blue> - <level>{message}</level>"
)
self._logger.configure(extra={"role": role_str})
self._logger.add(
sys.stderr,
level="INFO",
format=logger_format,
)
self._logger = logger
self._logger.remove()
self.configure_logger()
def configure_logger(self):
max_role_str_len = max(len(role.name) for role in Role)
role_str = f"{self.role.name.replace('_', ' ').title():>{max_role_str_len}}"
logger_format = (
"<green>{time:YYYY-MM-DD HH:mm:ss}</green> | "
"<level>{level: <8}</level> | "
"<blue>{extra[role]}</blue> - <level>{message}</level>"
)
self._logger.configure(extra={"role": role_str})
self._logger.add(
sys.stderr,
level="INFO",
format=logger_format,
)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
sidecar/app/routes/http_helpers.py (1)

8-16: Ensure proper logging for debugging purposes.

While the current implementation avoids exposing internal details in the HTTPException, consider adding logging for debugging purposes to capture the query type and other relevant information internally.

import logging

logger = logging.getLogger(__name__)

def get_query_from_query_id(
    query_manager: QueryManager,
    query_cls: Type[Query],
    query_id: str,
) -> Query:
    query = query_manager.get_from_query_id(query_cls, query_id)
    if query is None:
        logger.debug(f"Query of type {query_cls.__name__} with ID <{query_id}> not found")
        raise HTTPException(status_code=404, detail=f"Query<{query_id}> not found")
    return query
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 51aad4c and b3d260b.

Files selected for processing (3)
  • sidecar/app/routes/http_helpers.py (1 hunks)
  • sidecar/app/routes/start.py (6 hunks)
  • sidecar/app/settings.py (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • sidecar/app/routes/start.py
  • sidecar/app/settings.py
Additional comments not posted (1)
sidecar/app/routes/http_helpers.py (1)

19-30: LGTM! The function is well-implemented.

The check_capacity function is well-implemented and provides useful details in the exception message.

@eriktaubeneck eriktaubeneck merged commit 7490840 into main Jul 17, 2024
3 checks passed
@eriktaubeneck eriktaubeneck deleted the query-manager branch July 17, 2024 22:48
query_id: str
current_step: Optional[Step] = field(init=False, default=None, repr=True)
logger: loguru.Logger = field(init=False, repr=False)
_logger_id: int = field(init=False, repr=False)
logger: loguru.Logger = field(init=False, repr=False, compare=False)
Copy link
Collaborator

@cberkhoff cberkhoff Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use @dataclass if we need to add all these exceptions for each of the fields?
Why not just a class?

Copy link
Member Author

@eriktaubeneck eriktaubeneck Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dataclass is just a wrapper that does some codegen to build standard methods like __init__ and __compare__. These are easy ways to define what goes in that codegen. It looks verbose, but it's much cleaner than defining all those manually.

init=False, repr=True, default_factory=dict
)

def get_from_query_id(self, cls, query_id: str) -> Optional[Query]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have self and cls in the same function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cls is poorly named, let me fix that. it's meant to be the class of the Query that we are getting.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which are the different classes of Query?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants