feat: Add pytest integration test for scheduler and worker #39
Merged
7 commits
6898a9c Add intergration test (sitaowang1998)
c9aec03 Add task to run integration tests (sitaowang1998)
860fee1 Remove integration test from all target (sitaowang1998)
57ce9a7 Add integration test task in testing doc (sitaowang1998)
0f9eb5f Fix wrong variable name (sitaowang1998)
ea60758 Fix wrong data_id type in pytest (sitaowang1998)
b3c7153 Improve head task check using any (sitaowang1998)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,7 @@ | ||
black>=24.4.2 | ||
# Lock to v18.x until we can upgrade our code to meet v19's formatting standards. | ||
clang-format~=18.1 | ||
clang-tidy>=19.1.0 | ||
ruff>=0.4.4 | ||
gersemi>=0.16.2 | ||
yamllint>=1.35.1 | ||
yamllint>=1.35.1 |
@@ -0,0 +1,6 @@
+line-length = 100
+lint.select = ["I"]
+
+[lint.isort]
+case-sensitive = false
+order-by-type = false
@@ -0,0 +1,3 @@
+msgpack>=1.1.0
+mysql-connector-python>=8.0.26
+pytest>=8.3.4
@@ -1,30 +1,77 @@
 version: "3"

 vars:
-  G_TEST_BINARY: "{{.G_BUILD_SPIDER_DIR}}/tests/unitTest"
+  G_UNIT_TEST_BINARY: "{{.G_BUILD_SPIDER_DIR}}/tests/unitTest"
+  G_TEST_VENV_DIR: "{{.G_BUILD_DIR}}/test-venv"
+  G_TEST_VENV_CHECKSUM_FILE: "{{.G_BUILD_DIR}}/test#venv.md5"

 tasks:
   non-storage-unit-tests:
     deps:
       - "build-unit-test"
     cmds:
-      - "{{.G_TEST_BINARY}} \"~[storage]\""
+      - "{{.G_UNIT_TEST_BINARY}} \"~[storage]\""

   storage-unit-tests:
     deps:
       - "build-unit-test"
     cmds:
-      - "{{.G_TEST_BINARY}} \"[storage]\""
+      - "{{.G_UNIT_TEST_BINARY}} \"[storage]\""

   all:
     deps:
       - "build-unit-test"
     cmds:
-      - "{{.G_TEST_BINARY}}"
+      - "{{.G_UNIT_TEST_BINARY}}"

   build-unit-test:
     internal: true
     deps:
       - task: ":build:target"
         vars:
           TARGETS: ["spider_task_executor", "unitTest", "worker_test"]
+
+  integration:
+    dir: "{{.G_BUILD_SPIDER_DIR}}"
+    deps:
+      - "venv"
+      - task: ":build:target"
+        vars:
+          TARGETS: [
+            "spider_task_executor",
+            "worker_test",
+            "spider_worker",
+            "spider_scheduler",
+            "integrationTest"]
+    cmd: |-
+      . ../test-venv/bin/activate
+      ../test-venv/bin/pytest tests/integration
+
+  venv:
+    internal: true
+    vars:
+      CHECKSUM_FILE: "{{.G_TEST_VENV_CHECKSUM_FILE}}"
+      OUTPUT_DIR: "{{.G_TEST_VENV_DIR}}"
+    sources:
+      - "{{.ROOT_DIR}}/taskfile.yaml"
+      - "{{.TASKFILE}}"
+      - "test-requirements.txt"
+    generates: ["{{.CHECKSUM_FILE}}"]
+    run: "once"
+    deps:
+      - ":init"
+      - task: ":utils:validate-checksum"
+        vars:
+          CHECKSUM_FILE: "{{.CHECKSUM_FILE}}"
+          DATA_DIR: "{{.OUTPUT_DIR}}"
+    cmds:
+      - task: ":utils:create-venv"
+        vars:
+          LABEL: "test"
+          OUTPUT_DIR: "{{.OUTPUT_DIR}}"
+          REQUIREMENTS_FILE: "test-requirements.txt"
+      # This command must be last
+      - task: ":utils:compute-checksum"
+        vars:
+          DATA_DIR: "{{.OUTPUT_DIR}}"
+          OUTPUT_FILE: "{{.CHECKSUM_FILE}}"
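The `venv` task in the Taskfile diff above rebuilds the test virtual environment only when a stored checksum no longer matches the environment directory, and it writes the new checksum as its last step. The `:utils:validate-checksum` and `:utils:compute-checksum` tasks are not part of this diff, so the following is only a rough Python sketch of that general pattern. The function names `compute_checksum`, `is_up_to_date`, and `demo` are hypothetical and are not taken from the project.

```python
import hashlib
import tempfile
from pathlib import Path


def compute_checksum(data_dir: Path) -> str:
    """Hash the relative path and contents of every file under data_dir (hypothetical helper)."""
    digest = hashlib.md5()
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(data_dir).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def is_up_to_date(data_dir: Path, checksum_file: Path) -> bool:
    """Return True only if the stored checksum matches the directory's current contents."""
    if not data_dir.is_dir() or not checksum_file.is_file():
        return False
    return checksum_file.read_text().strip() == compute_checksum(data_dir)


def demo() -> tuple:
    """Walk through the three situations the task has to distinguish."""
    with tempfile.TemporaryDirectory() as tmp:
        venv_dir = Path(tmp) / "test-venv"
        venv_dir.mkdir()
        (venv_dir / "pyvenv.cfg").write_text("placeholder\n")
        checksum_file = Path(tmp) / "test#venv.md5"

        before = is_up_to_date(venv_dir, checksum_file)  # no checksum written yet
        checksum_file.write_text(compute_checksum(venv_dir))  # the "must be last" step
        after = is_up_to_date(venv_dir, checksum_file)  # checksum now matches
        (venv_dir / "pyvenv.cfg").write_text("modified\n")
        modified = is_up_to_date(venv_dir, checksum_file)  # contents changed afterwards
        return before, after, modified
```

Writing the checksum last means an interrupted venv creation leaves no matching checksum behind, so the next run would treat the directory as stale and rebuild it.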
Empty file.
@@ -0,0 +1,157 @@
+import re
+import uuid
+from dataclasses import dataclass
+from typing import Dict, List, Optional, Tuple
+
+import mysql.connector
+import pytest
+
+
+@dataclass
+class TaskInput:
+    type: str
+    task_output: Optional[Tuple[uuid.UUID, int]] = None
+    value: Optional[str] = None
+    data_id: Optional[uuid.UUID] = None
+
+
+@dataclass
+class TaskOutput:
+    type: str
+    value: Optional[str] = None
+    data_id: Optional[uuid.UUID] = None
+
+
+@dataclass
+class Task:
+    id: uuid.UUID
+    function_name: str
+    inputs: List[TaskInput]
+    outputs: List[TaskOutput]
+    timeout: float = 0.0
+
+
+@dataclass
+class TaskGraph:
+    id: uuid.UUID
+    tasks: Dict[uuid.UUID, Task]
+    dependencies: List[Tuple[uuid.UUID, uuid.UUID]]
+
+
+def create_connection(storage_url: str):
+    pattern = re.compile(
+        r"jdbc:mariadb://(?P<host>[^:/]+):(?P<port>\d+)/(?P<database>[^?]+)\?user=(?P<user>[^&]+)&password=(?P<password>[^&]+)"
+    )
+    match = pattern.match(storage_url)
+    if not match:
+        raise ValueError("Invalid JDBC URL format")
+
+    connection_params = match.groupdict()
+    return mysql.connector.connect(
+        host=connection_params["host"],
+        port=int(connection_params["port"]),
+        database=connection_params["database"],
+        user=connection_params["user"],
+        password=connection_params["password"],
+    )
+
+
+def is_head_task(task_id: uuid.UUID, dependencies: List[Tuple[uuid.UUID, uuid.UUID]]):
+    return not any(dependency[1] == task_id for dependency in dependencies)
+
+
+storage_url = "jdbc:mariadb://localhost:3306/spider_test?user=root&password=password"
Review comment: Avoid hard-coding credentials.
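One possible way to act on this comment is to read the storage URL from the environment and fall back to the local test database only when nothing is set. This is a minimal sketch, not the change made in the PR: the variable name `SPIDER_TEST_STORAGE_URL` and the helper `get_storage_url` are assumptions introduced here for illustration.

```python
import os
from typing import Mapping, Optional

# Default mirrors the URL in the diff so that existing local setups keep working.
DEFAULT_STORAGE_URL = "jdbc:mariadb://localhost:3306/spider_test?user=root&password=password"


def get_storage_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the storage URL, preferring the (hypothetical) SPIDER_TEST_STORAGE_URL variable."""
    env = os.environ if env is None else env
    return env.get("SPIDER_TEST_STORAGE_URL", DEFAULT_STORAGE_URL)
```

The `storage` fixture could then call `create_connection(get_storage_url())`, so that CI or a developer machine can supply its own credentials without editing the test file.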
+
+
+@pytest.fixture(scope="session")
+def storage():
+    conn = create_connection(storage_url)
+    yield conn
+    conn.close()
+
+
+def submit_job(conn, client_id: uuid.UUID, graph: TaskGraph):
+    cursor = conn.cursor()
+
+    cursor.execute(
+        "INSERT INTO jobs (id, client_id) VALUES (%s, %s)", (graph.id.bytes, client_id.bytes)
+    )
+
+    for task_id, task in graph.tasks.items():
+        if is_head_task(task_id, graph.dependencies):
+            state = "ready"
+        else:
+            state = "pending"
+        cursor.execute(
+            "INSERT INTO tasks (id, job_id, func_name, state, timeout) VALUES (%s, %s, %s, %s, %s)",
+            (task.id.bytes, graph.id.bytes, task.function_name, state, task.timeout),
+        )
+
+        for i, task_input in enumerate(task.inputs):
+            cursor.execute(
+                "INSERT INTO task_inputs (type, task_id, position, output_task_id, output_task_position, value, data_id) VALUES (%s, %s, %s, %s, %s, %s, %s)",
+                (
+                    task_input.type,
+                    task.id.bytes,
+                    i,
+                    task_input.task_output[0].bytes if task_input.task_output is not None else None,
+                    task_input.task_output[1] if task_input.task_output is not None else None,
+                    task_input.value,
+                    task_input.data_id.bytes if task_input.data_id is not None else None,
+                ),
+            )
+
+        for i, task_output in enumerate(task.outputs):
+            cursor.execute(
+                "INSERT INTO task_outputs (task_id, position, type) VALUES (%s, %s, %s)",
+                (task.id.bytes, i, task_output.type),
+            )
+
+    for dependency in graph.dependencies:
+        cursor.execute(
+            "INSERT INTO task_dependencies (parent, child) VALUES (%s, %s)",
+            (dependency[0].bytes, dependency[1].bytes),
+        )
+
+    conn.commit()
+    cursor.close()
+
+
+def get_task_outputs(conn, task_id: uuid.UUID) -> List[TaskOutput]:
+    cursor = conn.cursor()
+
+    cursor.execute(
+        "SELECT type, value, data_id FROM task_outputs WHERE task_id = %s ORDER BY position",
+        (task_id.bytes,),
+    )
+    outputs = []
+    for output_type, value, data_id in cursor.fetchall():
+        if value is not None:
+            outputs.append(TaskOutput(type=output_type, value=value))
+        elif data_id is not None:
+            outputs.append(TaskOutput(type=output_type, data_id=uuid.UUID(bytes=data_id)))
+        else:
+            outputs.append(TaskOutput(type=output_type))
+
+    conn.commit()
+    cursor.close()
+    return outputs
+
+
+def get_task_state(conn, task_id: uuid.UUID) -> str:
+    cursor = conn.cursor()
+
+    cursor.execute("SELECT state FROM tasks WHERE id = %s", (task_id.bytes,))
+    state = cursor.fetchone()[0]
+
+    conn.commit()
+    cursor.close()
+    return state
+
+
+def remove_job(conn, job_id: uuid.UUID):
+    cursor = conn.cursor()
+
+    cursor.execute("DELETE FROM jobs WHERE id = %s", (job_id.bytes,))
+    conn.commit()
+    cursor.close()
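The rest of this file is not shown on the page, so the tests that use these helpers are not visible here. As a small illustration of the one piece of logic that can be checked without a database, the sketch below reproduces the `is_head_task` check from the diff and shows which initial state `submit_job` would assign to each task in a three-task graph. The helper `initial_states` and the variable names are hypothetical and exist only for this example.

```python
import uuid
from typing import Dict, List, Tuple

Dependency = Tuple[uuid.UUID, uuid.UUID]  # (parent, child), as in TaskGraph.dependencies


def is_head_task(task_id: uuid.UUID, dependencies: List[Dependency]) -> bool:
    # Same check as in the diff: a head task never appears as a child.
    return not any(dependency[1] == task_id for dependency in dependencies)


def initial_states(task_ids: List[uuid.UUID], dependencies: List[Dependency]) -> Dict[uuid.UUID, str]:
    """Mirror the state choice in submit_job: head tasks start "ready", the rest "pending"."""
    return {
        task_id: "ready" if is_head_task(task_id, dependencies) else "pending"
        for task_id in task_ids
    }


# Two independent parents feed a single child task.
parent_1, parent_2, child = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
dependencies = [(parent_1, child), (parent_2, child)]
states = initial_states([parent_1, parent_2, child], dependencies)
```

Here both parents come out as "ready" and the child as "pending", which matches the rows `submit_job` would insert into the `tasks` table for such a graph.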
💡 Codebase verification
Variable G_UNIT_TEST_BINARY is used but not defined
The codebase search confirms that G_UNIT_TEST_BINARY is referenced in multiple test commands but is not defined in the variables section where G_TEST_BINARY is defined. This mismatch needs to be addressed to prevent potential runtime errors.
🔗 Analysis chain
Double-check usage of G_UNIT_TEST_BINARY
The command references "{{.G_UNIT_TEST_BINARY}}" but the variable name G_TEST_BINARY is defined at the top. Confirm that G_UNIT_TEST_BINARY is similarly defined and that no mismatch occurs.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
Length of output: 699
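The script behind this analysis is not included on the page. One way to double-check a claim like this locally is to compare the `{{.G_...}}` references in a Taskfile against the names defined under `vars:`. The following is a rough sketch under stated assumptions, not the reviewer's script: `undefined_vars` is a hypothetical helper, it relies on a simple regular expression rather than a YAML parser, and it looks at a single file only, so variables defined in another Taskfile will be reported as undefined.

```python
import re
from typing import Set


def undefined_vars(taskfile_text: str) -> Set[str]:
    """Return G_* names referenced as {{.NAME}} but never defined as `NAME:` in this text."""
    referenced = set(re.findall(r"\{\{\s*\.(G_[A-Z0-9_]+)\s*\}\}", taskfile_text))
    defined = set(re.findall(r"^\s*(G_[A-Z0-9_]+)\s*:", taskfile_text, flags=re.MULTILINE))
    return referenced - defined


# Excerpt shaped like the post-change Taskfile in this PR.
SAMPLE = """
vars:
  G_UNIT_TEST_BINARY: "{{.G_BUILD_SPIDER_DIR}}/tests/unitTest"

tasks:
  all:
    cmds:
      - "{{.G_UNIT_TEST_BINARY}}"
"""

missing = undefined_vars(SAMPLE)
```

On this excerpt only `G_BUILD_SPIDER_DIR` is reported, because it is referenced here without being defined in the excerpt, while `G_UNIT_TEST_BINARY` is both defined and used.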