Add Active Learning data collecting block for workflows #264

Merged
Changes from 17 commits
Commits
19 commits
b6823a1
Removed results unwrapping from workflows
PawelPeczek-Roboflow Feb 13, 2024
3b26b2b
Make Tiny cache default option
PawelPeczek-Roboflow Feb 13, 2024
c2af81e
Add draft implementation of the active learning block
PawelPeczek-Roboflow Feb 13, 2024
89a39a7
Add extension to output prediction type from model steps and steps th…
PawelPeczek-Roboflow Feb 13, 2024
134eb87
Bring back validation of dangling nodes
PawelPeczek-Roboflow Feb 14, 2024
5c10578
Merge branch 'main' into feature/add_active_learning_block_for_workflows
PawelPeczek-Roboflow Feb 14, 2024
b2efc38
Bump rc version
PawelPeczek-Roboflow Feb 14, 2024
83eed51
Add tests for validation of al data collector step definition
PawelPeczek-Roboflow Feb 14, 2024
cc34098
Add possibility to define AL config at the level of AL workflow steps
PawelPeczek-Roboflow Feb 14, 2024
8746357
Connect AL configuration stated at step level to the middleware
PawelPeczek-Roboflow Feb 14, 2024
ecfc67a
Fix issue spotted in initial testing
PawelPeczek-Roboflow Feb 14, 2024
e66b9bf
Fix problem with docker build
PawelPeczek-Roboflow Feb 14, 2024
99a7cf0
Bring back old build
PawelPeczek-Roboflow Feb 14, 2024
efbae9e
Revert change
PawelPeczek-Roboflow Feb 14, 2024
e2b7c92
Update notebook example
PawelPeczek-Roboflow Feb 14, 2024
69c88aa
Add changed to enable persistence of AL predictions in workflows + tests
PawelPeczek-Roboflow Feb 14, 2024
d60d007
Add changes into docs
PawelPeczek-Roboflow Feb 14, 2024
b227529
Add changes to notebook to illustrate AL usage:
PawelPeczek-Roboflow Feb 14, 2024
85e2ead
Update dosc
PawelPeczek-Roboflow Feb 14, 2024
39 changes: 20 additions & 19 deletions examples/notebooks/workflows.ipynb

Large diffs are not rendered by default.

65 changes: 51 additions & 14 deletions inference/core/active_learning/configuration.py
Expand Up @@ -58,23 +58,37 @@ def prepare_active_learning_configuration(
f"project: {project_metadata.dataset_id} of type: {project_metadata.dataset_type}. "
f"AL configuration: {project_metadata.active_learning_configuration}"
)
sampling_methods = initialize_sampling_methods(
sampling_strategies_configs=project_metadata.active_learning_configuration[
"sampling_strategies"
],
return initialise_active_learning_configuration(
project_metadata=project_metadata,
)
target_workspace_id = project_metadata.active_learning_configuration.get(
"target_workspace", project_metadata.workspace_id


def prepare_active_learning_configuration_inplace(
api_key: str,
model_id: str,
active_learning_configuration: Optional[dict],
) -> Optional[ActiveLearningConfiguration]:
if (
active_learning_configuration is None
or active_learning_configuration.get("enabled", False) is False
):
return None
dataset_id, version_id = get_model_id_chunks(model_id=model_id)
workspace_id = get_roboflow_workspace(api_key=api_key)
dataset_type = get_roboflow_dataset_type(
api_key=api_key,
workspace_id=workspace_id,
dataset_id=dataset_id,
)
target_dataset_id = project_metadata.active_learning_configuration.get(
"target_project", project_metadata.dataset_id
project_metadata = RoboflowProjectMetadata(
dataset_id=dataset_id,
version_id=version_id,
workspace_id=workspace_id,
dataset_type=dataset_type,
active_learning_configuration=active_learning_configuration,
)
return ActiveLearningConfiguration.init(
roboflow_api_configuration=project_metadata.active_learning_configuration,
sampling_methods=sampling_methods,
workspace_id=target_workspace_id,
dataset_id=target_dataset_id,
model_id=model_id,
return initialise_active_learning_configuration(
project_metadata=project_metadata,
)


Expand Down Expand Up @@ -145,6 +159,29 @@ def parse_cached_roboflow_project_metadata(
) from error


def initialise_active_learning_configuration(
project_metadata: RoboflowProjectMetadata,
) -> ActiveLearningConfiguration:
sampling_methods = initialize_sampling_methods(
sampling_strategies_configs=project_metadata.active_learning_configuration[
"sampling_strategies"
],
)
target_workspace_id = project_metadata.active_learning_configuration.get(
"target_workspace", project_metadata.workspace_id
)
target_dataset_id = project_metadata.active_learning_configuration.get(
"target_project", project_metadata.dataset_id
)
return ActiveLearningConfiguration.init(
roboflow_api_configuration=project_metadata.active_learning_configuration,
sampling_methods=sampling_methods,
workspace_id=target_workspace_id,
dataset_id=target_dataset_id,
model_id=f"{project_metadata.dataset_id}/{project_metadata.version_id}",
)


def initialize_sampling_methods(
sampling_strategies_configs: List[Dict[str, Any]]
) -> List[SamplingMethod]:
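
The gating and fallback behaviour introduced in `configuration.py` can be summarised in a small, self-contained sketch. It mirrors two things visible in the diff: a missing or disabled configuration yields `None`, and `target_workspace` / `target_project` fall back to the model's own workspace and dataset. The helper name `resolve_registration_target` is hypothetical, and a plain tuple stands in for `ActiveLearningConfiguration`; this is an illustration, not the library's code.

```python
from typing import Optional, Tuple


def resolve_registration_target(
    active_learning_configuration: Optional[dict],
    workspace_id: str,
    dataset_id: str,
) -> Optional[Tuple[str, str]]:
    # Hypothetical helper (not part of `inference`): combines the "enabled"
    # gate with the target fallbacks shown in the diff.
    if (
        active_learning_configuration is None
        or active_learning_configuration.get("enabled", False) is False
    ):
        # No config, or config without an explicit `enabled: True`: AL is off.
        return None
    target_workspace_id = active_learning_configuration.get(
        "target_workspace", workspace_id
    )
    target_dataset_id = active_learning_configuration.get(
        "target_project", dataset_id
    )
    return target_workspace_id, target_dataset_id
```

Note that a configuration lacking the `enabled` key is treated as disabled, because `.get("enabled", False)` defaults to `False`.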
2 changes: 1 addition & 1 deletion inference/core/active_learning/core.py
Expand Up @@ -215,5 +215,5 @@ def is_prediction_registration_forbidden(
roboflow_image_id is None
or persist_predictions is False
or prediction.get("is_stub", False) is True
or len(prediction.get("predictions", [])) == 0
or (len(prediction.get("predictions", [])) == 0 and "top" not in prediction)
)
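
The effect of the changed condition is easiest to see on concrete payloads. Below is a standalone sketch of the predicate as it reads after this change. The parameter list and type hints are assumptions (the diff shows only the function body), and plain dicts stand in for the prediction type.

```python
from typing import Optional


def is_prediction_registration_forbidden(
    prediction: dict,
    persist_predictions: bool,
    roboflow_image_id: Optional[str],
) -> bool:
    # Signature is assumed; the boolean expression follows the diff.
    return (
        roboflow_image_id is None
        or persist_predictions is False
        or prediction.get("is_stub", False) is True
        # An empty `predictions` list no longer blocks registration when the
        # payload carries a `top` key (classification-style output).
        or (len(prediction.get("predictions", [])) == 0 and "top" not in prediction)
    )


# Classification-style payload without a `predictions` list: now allowed.
classification = {"top": "cat", "confidence": 0.93}
# Detection-style payload with no detections: still forbidden.
empty_detection = {"predictions": []}
```

With the previous condition, the `classification` payload above would have been rejected because `prediction.get("predictions", [])` is empty.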
38 changes: 38 additions & 0 deletions inference/core/active_learning/middlewares.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
from inference.core.active_learning.batching import generate_batch_name
from inference.core.active_learning.configuration import (
prepare_active_learning_configuration,
prepare_active_learning_configuration_inplace,
)
from inference.core.active_learning.core import (
execute_datapoint_registration,
Expand Down Expand Up @@ -72,6 +73,21 @@ def init(
cache=cache,
)

@classmethod
def init_from_config(
cls, api_key: str, model_id: str, cache: BaseCache, config: Optional[dict]
) -> "ActiveLearningMiddleware":
configuration = prepare_active_learning_configuration_inplace(
api_key=api_key,
model_id=model_id,
active_learning_configuration=config,
)
return cls(
api_key=api_key,
configuration=configuration,
cache=cache,
)

def __init__(
self,
api_key: str,
Expand Down Expand Up @@ -178,6 +194,28 @@ def init(
task_queue=task_queue,
)

@classmethod
def init_from_config(
cls,
api_key: str,
model_id: str,
cache: BaseCache,
config: Optional[dict],
max_queue_size: int = MAX_REGISTRATION_QUEUE_SIZE,
) -> "ThreadingActiveLearningMiddleware":
configuration = prepare_active_learning_configuration_inplace(
api_key=api_key,
model_id=model_id,
active_learning_configuration=config,
)
task_queue = Queue(max_queue_size)
return cls(
api_key=api_key,
configuration=configuration,
cache=cache,
task_queue=task_queue,
)

def __init__(
self,
api_key: str,
2 changes: 1 addition & 1 deletion inference/core/env.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@
GAZE_MAX_BATCH_SIZE = int(os.getenv("GAZE_MAX_BATCH_SIZE", 8))

# If true, this will store a non-verbose version of the inference request and repsonse in the cache
TINY_CACHE = str2bool(os.getenv("TINY_CACHE", False))
TINY_CACHE = str2bool(os.getenv("TINY_CACHE", True))

# Maximum batch size for CLIP, default is 8
CLIP_MAX_BATCH_SIZE = int(os.getenv("CLIP_MAX_BATCH_SIZE", 8))
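
Because the default for `TINY_CACHE` flips from `False` to `True`, deployments that relied on the old behaviour now need to set the variable explicitly. The sketch below illustrates the resolution order. Here `str2bool` is a simplified stand-in written for this example (the real helper in `inference` may accept other spellings or raise on invalid input), and `read_tiny_cache` is a hypothetical wrapper that takes a mapping so the example does not touch the real environment.

```python
from typing import Any, Mapping


def str2bool(value: Any) -> bool:
    # Simplified stand-in, not the library implementation.
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def read_tiny_cache(environ: Mapping[str, str]) -> bool:
    # Mirrors `str2bool(os.getenv("TINY_CACHE", True))` from the diff:
    # an unset variable now resolves to True.
    return str2bool(environ.get("TINY_CACHE", True))
```

Under these assumptions, an empty environment enables the tiny cache, and `TINY_CACHE=False` restores the previous default.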
14 changes: 14 additions & 0 deletions inference/core/interfaces/http/http_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
from fastapi_cprofile.profiler import CProfileMiddleware

from inference.core import logger
from inference.core.cache import cache
from inference.core.devices.utils import GLOBAL_INFERENCE_SERVER_ID
from inference.core.entities.requests.clip import (
ClipCompareRequest,
Expand Down Expand Up @@ -126,6 +127,9 @@
from inference.core.utils.notebooks import start_notebook
from inference.enterprise.workflows.complier.core import compile_and_execute_async
from inference.enterprise.workflows.complier.entities import StepExecutionMode
from inference.enterprise.workflows.complier.steps_executors.active_learning_middlewares import (
WorkflowsActiveLearningMiddleware,
)
from inference.enterprise.workflows.errors import (
ExecutionEngineError,
RuntimePayloadError,
Expand Down Expand Up @@ -296,6 +300,9 @@ async def count_errors(request: Request, call_next):

self.app = app
self.model_manager = model_manager
self.workflows_active_learning_middleware = WorkflowsActiveLearningMiddleware(
cache=cache,
)

async def process_inference_request(
inference_request: InferenceRequest, **kwargs
Expand All @@ -320,6 +327,7 @@ async def process_inference_request(
async def process_workflow_inference_request(
workflow_request: WorkflowInferenceRequest,
workflow_specification: dict,
background_tasks: Optional[BackgroundTasks],
) -> WorkflowInferenceResponse:
step_execution_mode = StepExecutionMode(WORKFLOWS_STEP_EXECUTION_MODE)
result = await compile_and_execute_async(
Expand All @@ -329,6 +337,8 @@ async def process_workflow_inference_request(
api_key=workflow_request.api_key,
max_concurrent_steps=WORKFLOWS_MAX_CONCURRENT_STEPS,
step_execution_mode=step_execution_mode,
active_learning_middleware=self.workflows_active_learning_middleware,
background_tasks=background_tasks,
)
outputs = serialise_workflow_result(
result=result,
Expand Down Expand Up @@ -656,6 +666,7 @@ async def infer_from_predefined_workflow(
workspace_name: str,
workflow_name: str,
workflow_request: WorkflowInferenceRequest,
background_tasks: BackgroundTasks,
) -> WorkflowInferenceResponse:
workflow_specification = get_workflow_specification(
api_key=workflow_request.api_key,
Expand All @@ -665,6 +676,7 @@ async def infer_from_predefined_workflow(
return await process_workflow_inference_request(
workflow_request=workflow_request,
workflow_specification=workflow_specification,
background_tasks=background_tasks if not LAMBDA else None,
)

@app.post(
Expand All @@ -676,13 +688,15 @@ async def infer_from_predefined_workflow(
@with_route_exceptions
async def infer_from_workflow(
workflow_request: WorkflowSpecificationInferenceRequest,
background_tasks: BackgroundTasks,
) -> WorkflowInferenceResponse:
workflow_specification = {
"specification": workflow_request.specification
}
return await process_workflow_inference_request(
workflow_request=workflow_request,
workflow_specification=workflow_specification,
background_tasks=background_tasks if not LAMBDA else None,
)

if CORE_MODELS_ENABLED:
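
Both endpoints pass `background_tasks if not LAMBDA else None`, so downstream code has to handle an optional task container. How the workflows middleware consumes it is not shown in this diff; the sketch below is one assumed pattern, where work is deferred when a container is available and run inline otherwise. `RecordingBackgroundTasks` is a minimal stand-in with the same `add_task(func, *args, **kwargs)` shape as FastAPI's `BackgroundTasks`, and `register_or_schedule` is a hypothetical helper.

```python
from typing import Any, Callable, List, Optional, Tuple


class RecordingBackgroundTasks:
    # Stand-in for FastAPI's `BackgroundTasks`; it only records tasks so the
    # example runs without a web framework installed.
    def __init__(self) -> None:
        self.tasks: List[Tuple[Callable[..., Any], tuple, dict]] = []

    def add_task(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> None:
        self.tasks.append((func, args, kwargs))


def register_or_schedule(
    background_tasks: Optional[RecordingBackgroundTasks],
    func: Callable[..., Any],
    *args: Any,
    **kwargs: Any,
) -> None:
    # Hypothetical helper: with no container (e.g. the LAMBDA case in the
    # diff) the work runs inline and adds to request latency.
    if background_tasks is None:
        func(*args, **kwargs)
        return
    # Otherwise the work is deferred until after the response is sent.
    background_tasks.add_task(func, *args, **kwargs)
```

This matches the README note in this PR that on the hosted platform registration cannot run as a background task, so its duration counts toward latency.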
2 changes: 1 addition & 1 deletion inference/core/version.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
__version__ = "0.9.10"
__version__ = "0.9.11rc1"


if __name__ == "__main__":
55 changes: 55 additions & 0 deletions inference/enterprise/workflows/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -261,6 +261,7 @@ input parameter
* `confidence` - confidence of prediction
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting `classification` model

#### `MultiLabelClassificationModel`
This step represents inference from multi-label classification model.
Expand All @@ -281,6 +282,7 @@ input parameter
* `predicted_classes` - top classes
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting `classification` model

#### `ObjectDetectionModel`
This step represents inference from object detection model.
Expand Down Expand Up @@ -309,6 +311,7 @@ input parameter. Default: `0.3`.
* `image` - size of input image, that `predictions` coordinates refers to
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting `object-detection` model

#### `KeypointsDetectionModel`
This step represents inference from keypoints detection model.
Expand Down Expand Up @@ -339,6 +342,7 @@ input parameter
* `image` - size of input image, that `predictions` coordinates refers to
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting `keypoint-detection` model

#### `InstanceSegmentationModel`
This step represents inference from instance segmentation model.
Expand Down Expand Up @@ -370,6 +374,7 @@ input parameter
* `image` - size of input image, that `predictions` coordinates refers to
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting `instance-segmentation` model

#### `OCRModel`
This step represents inference from OCR model.
Expand All @@ -384,6 +389,7 @@ This step represents inference from OCR model.
* `result` - details of predictions
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting `ocr` model

#### `Crop`
This step produces **dynamic** crops based on detections from detections-based model.
Expand Down Expand Up @@ -466,6 +472,7 @@ This let user define recursive structure of filters.
* `image` - size of input image, that `predictions` coordinates refers to
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting parent model type

#### `DetectionOffset`
This step is responsible for applying fixed offset on width and height of detections.
Expand All @@ -484,6 +491,7 @@ This step is responsible for applying fixed offset on width and height of detect
* `image` - size of input image, that `predictions` coordinates refers to
* `parent_id` - identifier of parent image / associated detection that helps to identify predictions with RoI in case
of multi-step pipelines
* `prediction_type` - denoting parent model type


#### `AbsoluteStaticCrop` and `RelativeStaticCrop`
Expand Down Expand Up @@ -596,6 +604,53 @@ of multi-step pipelines (can be `undefined` if all sources of predictions give n
objects specified in config are present
* `presence_confidence` - for each input image, for each present class - aggregated confidence indicating presence
of objects
* `prediction_type` - denoting `object-detection` prediction (as this format is effective even if other detections
models are combined)

#### `ActiveLearningDataCollector`
This step collects data and predictions that flow through `workflows`. The block is built on the Roboflow Active
Learning capabilities implemented in the [`active_learning` module](../../core/active_learning/README.md), so all of
those capabilities should be preserved.
There are **very important** considerations regarding collecting data with AL at the `workflows` level versus in the
scope of specific models. Read the `Important notes` section below for the nuances.
General use-cases for this block:
* grabbing data and predictions from a single model / ensemble of models
* posting the data to a different project than the one the models used in the `workflow` originate from - in particular
**one may now use open models - like `yolov8n-640` - and start sampling data to their own project!**
* defining multiple different sampling strategies for different `workflows` (the step accepts a custom config of AL
data collection, so you are not bound to the AL configuration at the project level, and multiple
configs can co-exist)

##### Step parameters
* `type`: must be `ActiveLearningDataCollector` (required)
* `name`: must be unique within all steps - used as identifier (required)
* `image`: must be a reference to input of type `InferenceImage` or `crops` output from steps executing cropping (
`Crop`, `AbsoluteStaticCrop`, `RelativeStaticCrop`) (required)
* `predictions` - selector pointing to the output of a detection step: [`ObjectDetectionModel`,
`KeypointsDetectionModel`, `InstanceSegmentationModel`, `DetectionFilter`, `DetectionsConsensus`] (then use `$steps.<det_step_name>.predictions`)
or to the output of a classification step [`ClassificationModel`] (then use `$steps.<cls_step_name>.top`) (required)
* `target_dataset` - name of the Roboflow dataset / project to be used as the target for collected data (required)
* `target_dataset_api_key` - optional API key to be used for data registration. This may help when data
are to be registered across workspaces. If not provided, the API key from the request is used to register data (
applicable for Universe model predictions saved in private workspaces and for models that were trained in the same
workspace (not necessarily within the same project)).
* `disable_active_learning` - boolean flag, which can also be a reference to an input, that disables data collection
for a specific request - overrides all AL config. (optional, default: `False`)
* `active_learning_configuration` - optional configuration of Active Learning data sampling in the exact format provided
in [`active_learning` docs](../../core/active_learning/README.md)

##### Step outputs
No outputs are declared - the step only causes a side effect in the form of data sampling and registration.

##### Important notes
* this block is implemented in a non-async way, which means that in certain cases it can block the event loop and
make parallelization infeasible. This is not the case when running in the `inference` HTTP container. On the Roboflow
hosted platform, registration cannot be executed as a background task, so its duration must be added to the expected
latency
* **important :exclamation:** be careful when enabling / disabling AL at the level of steps - remember that when
predicting from each model, the `inference` HTTP API tries to get the Active Learning config from the project that the
model belongs to and register the datapoint. To prevent that from happening, model steps can be provided with the
`disable_active_learning=True` parameter. Then the only place where AL registration happens is `ActiveLearningDataCollector`.
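
The parameters and the note above can be put together in a short, illustrative pair of step definitions, written here as Python dicts. Several details are assumptions rather than facts from this section: the `$inputs.image` selector and the input name `image`, the `model_id` parameter on the model step, and the step names and target dataset name. The `active_learning_configuration` body uses only keys visible in this PR (`enabled`, `sampling_strategies`); actual sampling strategies should be taken from the `active_learning` docs.

```python
# Model step with project-level AL disabled, so data is not registered twice.
DETECTION_STEP = {
    "type": "ObjectDetectionModel",
    "name": "detection",                  # hypothetical step name
    "image": "$inputs.image",             # assumed input selector
    "model_id": "yolov8n-640",            # assumed parameter name
    "disable_active_learning": True,
}

# Collector step: the only place where AL registration happens in this sketch.
AL_COLLECTOR_STEP = {
    "type": "ActiveLearningDataCollector",
    "name": "al_collector",               # hypothetical step name
    "image": "$inputs.image",
    "predictions": "$steps.detection.predictions",
    "target_dataset": "my-target-project",  # hypothetical dataset name
    "disable_active_learning": False,
    "active_learning_configuration": {
        "enabled": True,
        # Left empty on purpose: fill in strategies per the active_learning docs.
        "sampling_strategies": [],
    },
}

STEPS = [DETECTION_STEP, AL_COLLECTOR_STEP]
```

For a classification model, the `predictions` selector would point at `$steps.<cls_step_name>.top` instead, as described in the parameter list.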

## Different modes of execution
Workflows can be executed in `local` environment, or `remote` environment can be used. `local` means that model steps