Merge pull request #1 from MitchellAcoustics/refactoring
Refactoring
MitchellAcoustics authored Aug 7, 2024
2 parents f2311f9 + 2c57e68 commit defd1c1
Showing 28 changed files with 2,038 additions and 959 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -9,6 +9,9 @@ wheels/
# venv
.venv
example_inputs
.idea
*.log
.vscode

# IDE files
.idea/
@@ -19,3 +22,4 @@ src/.DS_Store
# in progress / trials
scripts/

/.vscode/settings.json
52 changes: 37 additions & 15 deletions README.md
@@ -86,33 +86,55 @@ visualization:
## Models
CitySeg currently supports OneFormer models. The verified models include:
CitySeg currently supports Mask2Former, MaskFormer, BEiT, and SegFormer models. The verified models include:
- `shi-labs/oneformer_ade20k_swin_large`
- `shi-labs/oneformer_cityscapes_swin_large`
- `shi-labs/oneformer_ade20k_dinat_large`
- `shi-labs/oneformer_cityscapes_dinat_large`
- "facebook/mask2former-swin-large-cityscapes-semantic"
- "facebook/mask2former-swin-large-mapillary-vistas-semantic"
- "facebook/maskformer-swin-small-ade" (sort of, this often leads to segfaults. Recommend using `disable_tqdm` in the config.)
- "microsoft/beit-large-finetuned-ade-640-640"
- "nvidia/segformer-b5-finetuned-cityscapes-1024-1024"
- "zoheb/mit-b5-finetuned-sidewalk-semantic"
- "nickmuchi/segformer-b4-finetuned-segments-sidewalk"

The Mask2Former checkpoints are by far the most stable (see the configuration sketch at the end of this section).

Some models appear to load correctly but consistently produce segfault errors on my machine:

- `facebook/maskformer-swin-large-ade`
- `nvidia/segformer-b5-finetuned-ade-640-640`
- `nvidia/segformer-b0-finetuned-cityscapes-1024-1024`
- `zoheb/mit-b5-finetuned-sidewalk-semantic` (use `model_type: segformer` in the config)

Confirmed not to work due to issues with the Hugging Face pipeline:

- "shi-labs/oneformer_ade20k_dinat_large"

**Note on `dinat` models:** The `dinat` backbone models require the `natten` package, which may have installation issues on some systems. These models are also significantly slower than the `swin` backbone models, especially when forced to run on CPU. However, they may produce better quality outputs in some cases.
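
To make the configuration concrete, here is a minimal sketch of selecting one of the verified checkpoints programmatically. It assumes `ModelConfig` can be imported from `cityseg.config`; the field names and auto-detection behaviour mirror the `ModelConfig` dataclass shown in the `src/cityseg/config.py` diff further down, so treat this as an illustration rather than documented API.

```python
from cityseg.config import ModelConfig  # import path assumed for illustration

# Hypothetical usage: configure one of the verified Mask2Former checkpoints.
model = ModelConfig(
    name="facebook/mask2former-swin-large-cityscapes-semantic",
    model_type=None,  # None triggers auto-detection from the model name
    device=None,      # None leaves device selection to the pipeline
    num_workers=8,
    pipe_batch=1,
)

# auto_detect_model_type splits the name on "/" and "-", yielding "mask2former" here.
print(model.model_type)
```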

## Project Structure

The project is organized into several Python modules:
The project is organized into several Python modules, each serving a specific purpose within the CitySeg pipeline:

- `main.py`: Entry point of the application, responsible for initializing and running the segmentation pipeline.
- `config.py`: Defines configuration classes and handles loading and validating configuration settings.
- `pipeline.py`: Implements the core segmentation pipeline, including model loading and inference.
- `processors.py`: Contains classes for processing images, videos, and directories, managing the segmentation workflow.
- `segmentation_analyzer.py`: Provides functionality for analyzing segmentation results, including computing statistics and generating reports.
- `video_file_iterator.py`: Implements an iterator for efficiently processing multiple video files in a directory.
- `visualization_handler.py`: Handles the visualization of segmentation results using color palettes.
- `file_handler.py`: Manages file operations related to saving and loading segmentation data and metadata.
- `utils.py`: Provides utility functions for various tasks, including data handling and logging.
- `palettes.py`: Defines color palettes for different datasets used in segmentation.
- `exceptions.py`: Custom exception classes for error handling throughout the pipeline.

- `main.py`: Entry point of the application
- `config.py`: Defines configuration classes for the pipeline
- `pipeline.py`: Implements the core segmentation pipeline
- `processors.py`: Contains classes for processing images, videos, and directories
- `utils.py`: Provides utility functions for analysis, file operations, and logging
- `palettes.py`: Defines color palettes for different datasets
- `exceptions.py`: Custom exception classes for error handling
This modular structure allows for easy maintenance and extension of the CitySeg pipeline, facilitating the addition of new features and models.
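
As a rough illustration of how these modules fit together, the sketch below mirrors the flow that `main.py` drives. `Config.from_yaml` and `create_processor` are taken from the package's public interface; the final `process()` call is a hypothetical method name used only to show the shape of the workflow.

```python
from pathlib import Path

from cityseg import Config, create_processor

# Load and validate the YAML configuration (Config.from_yaml is defined in config.py).
config = Config.from_yaml(Path("config.yaml"))

# create_processor presumably returns an image, video, or directory processor
# appropriate for the configured input type.
processor = create_processor(config)

# Hypothetical method name, shown for illustration only; in practice main.py
# is the supported entry point.
processor.process()
```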

## Logging

The pipeline uses the `loguru` library for flexible and configurable logging. You can set the log level and enable verbose output using command-line arguments:

```
python main.py --config path/to/your/config.yaml --log-level INFO --verbose
python main.py --config path/to/your/config.yaml --log-level INFO  # level can be DEBUG, INFO, WARNING, ERROR, or CRITICAL; add --verbose for more detail
```

Logs are output to both the console and a file (`segmentation.log`). The file log is in JSON format for easy parsing and analysis.
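
For reference, a loguru setup that writes human-readable console output plus a JSON-formatted file typically looks like the sketch below. This is an illustrative example of the pattern, not necessarily the exact body of `utils.setup_logging`; the function name and signature are assumptions.

```python
import sys

from loguru import logger


def setup_logging(log_level: str = "INFO", verbose: bool = False) -> None:
    """Illustrative sketch: console sink plus JSON file sink (assumed signature)."""
    logger.remove()  # drop loguru's default handler
    logger.add(sys.stderr, level="DEBUG" if verbose else log_level)
    # serialize=True writes each record as a JSON line, which is easy to parse later.
    logger.add("segmentation.log", level=log_level, serialize=True)
```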
@@ -136,4 +158,4 @@ CitySeg is released under the BSD 3-Clause License. See the `LICENSE` file for d

## Contact

For support or inquiries, please open an issue on the GitHub repository or contact [Your Name/Email].
For support or inquiries, please open an issue on the GitHub repository or contact Andrew Mitchell.
10 changes: 10 additions & 0 deletions docs/api/handlers.md
@@ -0,0 +1,10 @@
# Handlers

::: cityseg.file_handler


::: cityseg.visualization_handler
options:
show_root_heading: true
members: true
parameter_headings: true
13 changes: 10 additions & 3 deletions docs/api/processors.md
@@ -1,6 +1,13 @@
# Processors Module

::: cityseg.processors
options:
members: true
parameter_headings: true

::: cityseg.processing_plan.ProcessingPlan
options:
show_root_heading: true
show_root_full_path: false

::: cityseg.video_file_iterator.VideoFileIterator
options:
show_root_heading: true
show_root_full_path: false
1 change: 1 addition & 0 deletions docs/api/segments_analysis.md
@@ -0,0 +1 @@
::: cityseg.segmentation_analyzer
19 changes: 13 additions & 6 deletions docs/index.md
@@ -32,12 +32,19 @@ For more detailed information on how to use CitySeg, check out our [Getting Star

## Project Structure

CitySeg is organized into several Python modules:
The project is organized into several Python modules, each serving a specific purpose within the CitySeg pipeline:

- `main.py`: Entry point of the application, responsible for initializing and running the segmentation pipeline.
- `config.py`: Defines configuration classes and handles loading and validating configuration settings.
- `pipeline.py`: Implements the core segmentation pipeline, including model loading and inference.
- `processors.py`: Contains classes for processing images, videos, and directories, managing the segmentation workflow.
- `segmentation_analyzer.py`: Provides functionality for analyzing segmentation results, including computing statistics and generating reports.
- `video_file_iterator.py`: Implements an iterator for efficiently processing multiple video files in a directory.
- `visualization_handler.py`: Handles the visualization of segmentation results using color palettes.
- `file_handler.py`: Manages file operations related to saving and loading segmentation data and metadata.
- `utils.py`: Provides utility functions for various tasks, including data handling and logging.
- `palettes.py`: Defines color palettes for different datasets used in segmentation.
- `exceptions.py`: Custom exception classes for error handling throughout the pipeline.

- `config.py`: Configuration classes for the pipeline
- `pipeline.py`: Core segmentation pipeline implementation
- `processors.py`: Classes for processing images, videos, and directories
- `utils.py`: Utility functions for analysis, file operations, and logging
- `exceptions.py`: Custom exception classes for error handling

For detailed API documentation, visit our [API Reference](api/config.md) section.
3 changes: 3 additions & 0 deletions mkdocs.yml
@@ -42,6 +42,8 @@ nav:
- Config: api/config.md
- Pipeline: api/pipeline.md
- Processors: api/processors.md
- Segments Analysis: api/segments_analysis.md
- Handlers: api/handlers.md
- Utils: api/utils.md
- Palettes: api/palettes.md
- Exceptions: api/exceptions.md
@@ -104,6 +106,7 @@ plugins:
merge_init_into_class: true
ignore_init_summary: true
show_labels: false
parameter_headings: true

show_if_no_docstring: false
docstring_section_style: spacy
5 changes: 2 additions & 3 deletions pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "cityseg"
version = "0.2.11"
version = "0.3.0"
description = "A flexible and efficient semantic segmentation pipeline for processing images and videos"
authors = [
{ name = "Andrew Mitchell", email = "[email protected]" }
@@ -42,9 +42,8 @@ dev-dependencies = [
"jupyter>=1.0.0",
"pytest>=8.3.2",
"rich[jupyter]>=13.7.1",
"natten==0.17.1",
# "natten==0.17.1",
"matplotlib>=3.9.1",
"yappi>=1.6.0",
]

[tool.hatch.metadata]
4 changes: 0 additions & 4 deletions requirements-dev.lock
@@ -230,7 +230,6 @@ mpmath==1.3.0
# via sympy
mypy-extensions==1.0.0
# via black
natten==0.17.1
nbclient==0.10.0
# via nbconvert
nbconvert==7.16.4
@@ -280,7 +279,6 @@ packaging==24.1
# via jupyterlab-server
# via matplotlib
# via mkdocs
# via natten
# via nbconvert
# via pytest
# via qtconsole
@@ -422,7 +420,6 @@ tokenizers==0.19.1
# via transformers
torch==2.4.0
# via cityseg
# via natten
# via torchvision
torchvision==0.19.0
# via cityseg
@@ -479,4 +476,3 @@ websocket-client==1.8.0
# via jupyter-server
widgetsnbextension==4.0.11
# via ipywidgets
yappi==1.6.0
1 change: 1 addition & 0 deletions src/cityseg/SemanticSidewalk_id2label.json
@@ -0,0 +1 @@
{"0": "unlabeled", "1": "flat-road", "2": "flat-sidewalk", "3": "flat-crosswalk", "4": "flat-cyclinglane", "5": "flat-parkingdriveway", "6": "flat-railtrack", "7": "flat-curb", "8": "human-person", "9": "human-rider", "10": "vehicle-car", "11": "vehicle-truck", "12": "vehicle-bus", "13": "vehicle-tramtrain", "14": "vehicle-motorcycle", "15": "vehicle-bicycle", "16": "vehicle-caravan", "17": "vehicle-cartrailer", "18": "construction-building", "19": "construction-door", "20": "construction-wall", "21": "construction-fenceguardrail", "22": "construction-bridge", "23": "construction-tunnel", "24": "construction-stairs", "25": "object-pole", "26": "object-trafficsign", "27": "object-trafficlight", "28": "nature-vegetation", "29": "nature-terrain", "30": "sky", "31": "void-ground", "32": "void-dynamic", "33": "void-static", "34": "void-unclear"}
18 changes: 13 additions & 5 deletions src/cityseg/__init__.py
@@ -3,8 +3,7 @@
This package provides a flexible and efficient semantic segmentation pipeline
for processing images and videos. It supports multiple segmentation models
and datasets, with capabilities for tiling large images, mixed-precision
processing, and comprehensive result analysis.
and datasets.
Main components:
- Config: Configuration class for the pipeline
@@ -20,27 +19,36 @@
For detailed usage instructions, please refer to the package documentation.
"""

__version__ = "0.2.0"
__version__ = "0.2.12"

from . import palettes
from .config import Config
from .exceptions import ConfigurationError, InputError, ModelError, ProcessingError
from .file_handler import FileHandler
from .pipeline import SegmentationPipeline, create_segmentation_pipeline
from .processing_plan import ProcessingPlan
from .processors import DirectoryProcessor, SegmentationProcessor, create_processor
from .utils import analyze_segmentation_map, setup_logging
from .segmentation_analyzer import SegmentationAnalyzer
from .utils import setup_logging
from .video_file_iterator import VideoFileIterator
from .visualization_handler import VisualizationHandler

__all__ = [
"Config",
"SegmentationPipeline",
"create_segmentation_pipeline",
"SegmentationProcessor",
"SegmentationAnalyzer",
"DirectoryProcessor",
"create_processor",
"ConfigurationError",
"InputError",
"ModelError",
"ProcessingError",
"analyze_segmentation_map",
"setup_logging",
"palettes",
"FileHandler",
"VisualizationHandler",
"ProcessingPlan",
"VideoFileIterator",
]
46 changes: 41 additions & 5 deletions src/cityseg/config.py
@@ -13,6 +13,7 @@
from typing import Any, Dict, List, Optional, Union

import yaml
from loguru import logger


class InputType(Enum):
@@ -36,14 +37,45 @@ class ModelConfig:
"""

name: str
model_type: Optional[str] = (
None # Can be 'oneformer', 'mask2former', or None for auto-detection
)
model_type: Optional[str] = None
max_size: Optional[int] = None
device: Optional[str] = None
dataset: Optional[str] = None
num_workers: Optional[int] = 8
pipe_batch: Optional[int] = 1

# TODO: implement model_type auto-detection
# TODO: implement device auto-detection
def __post_init__(self):
"""
Post-initialization method to set up the model type if not provided.
"""
self.auto_detect_model_type()
if self.device == "mps" and (self.num_workers is None or self.num_workers > 0):
logger.warning(
"MPS is not compatible with multiple workers in pytorch. Setting num_workers to 0."
)
self.num_workers = 0

def auto_detect_model_type(self):
"""
Automatically detect the model type from the model name if not provided.
"""

def auto_model_type(model_name: str) -> str:
return model_name.split("/")[-1].split("-")[0]

if self.model_type is None:
try:
self.model_type = auto_model_type(self.name)
except IndexError:
logger.warning(
"Unable to auto-detect model type from the model name and none provided."
)
return
logger.info(f"Auto-detected model type: {self.model_type}")
elif self.model_type != auto_model_type(self.name):
logger.warning(
f"Model type does not match auto-detected model type. Using provided model type: {self.model_type}"
)


@dataclass
@@ -83,6 +115,7 @@ class Config:
visualization (VisualizationConfig): The visualization configuration.
input_type (InputType): The type of input (automatically determined).
force_reprocess (bool): Whether to force reprocessing of existing results.
disable_tqdm (bool): Whether to disable the progress bar display.
"""

input: Union[Path, str]
@@ -100,6 +133,7 @@
visualization: VisualizationConfig = field(default_factory=VisualizationConfig)
input_type: InputType = field(init=False)
force_reprocess: bool = False
disable_tqdm: bool = False

def __post_init__(self):
"""
@@ -232,6 +266,7 @@ def from_yaml(cls, config_path: Path) -> "Config":
analyze_results=config_dict.get("analyze_results", True),
visualization=vis_config,
force_reprocess=config_dict.get("force_reprocess", False),
disable_tqdm=config_dict.get("disable_tqdm", False),
)

def to_dict(self) -> Dict[str, Any]:
@@ -257,6 +292,7 @@ def to_dict(self) -> Dict[str, Any]:
"visualization": asdict(self.visualization),
"input_type": self.input_type.value,
"force_reprocess": self.force_reprocess,
"disable_tqdm": self.disable_tqdm,
}


14 changes: 9 additions & 5 deletions src/cityseg/config.yaml
@@ -6,21 +6,24 @@ ignore_files: null # Optional: list of file names to ignore

# Model configuration
model:
name: "facebook/mask2former-swin-large-mapillary-vistas-semantic"
model_type: "mask2former" # Optional: can be 'oneformer', 'mask2former', or null for auto-detection
name: "facebook/mask2former-swin-large-cityscapes-semantic"
model_type: null # Optional: can be 'beit', 'mask2former', or null for auto-detection
max_size: null # Optional: maximum size for input images/frames
device: "mps" # Options: "cuda", "cpu", "mps", or null for auto-detection
dataset: "semantic-sidewalk" # Optional: dataset name for model-specific postprocessing
num_workers: 0 # Number of workers for data loading
pipe_batch: 5 # Number of frames to process in each batch. Recommend setting this equal to batch_size below.

# Processing configuration
frame_step: 10 # Process every 5th frame
frame_step: 1 # Process every Nth frame (1 = process every frame)
batch_size: 5 # Number of frames to process in each batch
output_fps: null # Optional: FPS for output video (if different from input)

# Output options
save_raw_segmentation: false
save_raw_segmentation: true
save_colored_segmentation: true
save_overlay: true
analyze_results: false
analyze_results: true

# Visualization configuration
visualization:
@@ -29,3 +32,4 @@

# Advanced options
force_reprocess: false # Set to true to reprocess even if output files exist
disable_tqdm: false # Set to true to disable progress bars. In some cases, tqdm seems to lead to segfaults.
