Video analytics overview

Here, we focus on research that provides system-level support for deep-neural-network-based video analytics. If you have suggestions, or if I have missed a paper, please let me know by opening a new issue. I hope these papers help you.

Video Streaming Approaches

Tune configurations to adapt to video dynamics.

This paper explores an AR scenario, so the key goal is to minimize latency. It uses parallel streaming and inference to reduce latency, motion-vector-based tracking and ROI encoding to reduce bandwidth (and thus streaming latency), and adaptive offloading to reduce compute cost.
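Motion-vector-based tracking can be illustrated with a minimal sketch: instead of re-running the detector on every frame, a known bounding box is shifted by the average motion vector that falls inside it. The function name `shift_box` and the `(px, py, dx, dy)` vector layout are assumptions made for this illustration, not the paper's interface.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]           # (x, y, width, height)
MotionVector = Tuple[float, float, float, float]  # (px, py, dx, dy), hypothetical layout


def shift_box(box: Box, motion_vectors: List[MotionVector]) -> Box:
    """Move a box by the mean motion of the vectors whose origin lies inside it."""
    x, y, w, h = box
    inside = [
        (dx, dy)
        for px, py, dx, dy in motion_vectors
        if x <= px < x + w and y <= py < y + h
    ]
    if not inside:
        # No motion information for this region: keep the previous position.
        return box
    mean_dx = sum(dx for dx, _ in inside) / len(inside)
    mean_dy = sum(dy for _, dy in inside) / len(inside)
    return (x + mean_dx, y + mean_dy, w, h)


# Two vectors fall inside the box; the third belongs to another region.
vectors = [(12, 12, 2, 0), (20, 25, 4, 2), (100, 100, 9, 9)]
tracked = shift_box((10, 10, 20, 20), vectors)
```

The encoder already computes these vectors, so this kind of tracking adds very little extra compute on the device.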

Tune encoder configurations to adapt to video dynamics.

This paper uses super-resolution to save bandwidth. It trains a super-resolution model against some ground truth by minimizing analytic error, and then uses this model to super-resolve the input video clip for analytics.

This paper targets object detection. It argues that by specializing the model to the current video and to specific classes, the model can be made much cheaper.

Scale up inference

Through building effective library/database

The key research topic here is how to choose the right set of APIs and implement them cleverly to avoid unnecessary cost.

This paper argues that multiple applications reuse the same video and invoke the same library, so results can be cached for future use.
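The caching idea can be sketched as a memoization table keyed by video, frame, and model. Everything named here (`InferenceCache`, `fake_detector`, the key layout) is hypothetical and only illustrates the principle. The paper's actual cache design may differ.

```python
from typing import Callable, Dict, List, Tuple

CacheKey = Tuple[str, int, str]  # (video_id, frame_index, model_name)


class InferenceCache:
    """Memoizes model outputs so repeated requests skip the model call."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, object] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(
        self,
        video_id: str,
        frame_index: int,
        model_name: str,
        compute: Callable[[], object],
    ) -> object:
        key = (video_id, frame_index, model_name)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute()
        self._store[key] = result
        return result


def fake_detector(frame_index: int) -> List[str]:
    """Stand-in for an expensive DNN call."""
    return [f"object-{frame_index % 3}"]


cache = InferenceCache()
# Two hypothetical applications ask for the same frames from the same model.
app_a = [cache.get_or_compute("cam0", i, "detector", lambda i=i: fake_detector(i)) for i in range(4)]
app_b = [cache.get_or_compute("cam0", i, "detector", lambda i=i: fake_detector(i)) for i in range(4)]
```

In this sketch the second application is served entirely from the cache, so the detector runs four times instead of eight.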

This paper expresses the analytic pipeline as a static computation graph, bounds temporal dependencies through a predefined API, and scales the pipeline to a cluster.

This paper observes that each step of a video analytic pipeline maps one stream to another. Based on this abstraction, the authors explore how to encode and index the video, and how to reuse previous results.

This paper aims to support and optimize relational queries. It breaks a video analytic task into four parts (extractors, processors, reducers, and combiners) and then optimizes based on this decomposition.
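A toy version of the four-part decomposition might look like the sketch below. The role given to each part (extractor ingests raw input as rows, processor transforms rows, reducer aggregates groups of rows, combiner joins two inputs) is an assumption made for illustration, and the query itself (object counts per camera location) is invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def extractor(raw_frames: Iterable[Tuple[str, int, List[str]]]) -> Iterator[Row]:
    """Turn raw input into rows, one row per frame."""
    for camera, frame_id, objects in raw_frames:
        yield {"camera": camera, "frame": frame_id, "objects": objects}


def processor(rows: Iterable[Row]) -> Iterator[Row]:
    """Row-wise transform: emit one row per detected object."""
    for row in rows:
        for label in row["objects"]:
            yield {"camera": row["camera"], "frame": row["frame"], "label": label}


def reducer(rows: Iterable[Row]) -> Dict[Tuple[str, str], int]:
    """Group rows by (camera, label) and count them."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for row in rows:
        counts[(row["camera"], row["label"])] += 1
    return dict(counts)


def combiner(counts: Dict[Tuple[str, str], int], locations: Dict[str, str]) -> List[Row]:
    """Join the aggregated counts with per-camera metadata."""
    return [
        {"location": locations[camera], "label": label, "count": n}
        for (camera, label), n in sorted(counts.items())
    ]


raw = [("cam0", 0, ["car"]), ("cam0", 1, ["car", "person"]), ("cam1", 0, ["person"])]
locations = {"cam0": "gate", "cam1": "lobby"}
result = combiner(reducer(processor(extractor(raw))), locations)
```

Because each stage has a declared shape, a query optimizer can reason about the stages separately, for example by reordering them or sharing their outputs between queries.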

This paper tries to find all frames that contain certain objects. It uses cheap models to reduce ingest-time cost, and identifies the same or similar objects to reduce query-time latency.
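The split between ingest time and query time can be sketched as follows, under strong simplifying assumptions: each object is reduced to a single scalar feature, the cheap model returns a set of candidate labels, and the expensive model runs at query time on only one representative per cluster. All function names, thresholds, and labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def cheap_candidates(feature: float) -> Set[str]:
    """Hypothetical cheap model: a coarse set of possible labels."""
    return {"car", "truck"} if feature < 0.5 else {"person", "bike"}


def expensive_label(feature: float) -> str:
    """Hypothetical accurate model, assumed costly to run."""
    if feature < 0.25:
        return "car"
    if feature < 0.5:
        return "truck"
    return "person"


def ingest(objects: List[Tuple[int, float]], threshold: float = 0.05) -> List[Dict]:
    """Cluster near-identical objects and index each cluster with cheap labels."""
    clusters: List[Dict] = []
    for frame_id, feature in objects:
        for cluster in clusters:
            if abs(cluster["feature"] - feature) <= threshold:
                cluster["frames"].append(frame_id)
                break
        else:
            clusters.append(
                {"feature": feature, "frames": [frame_id], "candidates": cheap_candidates(feature)}
            )
    return clusters


def query(clusters: List[Dict], label: str) -> Tuple[List[int], int]:
    """Run the expensive model once per candidate cluster, not once per object."""
    frames: List[int] = []
    expensive_calls = 0
    for cluster in clusters:
        if label not in cluster["candidates"]:
            continue
        expensive_calls += 1
        if expensive_label(cluster["feature"]) == label:
            frames.extend(cluster["frames"])
    return sorted(set(frames)), expensive_calls


index = ingest([(0, 0.10), (1, 0.12), (2, 0.40), (3, 0.80), (4, 0.11)])
car_frames, calls = query(index, "car")
```

Here five objects collapse into three clusters, and the query for "car" runs the expensive model on only two cluster representatives.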

Through resource management

Through edge

Scale up training

Here is the [paper link]. This paper argues that people train multiple models based on the same backbone on the same videos, so the backbone can be shared across models.
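Backbone sharing can be sketched with a toy training loop in which the backbone output for each frame is computed once and reused by every task head. The sketch assumes a frozen backbone and scalar linear heads trained with plain SGD. Both are simplifications chosen for illustration, not details taken from the paper.

```python
from typing import Dict, List


def backbone(frame: List[float]) -> float:
    """Hypothetical frozen feature extractor; counts how often it runs."""
    backbone.calls += 1
    return sum(frame) / len(frame)


backbone.calls = 0


def train_heads_shared(
    frames: List[List[float]],
    targets_per_task: Dict[str, List[float]],
    lr: float = 0.1,
    epochs: int = 20,
) -> Dict[str, float]:
    """Train one scalar linear head per task on shared backbone features."""
    weights = {task: 0.0 for task in targets_per_task}
    for _ in range(epochs):
        for i, frame in enumerate(frames):
            feature = backbone(frame)  # computed once, reused by all heads
            for task, targets in targets_per_task.items():
                prediction = weights[task] * feature
                # Gradient of the squared error (prediction - target)^2.
                gradient = 2 * (prediction - targets[i]) * feature
                weights[task] -= lr * gradient
    return weights


frames = [[1.0, 1.0], [2.0, 2.0], [0.5, 1.5]]
targets = {"task_a": [2.0, 4.0, 2.0], "task_b": [-1.0, -2.0, -1.0]}
weights = train_heads_shared(frames, targets)
```

With two tasks, training each model separately would run the backbone twice per frame. The shared loop runs it once per frame regardless of how many heads are attached.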

Dataset and benchmarking

This paper argues that, to benchmark video analytic databases, both the generation of benchmark videos and the annotation of those videos should be automatic. The authors therefore generate videos synthetically.