Improved doc
gineshidalgo99 committed Nov 21, 2018
1 parent 3763619 commit 174faa3
Showing 13 changed files with 178 additions and 84 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -131,7 +131,7 @@ Output (format, keypoint index ordering, etc.) in [doc/output.md](doc/output.md)


## Speeding Up OpenPose and Benchmark
Check the OpenPose Benchmark as well as some hints to speed up and/or reduce the memory requirements for OpenPose on [doc/faq.md#speed-up-memory-reduction-and-benchmark](doc/faq.md#speed-up-memory-reduction-and-benchmark).
Check the OpenPose Benchmark as well as some hints to speed up and/or reduce the memory requirements for OpenPose on [doc/speed_up_preserving_accuracy.md](doc/speed_up_preserving_accuracy.md).



45 changes: 19 additions & 26 deletions doc/faq.md
@@ -5,17 +5,17 @@ OpenPose - Frequently Asked Question (FAQ)
1. [FAQ](#faq)
1. [Out of Memory Error](#out-of-memory-error)
2. [Speed Up, Memory Reduction, and Benchmark](#speed-up-memory-reduction-and-benchmark)
3. [Estimating FPS without Display](#estimating-fps-without-display)
4. [Webcam Slower than Images](#webcam-slower-than-images)
5. [Video/Webcam Not Working](#videowebcam-not-working)
6. [Cannot Find OpenPose.dll Error](#cannot-find-openpose.dll-error-windows)
7. [Free Invalid Pointer Error](#free-invalid-pointer-error)
8. [Source Directory does not Contain CMakeLists.txt (Windows)](#source-directory-does-not-contain-cmakelists.txt-windows)
9. [How Should I Link my IP Camera?](#how-should-i-link-my-ip-camera)
10. [Difference between BODY_25 vs. COCO vs. MPI](#difference-between-body_25-vs.-coco-vs.-mpi)
11. [How to Measure the Latency Time?](#how-to-measure-the-latency-time)
12. [Zero People Detected](#zero-people-detected)
13. [CPU Version Too Slow](#cpu-version-too-slow)
3. [CPU Version Too Slow](#cpu-version-too-slow)
4. [Profiling Speed and Estimating FPS without Display](#profiling-speed-and-estimating-fps-without-display)
5. [Webcam Slower than Images](#webcam-slower-than-images)
6. [Video/Webcam Not Working](#videowebcam-not-working)
7. [Cannot Find OpenPose.dll Error](#cannot-find-openpose.dll-error-windows)
8. [Free Invalid Pointer Error](#free-invalid-pointer-error)
9. [Source Directory does not Contain CMakeLists.txt (Windows)](#source-directory-does-not-contain-cmakelists.txt-windows)
10. [How Should I Link my IP Camera?](#how-should-i-link-my-ip-camera)
11. [Difference between BODY_25 vs. COCO vs. MPI](#difference-between-body_25-vs.-coco-vs.-mpi)
12. [How to Measure the Latency Time?](#how-to-measure-the-latency-time)
13. [Zero People Detected](#zero-people-detected)



@@ -32,18 +32,18 @@ OpenPose - Frequently Asked Question (FAQ)
### Speed Up, Memory Reduction, and Benchmark
**Q: Low speed** - OpenPose is quite slow, is it normal? How can I speed it up?

**A**: Check the [OpenPose Benchmark](https://docs.google.com/spreadsheets/d/1-DynFGvoScvfWDA1P4jDInCkbD4lg0IKOYbXgEq0sK0/edit#gid=0) to discover the approximate speed of your graphics card. Some speed tips:
**A**: Check [doc/speed_up_preserving_accuracy.md](./speed_up_preserving_accuracy.md) for the approximate speed of your graphics card and for some speed tips.

1. Use cuDNN 5.1 (cuDNN 6 is ~10% slower).
2. Reduce the `--net_resolution` (e.g., to 320x176) (lower accuracy). Note: For maximum accuracy, follow [doc/quick_start.md#maximum-accuracy-configuration](./quick_start.md#maximum-accuracy-configuration).
3. For face, reduce the `--face_net_resolution`. The resolution 320x320 usually works pretty decently.
4. Use the `MPI_4_layers` model (lower accuracy and lower number of parts).
5. Change GPU rendering by CPU rendering to get approximately +0.5 FPS (`--render_pose 1`).
6. Points 2-4 will also help reducing GPU memory (or RAM memory for CPU version).


### CPU Version Too Slow
**Q: The CPU version is insanely slow compared to the GPU version.**

**A**: Check [doc/speed_up_preserving_accuracy.md#cpu-version](./speed_up_preserving_accuracy.md#cpu-version) for the approximate speed and some speed tips.

### Estimating FPS without Display


### Profiling Speed and Estimating FPS without Display
Check the [doc/installation.md#profiling-speed](./installation.md#profiling-speed) section.


@@ -112,10 +112,3 @@ COCO model will eventually be removed. BODY_25 model is faster, more accurate, a
**Q: 0 people detected and displayed in default video and images.**

**A**: This problem occurs when the caffemodel has not been properly downloaded, e.g., if the connection drops while downloading the models. Please remove the current models in the model folder and download them manually from the links in [doc/installation.md](./installation.md). Alternatively, remove them and re-run CMake.



### CPU Version Too Slow
**Q: The CPU version is insanely slow compared to the GPU version.**

**A**: Yes, that is expected. The CPU version runs at about 0.3 FPS on the COCO model, and at about 0.1 FPS (i.e., about 15 sec / frame) on the default BODY_25 model. Switch to COCO model and/or reduce the `net_resolution` as indicated in [Speed Up, Memory Reduction, and Benchmark](#speed-up-memory-reduction-and-benchmark). Contradictory fact: BODY_25 model is about 5x slower than COCO on CPU-only version, but it is about 40% faster on GPU version.
4 changes: 3 additions & 1 deletion doc/installation.md
@@ -270,6 +270,8 @@ OpenPose displays the FPS in the basic GUI. However, more complex speed metrics
- Time measurement for 1 graphics card: The FPS is given by the slowest time displayed in your terminal command line (as OpenPose is multi-threaded). Times are in milliseconds, so `FPS = 1000/millisecond_measurement`.
- Time measurement for >1 graphics cards: Assuming `n` graphics cards, you will have to wait up to `n` x `F` frames to visualize each graphics card speed (as the frames are split among them). In addition, the FPS would be: `FPS = minFPS(speed_per_GPU/n, worst_time_measurement_other_than_GPUs)`. For < 4 GPUs, this is usually `FPS = speed_per_GPU/n`.

Make sure that the `wPoseExtractor` time is the slowest timing. If it is not, the bottleneck is either the input producer (video/webcam codec issues with OpenCV, images too big, etc.) or the GUI display, and that stage might not be optimized (for the GUI, use OpenGL support as detailed in [doc/speed_up_preserving_accuracy.md](./speed_up_preserving_accuracy.md)).
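
The timing rules above can be sketched in a few lines of C++. This is only an illustration, not OpenPose code: the helper names `fpsFromMilliseconds` and `estimatePipelineFps` are hypothetical, and the multi-GPU function encodes one reading of the rule, namely that the per-GPU time is divided by `n` and the slowest stage sets the overall FPS.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not part of OpenPose): converts a per-frame time in
// milliseconds into frames per second, i.e., FPS = 1000 / milliseconds.
double fpsFromMilliseconds(const double milliseconds)
{
    assert(milliseconds > 0.);
    return 1000. / milliseconds;
}

// Hypothetical helper, assuming the frames are split evenly among the GPUs:
// the effective GPU time per frame is gpuMilliseconds / numberGpus, and the
// pipeline runs at the pace of its slowest stage (GPU or anything else).
double estimatePipelineFps(const double gpuMilliseconds, const int numberGpus,
                           const double worstOtherMilliseconds)
{
    assert(numberGpus > 0);
    const double effectiveGpuMilliseconds = gpuMilliseconds / numberGpus;
    return fpsFromMilliseconds(std::max(effectiveGpuMilliseconds, worstOtherMilliseconds));
}
```

For instance, under these assumptions a 100 ms per-GPU time on 2 GPUs with a 10 ms producer gives 20 FPS, while on 4 GPUs with a 40 ms producer the producer becomes the bottleneck and the estimate is 25 FPS.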



#### Faster GUI Display
Expand Down Expand Up @@ -304,7 +306,7 @@ export MKL_NUM_THREADS="8"
export OMP_NUM_THREADS="8"
```

Do note that increasing the number of threads results in more memory use. You can check the [doc/faq.md#speed-up-memory-reduction-and-benchmark](./faq.md#speed-up-memory-reduction-and-benchmark) for more information about speed and memory requirements in several CPUs and GPUs.
Do note that increasing the number of threads results in more memory use. You can check [doc/speed_up_preserving_accuracy.md](./speed_up_preserving_accuracy.md) for more information about speed and memory requirements on several CPUs and GPUs.



1 change: 1 addition & 0 deletions doc/release_notes.md
@@ -289,6 +289,7 @@ OpenPose Library - Release Notes
18. Removed warnings from Spinnaker SDK at compiling time.
19. All bash scripts incorporate `#!/bin/bash` to tell the terminal that they are bash scripts.
20. Added flag `--verbose` to plot the progress.
21. Added `find_package(Protobuf)` to allow specific versions of Protobuf.
2. Functions or parameters renamed:
1. By default, python example `tutorial_developer/python_2_pose_from_heatmaps.py` was using 2 scales starting at -1x736, changed to 1 scale at -1x368.
2. WrapperStructPose default parameters changed to match those of the OpenPose demo binary.
46 changes: 46 additions & 0 deletions doc/speed_up_preserving_accuracy.md
@@ -0,0 +1,46 @@
OpenPose - Maximizing the OpenPose Speed
========================================================================================

## Contents
1. [OpenPose Benchmark](#openpose-benchmark)
2. [Profiling Speed](#profiling-speed)
3. [Speed Up Preserving Accuracy](#speed-up-preserving-accuracy)
4. [Speed Up and Memory Reduction](#speed-up-and-memory-reduction)
5. [CPU Version](#cpu-version)





## OpenPose Benchmark
Check the [OpenPose Benchmark](https://docs.google.com/spreadsheets/d/1-DynFGvoScvfWDA1P4jDInCkbD4lg0IKOYbXgEq0sK0/edit#gid=0) to discover the approximate expected speed of your graphics card.



## Profiling Speed
Check the [doc/installation.md#profiling-speed](./installation.md#profiling-speed) section to measure the bottlenecks in your OpenPose distribution and make sure everything is working as expected.



## Speed Up Preserving Accuracy
Some tips to maximize the OpenPose runtime speed while preserving the accuracy (do not expect miracles, but they might boost the frame rate a bit):

1. Enable the `WITH_OPENCV_WITH_OPENGL` flag in CMake to have a much faster GUI display (but you must also compile OpenCV with OpenGL support). Note: Default OpenCV in Ubuntu (from apt-get install) does have OpenGL support included. Nevertheless, default Windows portable binaries do not.
2. Switch from GPU rendering to CPU rendering to get approximately +0.5 FPS (`--render_pose 1`).
3. Use cuDNN 5.1 (cuDNN 6 is ~10% slower).
4. Use the `BODY_25` model for simultaneously maximum speed and accuracy (both COCO and MPII models are slower and less accurate).



## Speed Up and Memory Reduction
Some tips to substantially increase the OpenPose speed, at the cost of accuracy:

1. Reduce the `--net_resolution` (e.g., to 320x176) (lower accuracy). Note: For maximum accuracy, follow [doc/quick_start.md#maximum-accuracy-configuration](./quick_start.md#maximum-accuracy-configuration).
2. For face, reduce the `--face_net_resolution`. The resolution 320x320 usually works reasonably well.
3. Points 1-2 will also reduce the GPU memory usage (or RAM usage for the CPU version).
4. Use the `BODY_25` model for maximum speed. Use the `MPI_4_layers` model for minimum GPU memory usage (but lower accuracy, speed, and number of parts).



## CPU Version
The CPU version runs at about 0.3 FPS on the COCO model, and at about 0.1 FPS (i.e., about 15 sec / frame) on the default BODY_25 model. Switch to the COCO model and/or reduce the `net_resolution` as indicated above. Counterintuitive fact: the BODY_25 model is about 5x slower than COCO on the CPU-only version, but about 40% faster on the GPU version.
10 changes: 5 additions & 5 deletions examples/tests/pose_accuracy_coco_val.sh
@@ -15,16 +15,16 @@ JSON_FOLDER=../evaluation/coco_val_jsons/
OP_BIN=./build/examples/openpose/openpose.bin

# 1 scale
$OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1.json
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_max.json \
$OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1.json --write_coco_foot_json ${JSON_FOLDER}1_foot.json
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_max.json --write_coco_foot_json ${JSON_FOLDER}1_foot_max.json \
# --maximize_positives

# # 3 scales
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_3.json \
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_3.json --write_coco_foot_json ${JSON_FOLDER}3_foot.json \
# --scale_number 3 --scale_gap 0.25

# # 4 scales
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_4.json \
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_4.json --write_coco_foot_json ${JSON_FOLDER}4_foot.json \
# --scale_number 4 --scale_gap 0.25 --net_resolution "1312x736"
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_4_max.json \
# $OP_BIN --image_dir $IMAGE_FOLDER --display 0 --render_pose 0 --cli_verbose 0.2 --write_coco_json ${JSON_FOLDER}1_4_max.json --write_coco_foot_json ${JSON_FOLDER}4_foot_max.json \
# --scale_number 4 --scale_gap 0.25 --net_resolution "1312x736" --maximize_positives
1 change: 1 addition & 0 deletions include/openpose/pose/enumClasses.hpp
@@ -26,6 +26,7 @@ namespace op
BODY_25D, /**< Experimental. Do not use. */
BODY_23, /**< Experimental. Do not use. */
CAR_22, /**< Experimental. Do not use. */
BODY_19E, /**< Experimental. Do not use. */
Size,
};

117 changes: 72 additions & 45 deletions src/openpose/filestream/cocoJsonSaver.cpp
@@ -44,36 +44,12 @@ namespace op
// Sanity check
if ((size_t)poseKeypoints.getSize(0) != poseScores.getVolume())
error("Dimension mismatch between poseKeypoints and poseScores.", __LINE__, __FUNCTION__, __FILE__);
// Fixed variables
const auto numberPeople = poseKeypoints.getSize(0);
const auto numberBodyParts = poseKeypoints.getSize(1);
const auto imageId = getLastNumber(imageName);
for (auto person = 0 ; person < numberPeople ; person++)
if (numberPeople > 0)
{
// Comma at any moment but first element
if (mFirstElementAdded)
{
mJsonOfstream.comma();
mJsonOfstream.enter();
}
else
mFirstElementAdded = true;

// New element
mJsonOfstream.objectOpen();

// image_id
mJsonOfstream.key("image_id");
mJsonOfstream.plainText(imageId);
mJsonOfstream.comma();

// category_id
mJsonOfstream.key("category_id");
mJsonOfstream.plainText("1");
mJsonOfstream.comma();

// keypoints - i.e., poseKeypoints
mJsonOfstream.key("keypoints");
mJsonOfstream.arrayOpen();
const auto numberBodyParts = poseKeypoints.getSize(1);
// Get indexesInCocoOrder
std::vector<int> indexesInCocoOrder;
// Body/car
if (mCocoJsonFormat == CocoJsonFormat::Body)
@@ -117,27 +93,78 @@ namespace op
if (indexesInCocoOrder.empty())
error("Invalid number of body parts (" + std::to_string(numberBodyParts) + ").",
__LINE__, __FUNCTION__, __FILE__);
for (auto bodyPart = 0u ; bodyPart < indexesInCocoOrder.size() ; bodyPart++)
// Save on JSON file
const auto imageId = getLastNumber(imageName);
for (auto person = 0 ; person < numberPeople ; person++)
{
const auto finalIndex = 3*(person*numberBodyParts + indexesInCocoOrder.at(bodyPart));
const auto validPoint = (poseKeypoints[finalIndex+2] > 0.f);
mJsonOfstream.plainText(validPoint ? poseKeypoints[finalIndex] : -1.f);
mJsonOfstream.comma();
mJsonOfstream.plainText(validPoint ? poseKeypoints[finalIndex+1] : -1.f);
mJsonOfstream.comma();
mJsonOfstream.plainText(validPoint ? 1 : 0);
// mJsonOfstream.plainText(poseKeypoints[finalIndex+2]); // For debugging
if (bodyPart < indexesInCocoOrder.size() - 1u)
bool foundAtLeast1Keypoint = true;
// Foot
if (mCocoJsonFormat == CocoJsonFormat::Foot)
{
// At least 1 valid keypoint?
foundAtLeast1Keypoint = false;
for (auto bodyPart = 0u ; bodyPart < indexesInCocoOrder.size() ; bodyPart++)
{
const auto finalIndex = 3*(person*numberBodyParts + indexesInCocoOrder.at(bodyPart));
const auto validPoint = (poseKeypoints[finalIndex+2] > 0.f);
if (validPoint)
{
foundAtLeast1Keypoint = true;
break;
}
}
}

if (foundAtLeast1Keypoint)
{
// Comma at any moment but first element
if (mFirstElementAdded)
{
mJsonOfstream.comma();
mJsonOfstream.enter();
}
else
mFirstElementAdded = true;

// New element
mJsonOfstream.objectOpen();

// image_id
mJsonOfstream.key("image_id");
mJsonOfstream.plainText(imageId);
mJsonOfstream.comma();
}
mJsonOfstream.arrayClose();
mJsonOfstream.comma();

// score
mJsonOfstream.key("score");
mJsonOfstream.plainText(poseScores[person]);
// category_id
mJsonOfstream.key("category_id");
mJsonOfstream.plainText("1");
mJsonOfstream.comma();

mJsonOfstream.objectClose();
// keypoints - i.e., poseKeypoints
mJsonOfstream.key("keypoints");
mJsonOfstream.arrayOpen();
for (auto bodyPart = 0u ; bodyPart < indexesInCocoOrder.size() ; bodyPart++)
{
const auto finalIndex = 3*(person*numberBodyParts + indexesInCocoOrder.at(bodyPart));
const auto validPoint = (poseKeypoints[finalIndex+2] > 0.f);
mJsonOfstream.plainText(validPoint ? poseKeypoints[finalIndex] : -1.f);
mJsonOfstream.comma();
mJsonOfstream.plainText(validPoint ? poseKeypoints[finalIndex+1] : -1.f);
mJsonOfstream.comma();
mJsonOfstream.plainText(validPoint ? 1 : 0);
// mJsonOfstream.plainText(poseKeypoints[finalIndex+2]); // For debugging
if (bodyPart < indexesInCocoOrder.size() - 1u)
mJsonOfstream.comma();
}
mJsonOfstream.arrayClose();
mJsonOfstream.comma();

// score
mJsonOfstream.key("score");
mJsonOfstream.plainText(poseScores[person]);

mJsonOfstream.objectClose();
}
}
}
}
catch (const std::exception& e)
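
The indexing and the foot-only filter in the hunk above can be illustrated in isolation. The sketch below is not the OpenPose implementation: it replaces the keypoint array with a plain `std::vector<float>` laid out as people x body parts x (x, y, score), and the function names `keypointOffset` and `hasAtLeastOneValidKeypoint` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of the (x, y, score) triplet of one body part of one person inside a
// flattened people x bodyParts x 3 array.
std::size_t keypointOffset(const std::size_t person, const std::size_t numberBodyParts,
                           const std::size_t bodyPart)
{
    return 3 * (person * numberBodyParts + bodyPart);
}

// Returns true if any of the selected body parts of this person has a score
// above 0, i.e., the person has something worth writing to the JSON file.
bool hasAtLeastOneValidKeypoint(const std::vector<float>& keypoints, const std::size_t person,
                                const std::size_t numberBodyParts,
                                const std::vector<std::size_t>& partIndexes)
{
    for (const auto bodyPart : partIndexes)
    {
        const auto offset = keypointOffset(person, numberBodyParts, bodyPart);
        assert(offset + 2 < keypoints.size());
        if (keypoints[offset + 2] > 0.f)
            return true;
    }
    return false;
}
```

Checking validity before opening the JSON object is what lets a saver skip a person entirely, so no empty entries or dangling commas end up in the output.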
5 changes: 3 additions & 2 deletions src/openpose/net/bodyPartConnectorBase.cpp
@@ -65,8 +65,9 @@ namespace op
{
try
{
if (poseModel == PoseModel::BODY_25E)
error("BODY_25E not implemented for CPU body connector.", __LINE__, __FUNCTION__, __FILE__);
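// Reject every model outside the supported list. The tests must be joined
// with &&: joined with ||, the condition is true for every model.
if (poseModel != PoseModel::BODY_25 && poseModel != PoseModel::COCO_18
&& poseModel != PoseModel::MPI_15 && poseModel != PoseModel::MPI_15_4)
error("Model not implemented for CPU body connector.", __LINE__, __FUNCTION__, __FILE__);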

// std::vector<std::pair<std::vector<int>, double>> refers to:
// - std::vector<int>: [body parts locations, #body parts found]
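
A guard that rejects every model outside an allow-list has to combine its `!=` tests with `&&` (De Morgan: "not A and not B" is the negation of "A or B"). The sketch below shows the difference using a hypothetical `Model` enum that only stands in for `op::PoseModel`; none of these names exist in OpenPose.

```cpp
#include <cassert>

// Hypothetical enum for illustration only; the real op::PoseModel has more entries.
enum class Model { Body25, Coco18, Mpi15, Mpi15_4, Body25E };

// True only for the models on the allow-list.
bool isSupportedOnCpu(const Model model)
{
    return model == Model::Body25 || model == Model::Coco18
        || model == Model::Mpi15 || model == Model::Mpi15_4;
}

// Correct rejection test: the model differs from every allowed value.
bool isRejectedWithAnd(const Model model)
{
    return model != Model::Body25 && model != Model::Coco18
        && model != Model::Mpi15 && model != Model::Mpi15_4;
}

// Faulty rejection test: a value cannot equal two different enumerators at
// once, so at least one != holds and the result is true for every input.
bool isRejectedWithOr(const Model model)
{
    return model != Model::Body25 || model != Model::Coco18
        || model != Model::Mpi15 || model != Model::Mpi15_4;
}
```

Writing the check as `!isSupportedOnCpu(model)` avoids the pitfall altogether, because the allow-list is then stated once in its positive form.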