'OpenPose', a human pose estimation algorithm, has been implemented using TensorFlow. This repo also provides several variants with changes to the network structure for real-time processing on the CPU or on low-power embedded devices.
You can even run this on your MacBook with decent FPS!
Original Repo(Caffe) : https://github.com/CMU-Perceptual-Computing-Lab/openpose
Implemented features are listed here : features
You need the dependencies below.
- python3
- tensorflow 1.4.1+
- opencv3, protobuf, python3-tk
$ git clone https://www.github.com/ildoonet/tf-openpose
$ cd tf-openpose
$ pip3 install -r requirements.txt
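After installing, a quick sanity check can confirm that the dependencies are visible to Python. The sketch below is not part of this repo. The helper `missing_modules` is hypothetical, and the import names (`cv2` for opencv3, `google.protobuf` for protobuf, `tkinter` for python3-tk) are assumptions about how those packages are exposed.

```python
import importlib.util

# Import names assumed for the packages listed above:
# tensorflow -> "tensorflow", opencv3 -> "cv2",
# protobuf -> "google.protobuf", python3-tk -> "tkinter".
REQUIRED_MODULES = ["tensorflow", "cv2", "google.protobuf", "tkinter"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent
            # or a module has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty list means every module was located. The check does not verify versions, so the `tensorflow 1.4.1+` requirement still has to be checked separately.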
I have tried multiple model variations to find an optimized network architecture. Some of them are listed below, and checkpoint files are provided for research purposes.
- cmu
  - The model is based on the pretrained VGG network described in the original paper.
  - I converted the weights from Caffe format for use in TensorFlow.
  - pretrained weight download
- dsconv
  - Same architecture as the cmu version, except that it uses the depthwise separable convolutions of mobilenet.
  - I trained it using 'transfer learning', but its speed and accuracy are not good enough.
- mobilenet
  - Based on the mobilenet paper, 12 convolutional layers are used as feature-extraction layers.
  - To improve results on small persons, minor modifications to the architecture have been made.
  - Three models were trained with different network size parameters.
    - mobilenet
      - 368x368 : checkpoint weight download
    - mobilenet_fast
    - mobilenet_accurate
  - The published models are not the best ones, but you can test them before training a model from scratch.
Before running the demo, you should download the graph files. You can deploy these graphs on mobile or other platforms.
- cmu_640x360
- cmu_640x480
- mobilenet_thin_432x368
CMU's model graphs are too large for git, so I uploaded them to Dropbox. Download them if you want to use CMU's original model.
$ cd models/graph/cmu_640x360
$ bash download.sh
$ cd ../cmu_640x480
$ bash download.sh
Dataset | Model | Inference Time (Macbook Pro i5 3.1G) | Inference Time (Jetson TX2) |
---|---|---|---|
Coco | cmu | 10.0s @ 368x368 | OOM @ 368x368, 5.5s @ 320x240 |
Coco | dsconv | 1.10s @ 368x368 | |
Coco | mobilenet_accurate | 0.40s @ 368x368 | 0.18s @ 368x368 |
Coco | mobilenet | 0.24s @ 368x368 | 0.10s @ 368x368 |
Coco | mobilenet_fast | 0.16s @ 368x368 | 0.07s @ 368x368 |
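To relate the inference times above to frame rates, the latency per image can be inverted. This is a rough sketch: it assumes frames are processed one at a time and ignores capture and drawing overhead, so real-world FPS will be somewhat lower. The helper name `frames_per_second` is hypothetical.

```python
# Per-image inference times in seconds at 368x368 on the Macbook Pro,
# copied from the table above.
MACBOOK_SECONDS = {
    "cmu": 10.0,
    "dsconv": 1.10,
    "mobilenet_accurate": 0.40,
    "mobilenet": 0.24,
    "mobilenet_fast": 0.16,
}


def frames_per_second(seconds_per_frame):
    """Convert a per-frame latency into throughput (frames per second)."""
    if seconds_per_frame <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / seconds_per_frame


if __name__ == "__main__":
    for model, seconds in MACBOOK_SECONDS.items():
        print(f"{model}: {frames_per_second(seconds):.2f} FPS")
```

For example, 0.16s per image for mobilenet_fast corresponds to 6.25 FPS, while 10.0s for cmu corresponds to 0.1 FPS.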
You can test the inference feature with a single image.
$ python3 run.py --model=mobilenet_thin_432x368 --image=...
The --image path MUST be relative to the src folder and must not contain "~", e.g.:
--image ../../Desktop
Then you will see a screen like the one below with the pafmap, heatmap, result, etc.
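Because the `--image` path has to be relative to the src folder and free of "~", it can help to compute that path rather than write it by hand. The sketch below uses only the standard library. The helper `image_flag_value` and the example directory layout are hypothetical and not part of this repo.

```python
import os


def image_flag_value(image_path, src_dir):
    """Return `image_path` rewritten relative to `src_dir`, with "~" expanded."""
    absolute_image = os.path.abspath(os.path.expanduser(image_path))
    absolute_src = os.path.abspath(os.path.expanduser(src_dir))
    return os.path.relpath(absolute_image, start=absolute_src)


if __name__ == "__main__":
    # Hypothetical layout: the repo is cloned into the home directory.
    print(image_flag_value("~/Desktop/a.jpg", "~/tf-openpose/src"))
```

The printed value can then be passed to `--image` when running `run.py` from the src folder.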
$ python3 run_webcam.py --model=mobilenet_thin_432x368 --camera=0
Then you will see the realtime webcam screen with estimated poses as below. This realtime result was recorded on a MacBook Pro 13" with a 3.1GHz dual-core CPU.
This pose estimator provides simple python classes that you can use in your applications.
See run.py or run_webcam.py as references.
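# Build the estimator from the graph file of the selected model.
e = TfPoseEstimator(get_graph_path(args.model), target_size=(w, h))
# Run pose estimation on the image.
humans = e.inference(image)
# Draw the estimated poses onto the image.
image = TfPoseEstimator.draw_humans(image, humans, imgcopy=False)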
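The `w` and `h` passed as `target_size` above are not defined in the snippet. One way to obtain them, assuming the model name ends in `_<width>x<height>` as in `mobilenet_thin_432x368`, is to parse them from the name. The helper `parse_target_size` below is a hypothetical illustration, not necessarily how run.py does it.

```python
def parse_target_size(model_name):
    """Extract (width, height) from a name ending in "_<width>x<height>"."""
    # Take the part after the last underscore, e.g. "432x368".
    size_token = model_name.rsplit("_", 1)[-1]
    # Assumption: the first number is the width, the second the height.
    width_text, height_text = size_token.split("x")
    return int(width_text), int(height_text)


if __name__ == "__main__":
    w, h = parse_target_size("mobilenet_thin_432x368")
    print(w, h)
```

A name without a size suffix raises `ValueError`, which makes a mistyped `--model` value visible early.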
See : etcs/ros.md
See : etcs/training.md
[1] https://github.com/CMU-Perceptual-Computing-Lab/openpose
[2] Training Codes : https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation
[3] Custom Caffe by Openpose : https://github.com/CMU-Perceptual-Computing-Lab/caffe_train
[4] Keras Openpose : https://github.com/michalfaber/keras_Realtime_Multi-Person_Pose_Estimation
[1] Arxiv Paper : https://arxiv.org/abs/1701.00295
[2] https://github.com/DenisTome/Lifting-from-the-Deep-release
[1] Original Paper : https://arxiv.org/abs/1704.04861
[2] Pretrained model : https://github.com/tensorflow/models/blob/master/slim/nets/mobilenet_v1.md
[1] Tensorpack : https://github.com/ppwwyyxx/tensorpack
[1] Freeze graph : https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py
[2] Optimize graph : https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2