luoyc a9c35a4807 opencv source code commit 1 tahun lalu
..
doc a9c35a4807 opencv source code commit 1 tahun lalu
include a9c35a4807 opencv source code commit 1 tahun lalu
samples a9c35a4807 opencv source code commit 1 tahun lalu
scripts a9c35a4807 opencv source code commit 1 tahun lalu
src a9c35a4807 opencv source code commit 1 tahun lalu
tutorials a9c35a4807 opencv source code commit 1 tahun lalu
CMakeLists.txt a9c35a4807 opencv source code commit 1 tahun lalu
README.md a9c35a4807 opencv source code commit 1 tahun lalu

README.md

Object Detection using Convolutional Neural Networks

This module uses Convolutional Neural Networks for detecting objects in an image

Dependencies

  • opencv dnn module
  • Google Protobuf

Building this module

Run the following command to build this module:

cmake -DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -Dopencv_dnn_objdetect=ON <opencv_source_dir>

Models

There are two models which are trained.

SqueezeNet model trained for Image Classification.

  • This model was trained for 1500000 iterations with a batch size of 16
  • Size of Model: 4.9MB
  • Top-1 Accuracy on ImageNet 2012 DataSet: 56.10%
  • Top-5 Accuracy on ImageNet 2012 DataSet: 79.54%
  • Link to trained weights: here (copy)

SqueezeDet model trained for Object Detection

  • This model was trained for 180000 iterations with a batch size of 16
  • Size of the Model: 14.2MB
  • Link to the trained weights: here (copy)

Usage

With Caffe

For details pertaining to the usage of the model, have a look at this repository

You can infact train your own object detection models with the loss function which is implemented.

Without Caffe, using opencv's dnn module

tutorials/core_detect.cpp gives an example of how to use the model to predict the bounding boxes. tutorials/image_classification.cpp gives an example of how to use the model to classify an image.

Here's the brief summary of examples. For detailed usage and testing, refer tutorials directory.

Examples:

Image Classification

// Read the net along with it's trained weights
cv::dnn::net = cv::dnn::readNetFromCaffe(model_defn, model_weights);

// Read an image
cv::Mat image = cv::imread(image_file);

// Convert the image into blob
cv::Mat image_blob = cv::net::blobFromImage(image);

// Get the output of "predictions" layer
cv::Mat probs = net.forward("predictions");

probs is a 4-d tensor of shape [1, 1000, 1, 1] which is obtained after the application of softmax activation.

Object Detection

// Reading the network and weights, converting image to blob is same as Image Classification example.

// Forward through the network and collect blob data
cv::Mat delta_bboxs = net.forward("slice")[0];
cv::Mat conf_scores = net.forward("softmax");
cv::Mat class_scores = net.forward("sigmoid");

Three blobs aka delta_bbox, conf_scores, class_scores are post-processed in cv::dnn_objdetect::InferBbox class and the bounding boxes predicted.

InferBbox infer(delta_bbox, class_scores, conf_scores);
infer.filter();

infer.filter() returns vector of cv::dnn_objdetect::object of predictions. Here cv::dnn_objdetect::object is a structure containing the following elements.

typedef struct {
  int xmin, xmax;
  int ymin, ymax;
  int class_idx;
  std::string label_name;
  double class_prob;
} object;

For further details on post-processing refer this detailed blog-post.

Results from Object Detection

Refer tutorials directory for results.