From a98da8dc4624d0161b7800abc41c13372922c679 Mon Sep 17 00:00:00 2001
From: mike2ox
Date: Fri, 18 Oct 2019 17:45:31 +0900
Subject: [PATCH] Change Model, Dataset(#18)
- Faster R-CNN to SSD
- Deepfashion2 to VOC2012
---
Object_detection_tutorial_by_keras_API.md | 39 +++++++++++++++--------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/Object_detection_tutorial_by_keras_API.md b/Object_detection_tutorial_by_keras_API.md
index b757ba7..5630391 100644
--- a/Object_detection_tutorial_by_keras_API.md
+++ b/Object_detection_tutorial_by_keras_API.md
@@ -3,31 +3,44 @@
## Introduction
- 
+ 
-__Object Detection__ is one of the most popular computer vision technologies in many areas.(Face detection, Self-driving car etc) Recently, __Deep Learning__ technology has greatly influenced the Object Detection field, such as accuracy, performance improvement.
-There are several popular deep learning algorithms. __Faster R-CNN(2015)__, __YOLO(2015)__, __SSD(2015)__ and __RetinaNet(2017)__. In this tutorial, we will __Faster R-CNN(2015)__ to learn what object detection is.
+__Object Detection__ is one of the most widely used computer vision technologies, with applications such as face detection and self-driving cars. Recently, __Deep Learning__ has greatly improved the accuracy and performance of object detection.
+There are several popular deep learning algorithms: __Faster R-CNN(2015)__, __YOLO(2015)__, __SSD(2016)__ and __RetinaNet(2017)__. In this tutorial, we will use __SSD(2016)__ to learn what object detection is.
-#### _What's Faster R-CNN?_
-[Faster R-CNN(2015)](https://arxiv.org/pdf/1506.01497.pdf) is one of the R-CNN models that extracts Region Proposals **used by Region Proposal Network** and classifies them on the basis of CNN models.
+#### _What is SSD (Single Shot MultiBox Detector), and why use it?_
-[model picture]()
+
+
+
+
+In the image above, the models marked in red have shown excellent results in object detection.
+Among those models, we selected SSD for this tutorial for the following reasons:
+ - **Fast training** (SSD is a one-stage method and uses convolution layers in the `Extra Feature Layers`)
+ - **High detection accuracy** (SSD produces predictions at different scales from feature maps of multiple scales)
+ - **Pretrained weights available** (SSD weights trained on COCO are provided in the [Tensorflow github](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md))
+
+So, let's look at the SSD model structure.
+
+
+
+
-Before 'Faster R-CNN', Region proposals were extracted in raw image(R-CNN) or feature map(Fast R-CNN) using selvective serarch. However, This method is slower than gpu computation and cause to occur bottleneck because they operate on cpu computation outside the CNN model.
+First, look at the SSD structure image above. SSD consists of VGG16 and extra feature layers, and takes input images of size 300 x 300 x 3.
-In order to eliminate bottlenecks, 'Faster R-CNN' applied a CNN model(called **Region proposal network(RPN)**) to the algorithm to obtain region proposals. RPN takes as input a small window (3 X 3) of feature map passed by CNN model (just make feature map of raw image). Each window is mapped to a lower-dimensional feature(256 or 512). This feature is used 2 small networks. one is classifying object or none object, the other is regressing bbox locations.
-#### _What's Deepfashion2?_
+#### _Which dataset does this tutorial use?_
- 
+
+
- In fact, many object detection tutorials use famous dataset such as [COCO](http://cocodataset.org/), [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/) etc. But, this tutorial uses [DeepFashion2(2019)](https://arxiv.org/pdf/1901.07973.pdf). ~~DeepFashion2 is comprehensive fashion dataset that contains 491k images, each of which is richly labeled with style, scale, occlusion, zooming, viewpoint, bounding box, dense landmarks and pose, pixel-level masks, and pair of images of identical item from consumer and commercial store.~~(To be written by the responsible personnel.)
+ Many object detection tutorials use well-known datasets such as [COCO](http://cocodataset.org/) and [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/). Among them, we decided to use VOC2012. ~~insert voc2012 description~~
## Setting up Environments
-In order to running this tutorial(object detection based on deep learning), many development packages and environment settings are needed. **BUT, DON'T WORRY.** In this tutorial, you can easily set up a development environment on your computer or server using a docker. **JUST FOLLOW ME**
+Running this tutorial (object detection based on deep learning) requires many development packages and environment settings. **BUT, DON'T WORRY.** In this tutorial, you can easily set up a development environment on your computer or server using Docker. **Just follow this tutorial.**
-First, we use a Docker (OS : [Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/), [windows](https://docs.docker.com/docker-for-windows/)) to set up the developments package and environments required for deep learning development. ~~And, If you don't have Docker Hub ID, you can't download [our docker image](). so, you need to sign up in [Docker Hub](https://hub.docker.com/).~~ See below.
+First, we use Docker (OS: [Ubuntu](https://docs.docker.com/install/linux/docker-ce/ubuntu/), [Windows](https://docs.docker.com/docker-for-windows/)) to set up the development packages and environment required for deep learning development. See below.
```bash
# yolk/