DAY 78-100 DAYS MLCODE: Object Detection and Segmentation

January 27, 2019

In the past few blogs, we discussed object detection using ImageAI, TensorFlow, and YOLOv3 with OpenCV (cv2). In this blog, we'll implement object detection and segmentation using Mask R-CNN.

Mask R-CNN:

In 2017, the Mask R-CNN paper was published, introducing a flexible and general framework for object instance segmentation.

The aim of the paper was to solve the instance segmentation problem in computer vision. Mask R-CNN efficiently detects objects in an image or video while simultaneously generating a high-quality segmentation mask for each instance.

It extends Faster R-CNN by adding a branch that predicts an object mask in parallel with the existing branch for bounding-box recognition.
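
Conceptually, the network therefore produces three outputs in parallel for each region of interest. Here is a minimal sketch of the output shapes; the variable names are ours for illustration, not the paper's or any library's API:

import numpy as np

# Illustrative output shapes for N regions of interest and K classes
# (K = 81 for COCO, including background). Shapes follow the Mask R-CNN
# paper; the variable names are hypothetical.
N, K = 5, 81
class_scores = np.zeros((N, K))           # classification branch: one score per class
box_deltas = np.zeros((N, K, 4))          # box branch: per-class bounding-box refinement
masks = np.zeros((N, K, 28, 28))          # mask branch: one 28x28 mask per class per ROI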

Now, we are going to clone the Mask R-CNN GitHub repo to see how Mask R-CNN works.

First, we install the requirements. Mask R-CNN depends on pycocotools, so let's install it:
1) Install Cython (already pre-installed in Colab)
2) Clone and build the COCO API repo

!git clone https://github.com/waleedka/coco
!pip install -U setuptools
!pip install -U wheel
!make install -C coco/PythonAPI

The cell above clones the coco repository from GitHub and installs the build dependencies. Finally, it builds and installs the COCO API library from the directory /content/coco/PythonAPI.
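
As a quick optional sanity check, the import below will fail if the COCO API did not build and install correctly:

# If pycocotools built and installed correctly, this import succeeds
from pycocotools.coco import COCO
print("pycocotools is ready")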

Now it's time to clone the Mask_RCNN repo from GitHub.

!git clone https://github.com/matterport/Mask_RCNN

Change to the ./Mask_RCNN directory and download the pre-trained COCO weights:

cd ./Mask_RCNN
!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5
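
Because a failed wget is easy to overlook in a notebook, a small optional check confirms the weights file actually landed; the size shown in the comment is approximate:

import os

# Fail fast if the download did not complete
assert os.path.exists("mask_rcnn_coco.h5"), "Weights download failed"
print("Size: %.1f MB" % (os.path.getsize("mask_rcnn_coco.h5") / 1e6))  # roughly 250 MB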

Now, initialize the paths for the program. Note that we import os, sys, and skimage.io up front, since they are used here and in the detection step later:

import os
import sys
import skimage.io

# Root directory of the project
ROOT_DIR = os.getcwd()

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize

# Import COCO config
sys.path.append(os.path.join(ROOT_DIR, "samples/coco/"))  # To find local version
import coco

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")

Configurations

We’ll be using a model trained on the MS-COCO dataset. The configurations of this model are in the CocoConfig class in coco.py.

For inference, we modify the configuration slightly to fit the task: sub-class the CocoConfig class and override the attributes that need to change.

class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()

The output of the display() method is:
Configurations:
BACKBONE resnet101
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 1
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.7
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 1
IMAGE_CHANNEL_COUNT 3
IMAGE_MAX_DIM 1024
IMAGE_META_SIZE 93
IMAGE_MIN_DIM 800
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [1024 1024 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME coco
NUM_CLASSES 81
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
PRE_NMS_LIMIT 6000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 1000
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 200
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001
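
Among these settings, DETECTION_MIN_CONFIDENCE is a common one to tune: lowering it surfaces more (but noisier) detections. For example, a hypothetical variant of the config above:

class LowConfidenceConfig(coco.CocoConfig):
    # Same single-image inference setup as before
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1
    # Accept detections with at least 50% confidence instead of the default 70%
    DETECTION_MIN_CONFIDENCE = 0.5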

Create Model and Load Trained Weights

# Create model object in inference mode.
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

# Load weights trained on MS-COCO
model.load_weights(COCO_MODEL_PATH, by_name=True)
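
If you want to confirm the graph was built and the weights loaded, the matterport wrapper exposes the underlying Keras model as keras_model, so a quick check is:

# A parameter count on the underlying Keras model confirms the graph was built
print("Parameters: {:,}".format(model.keras_model.count_params()))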

Class Names

Download labels.txt and store the class names in class_name. Alternatively, we can load the class names from the COCO dataset itself, like below:

# Load COCO dataset (COCO_DIR must point to a local copy of the dataset)
dataset = coco.CocoDataset()
dataset.load_coco(COCO_DIR, "train")
dataset.prepare()

# Print class names
print(dataset.class_names)

Since we don't have the full COCO dataset locally, we download labels.txt instead and parse it:

!wget "https://raw.githubusercontent.com/nightrome/cocostuff/master/labels.txt"

# Load the COCO class labels from the downloaded file
labelsPath = os.path.sep.join([os.getcwd(), "labels.txt"])
LABELS = open(labelsPath).read().strip().split("\n")
class_name = []
for data in LABELS:
    head, tail = data.split(":")
    class_name.append(tail.strip())
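
Note that labels.txt comes from the COCO-Stuff project and may contain more entries than the 81 classes this model predicts; a quick length check shows whether the indices will line up:

# The model predicts config.NUM_CLASSES (81) classes; if the label list
# is longer, class IDs beyond the first entries will be mislabeled
print(len(class_name), "labels parsed; model expects", config.NUM_CLASSES)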

Download Random Image from Internet

!wget “http://s1.dmcdn.net/PIdu0/640×360-VfQ.jpg”

The input image (downloaded from Google Images):

Run Object Detection

# Load the image we downloaded above
file_names = '640×360-VfQ.jpg'
image = skimage.io.imread(file_names)

# Run detection
results = model.detect([image], verbose=1)

# Visualize results
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            class_name, r['scores'])

Processing 1 images
image                    shape: (360, 640, 3)         min:    0.00000  max:  255.00000  uint8
molded_images            shape: (1, 1024, 1024, 3)    min: -123.70000  max:  151.10000  float64
image_metas              shape: (1, 93)               min:    0.00000  max: 1024.00000  float64
anchors                  shape: (1, 261888, 4)        min:   -0.35390  max:    1.29134  float32
Output: objects detected along with their segmentation masks.
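
The results dict returned by detect() contains everything needed for further processing. For example, to print each detection's class name and confidence score:

# r['class_ids'] and r['scores'] line up by detection index
for class_id, score in zip(r['class_ids'], r['scores']):
    print(class_name[class_id], "%.3f" % score)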

In conclusion, Mask R-CNN works as expected, and it was easy to implement by cloning the repo from GitHub. In the next blog, we'll try to use the same Mask R-CNN on video. You can find today's code here.