SOLOv2: Dynamic, Faster, and Stronger (Paper Summary) - 개발 여정 - A Journey to be a Developer

Compared to SOLOv1

SOLOv1

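Instance segmentation based on object location and size
Category Branch: predicts the semantic category of each grid cell
Mask Branch: segments the object instance
Decoupled SOLO: addresses the problem that the number of instances is far smaller than S^2, so most of the S^2 mask channels go unused.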
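
To make the Decoupled SOLO point concrete, here is a minimal NumPy sketch. It assumes the formulation from the SOLO paper, in which the S^2-channel mask output is replaced by two S-channel outputs X and Y, and the mask for grid cell (i, j) is the element-wise product of channel j of X and channel i of Y. The function name, shapes, and random inputs are illustrative only, not the authors' code.

```python
import numpy as np

def decoupled_mask(x_maps, y_maps, i, j):
    """Mask for grid cell (i, j) from two S-channel outputs.

    x_maps, y_maps: arrays of shape (S, H, W) holding sigmoid probabilities.
    Assumption: cell (i, j) uses channel j of x_maps and channel i of y_maps.
    """
    return x_maps[j] * y_maps[i]

S, H, W = 4, 8, 8
rng = np.random.default_rng(0)
x_maps = rng.random((S, H, W))
y_maps = rng.random((S, H, W))

mask = decoupled_mask(x_maps, y_maps, i=1, j=2)

# Output channels needed: 2 * S for the decoupled head, S * S for the vanilla head.
channels_decoupled, channels_vanilla = 2 * S, S * S
```

With S = 4 the decoupled head needs 8 channels instead of 16; the gap widens quickly as the grid gets finer.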

SOLOv2 focuses on producing better object masks.

Mask learning
Matrix NMS
performance ↑, speed of inference ↑

Dynamic Instance Segmentation

M = F ∗ G

Mask Kernel Branch

Each grid cell produces a D-dimensional output, i.e., each grid cell gets its own D kernel parameters.
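
As a rough illustration, the D values predicted at one grid cell can be read as the weights of a convolution kernel that is applied to the mask feature F. The sketch below assumes a 1 x 1 kernel over a feature map with E channels, so D = E; the names `kernel_pred` and `feature` and all shapes are hypothetical.

```python
import numpy as np

S, E, H, W = 3, 16, 10, 12
D = 1 * 1 * E  # a 1x1 kernel over E input channels needs D = E weights

rng = np.random.default_rng(0)
kernel_pred = rng.standard_normal((S, S, D))  # mask kernel branch output G
feature = rng.standard_normal((E, H, W))      # mask feature F

# Take the kernel predicted at grid cell (i, j) and apply it as a 1x1 convolution,
# which is a weighted sum over the channel axis at every pixel.
i, j = 1, 2
g_ij = kernel_pred[i, j]                                    # shape (D,)
mask_logits = np.tensordot(g_ij, feature, axes=([0], [0]))  # shape (H, W)
```

A 3 x 3 kernel would need D = 9E weights per cell and a real spatial convolution instead of the channel-wise sum used here.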

Mask Feature Branch

Predicts the instance-aware feature map F
Where to place the Mask Feature Branch
- (a) a separate mask prediction at each FPN level
- (b) a single unified mask feature shared by all levels ⇢ adopted
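
Option (b) can be sketched as follows. This is a simplified illustration, assuming each FPN level is half the resolution of the previous one and that levels are brought to a common resolution and summed element-wise. The learned convolutions and normalisation layers of the real branch are omitted, and nearest-neighbour upsampling stands in for bilinear upsampling; the function names are hypothetical.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def unified_mask_feature(levels):
    """Fuse FPN levels into one map at the resolution of the first level.

    levels: list of (C, H_k, W_k) arrays, each half the size of the previous one.
    Simplification: learned convolutions and normalisation are left out.
    """
    fused = np.zeros_like(levels[0])
    for k, feat in enumerate(levels):
        fused += upsample_nearest(feat, 2 ** k)
    return fused

C = 4
# Four toy levels with spatial sizes 32, 16, 8 and 4.
levels = [np.ones((C, 32 // 2 ** k, 32 // 2 ** k)) for k in range(4)]
fused = unified_mask_feature(levels)
```

Every instance kernel is then applied to this one fused map, rather than to a separate map per FPN level.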

Loss
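
For reference, the SOLOv2 paper trains with L = L_cate + λ L_mask, where L_cate is the focal loss on the category branch and L_mask is the Dice loss on the predicted masks. Below is a minimal sketch of the Dice term only; the `eps` argument is an assumption added here for numerical safety and is not part of the paper's formula.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Dice loss 1 - D(p, q), with D = 2 * sum(p * q) / (sum(p^2) + sum(q^2)).

    pred:   predicted soft mask with values in [0, 1].
    target: ground-truth binary mask of the same shape.
    eps:    small constant (an assumption) that avoids division by zero.
    """
    inter = (pred * target).sum()
    denom = (pred ** 2).sum() + (target ** 2).sum()
    return 1.0 - 2.0 * inter / (denom + eps)

target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

disjoint = np.zeros((8, 8))
disjoint[0:2, 0:2] = 1.0

loss_perfect = dice_loss(target, target)     # close to 0
loss_disjoint = dice_loss(disjoint, target)  # exactly 1, no overlap
```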

Inference

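Feed the input through the backbone and FPN to obtain the category score p_(i,j) for each grid cell (i, j).
Filter out low-confidence predictions with a confidence threshold of 0.1.
Convolve the mask feature with the mask kernels of the remaining cells.
Apply a sigmoid, then threshold at 0.5 to produce binary masks.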
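
The steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the network outputs are replaced by toy arrays, kernels are 1 x 1 (so D = E), and the function name `solov2_inference_sketch` is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def solov2_inference_sketch(cate_scores, kernel_pred, feature,
                            score_thr=0.1, mask_thr=0.5):
    """Turn raw head outputs into scored binary masks (before NMS).

    cate_scores: (S*S, num_classes) category scores p_(i,j) per grid cell.
    kernel_pred: (S*S, E) one 1x1 kernel per grid cell (assumes D = E).
    feature:     (E, H, W) mask feature F.
    """
    # Keep only (cell, class) pairs whose score exceeds the confidence threshold.
    cell_idx, class_idx = np.nonzero(cate_scores > score_thr)
    scores = cate_scores[cell_idx, class_idx]
    # Dynamic convolution: each surviving cell applies its own kernel to F.
    kernels = kernel_pred[cell_idx]                      # (N, E)
    logits = np.einsum("ne,ehw->nhw", kernels, feature)  # (N, H, W)
    # Sigmoid, then binarise the soft masks.
    binary_masks = sigmoid(logits) > mask_thr
    return cell_idx, class_idx, scores, binary_masks

rng = np.random.default_rng(0)
S, num_classes, E, H, W = 2, 3, 4, 5, 5
cate_scores = np.zeros((S * S, num_classes))
cate_scores[0, 1] = 0.8  # cell 0 confidently predicts class 1
cate_scores[3, 2] = 0.3  # cell 3 weakly predicts class 2
kernel_pred = rng.standard_normal((S * S, E))
feature = rng.standard_normal((E, H, W))

cells, classes, scores, masks = solov2_inference_sketch(cate_scores, kernel_pred, feature)
```

The scored masks that come out of this step are what Matrix NMS then operates on.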

Matrix NMS

Soft NMS

decay factor = f(iou)
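When two objects have a high IoU, the lower score is multiplied by a monotonically decreasing function f(iou), so it is reduced gradually instead of being zeroed out.
Detections whose score falls below a minimum score threshold are then removed.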
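
A minimal sketch of this sequential procedure, working on a precomputed IoU matrix so that it applies to boxes or masks alike. The Gaussian decay f(iou) = exp(-iou^2 / sigma) is one common choice; the default values of `sigma` and `min_score` are assumptions chosen for the example.

```python
import numpy as np

def soft_nms(iou, scores, sigma=0.5, min_score=0.05):
    """Sequential Soft-NMS on a precomputed IoU matrix.

    iou:    (N, N) symmetric pairwise IoU matrix.
    scores: (N,) detection scores.
    Returns the kept indices (in selection order) and their decayed scores.
    """
    scores = scores.astype(float).copy()
    remaining = list(range(len(scores)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda k: scores[k])
        if scores[best] < min_score:
            break  # everything left is below the minimum score threshold
        keep.append(best)
        remaining.remove(best)
        # Decay every remaining score according to its overlap with the pick.
        for k in remaining:
            scores[k] *= np.exp(-iou[best, k] ** 2 / sigma)
    return keep, scores[keep]

iou = np.array([[1.0, 0.9, 0.0],
                [0.9, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
scores = np.array([0.9, 0.8, 0.7])

keep, kept_scores = soft_nms(iou, scores)
keep_strict, _ = soft_nms(iou, scores, min_score=0.2)
```

Item 1 overlaps heavily with item 0, so its score is decayed sharply; whether it survives depends on the minimum score threshold. The loop is inherently sequential, which is the cost Matrix NMS is designed to avoid.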

Matrix NMS

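When m_i has a higher score than m_j, the decay factor of m_j is determined by the following two quantities.
The penalty that m_i imposes on m_j → f(iou)
The probability that m_i itself is suppressed → estimated from IoU
This probability is estimated from the IoU between m_i and the mask that overlaps it the most.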

Final Decay factor
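
In the paper the two quantities are combined as decay_j = min over all i with s_i > s_j of f(iou_(i,j)) / f(iou_(·,i)), where iou_(·,i) is the largest IoU between m_i and any higher-scored mask. The NumPy sketch below follows that formula with the Gaussian decay f(iou) = exp(-iou^2 / sigma). It assumes the masks are already sorted by descending score and that no mask is empty; the value of `sigma` is an assumption for the example.

```python
import numpy as np

def matrix_nms(scores, masks, sigma=0.5):
    """Matrix NMS with Gaussian decay, computed in one shot.

    scores: (N,) mask scores sorted in descending order.
    masks:  (N, H, W) binary masks (assumed non-empty).
    Returns the decayed scores; thresholding is left to the caller.
    """
    n = len(scores)
    flat = masks.reshape(n, -1).astype(float)
    inter = flat @ flat.T
    areas = flat.sum(axis=1)
    union = areas[:, None] + areas[None, :] - inter
    # Keep only pairs (i, j) with i < j, i.e. m_i scores at least as high as m_j.
    iou = np.triu(inter / union, k=1)
    # For each mask i: its largest overlap with any higher-scored mask.
    iou_cmax = iou.max(axis=0)
    comp = iou_cmax[:, None]  # broadcast so that row i carries the value for m_i
    # f(iou_ij) / f(iou_cmax_i) for the Gaussian f.
    decay = np.exp(-(iou ** 2 - comp ** 2) / sigma)
    # The final decay of m_j is the strongest penalty over all candidates i.
    return scores * decay.min(axis=0)

masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, 0:2, :] = True  # top half
masks[1, 0:2, :] = True  # duplicate of mask 0
masks[2, 2:4, :] = True  # bottom half, disjoint from the others
scores = np.array([0.9, 0.8, 0.7])

new_scores = matrix_nms(scores, masks)
```

All decay factors come out of a few matrix operations with no loop over detections, which is where the speed advantage over sequential Soft NMS comes from.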

Experiments

Instance segmentation performance comparison

Object detection performance comparison

Experiments with different kernel shapes: 1 x 1 x 256 is the most stable.

Change in AP when position information is provided (✓)
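
Position information here means explicit coordinate channels in the style of CoordConv. A minimal sketch, assuming two extra channels holding x and y coordinates normalised to [-1, 1] are concatenated to the feature map; the function name is hypothetical.

```python
import numpy as np

def add_coord_channels(feature):
    """Append normalised x and y coordinate channels to a (C, H, W) feature map."""
    _, h, w = feature.shape
    ys = np.linspace(-1.0, 1.0, h)
    xs = np.linspace(-1.0, 1.0, w)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")  # each of shape (H, W)
    return np.concatenate([feature, xx[None], yy[None]], axis=0)

feature = np.zeros((8, 6, 10))
with_coords = add_coord_channels(feature)  # shape (10, 6, 10)
```

The ablation toggles whether channels like these are given to the kernel branch and to the feature branch.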

Mask feature representation method: the unified representation performs better.
It captures more accurate boundaries for medium-size and large-size objects.

Performance is better with the dynamic head.

Using Matrix NMS improves both speed and accuracy.

Two patterns in the mask feature maps: position-sensitive (orange) / position-agnostic (white)

Conclusion

Adaptive, dynamic convolution kernels → Powerful head design, reduced FLOPs
Unified Mask → predict more accurate boundaries
Matrix NMS : much faster without sacrificing AP

박나깨

I am a student interested in Deep Learning, Computer Vision, AI, and Image Processing.
