[Paper Review📃] Facial Expression Recognition in the Wild via Deep Attentive Center Loss

-DACL-

Paper😙

While studying, I searched for FER work that uses center loss. This model is a SOTA model, ranking 6th on RAF-DB and 5th on AffectNet as of today (09 March 2022).

Intro

  • In FER, the softmax loss has been widely used. However, softmax loss is incapable of yielding discriminative features in wild scenarios.

  • Deep Metric Learning (DML) approaches constrain the embedding space to obtain well-discriminated deep features.
  • In a typical DML problem, every dimension of the deep feature contributes equally to the DML objective. DML methods are therefore prone to discriminating redundant and noisy information along with the important information encoded in the deep feature vector, which leads to over-fitting and hinders the generalization ability of the learning algorithm.

➜ The paper designs a modular attention-based DML approach, called Deep Attentive Center Loss (DACL), to selectively discriminate only the relevant information in the embedding space.

[Figure: DACL overview]

DACL extracts attention weights and applies them in the loss computation.


DACL method

[Figure: overall DACL pipeline]

The image above shows the whole pipeline of the proposed model. When an input image passes through the CNN (ResNet-18), the last layer's feature is routed in two different ways.

DACL takes the flattened feature as the input to an attention network. The attention network outputs attention weights, which are multiplied element-wise into the sparse center loss computation.

The same last-layer feature also goes through a pooling layer and is then used in both the sparse center loss and the softmax loss.

The final loss is the sum of the softmax loss and the sparse center loss.
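
Below is a minimal PyTorch sketch of this two-branch routing, assuming a torchvision ResNet-18 trunk. The class name `DACLBackbone`, the 7-class head, and the adaptive average pooling are my assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class DACLBackbone(nn.Module):
    """ResNet-18 trunk whose last conv feature is routed two ways:
    flattened -> attention network, pooled -> softmax / sparse center loss."""
    def __init__(self, num_classes=7):  # 7 basic expressions (assumption)
        super().__init__()
        resnet = models.resnet18(weights=None)
        # Keep everything up to (but not including) the final pool and fc.
        self.features = nn.Sequential(*list(resnet.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(512, num_classes)  # softmax-loss head

    def forward(self, x):
        fmap = self.features(x)              # (B, 512, 7, 7) for 224x224 input
        flat = torch.flatten(fmap, 1)        # flattened feature -> attention net
        pooled = self.pool(fmap).flatten(1)  # (B, 512) deep feature -> losses
        return flat, pooled, self.classifier(pooled)
```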


DACL method

- Context Encoder (CE) Unit

[Figure: Context Encoder (CE) unit]

The three fully connected layers can be notated mathematically as…

Since the CE unit is composed of FC layers, it can extract the significant features well. Its final output $e_i$ is a latent representation vector.
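
As a concrete sketch, the CE unit could be three stacked FC layers mapping the flattened feature $x_i$ to the latent vector $e_i$; the hidden and latent sizes and the ReLU non-linearities here are illustrative assumptions, not the paper's exact configuration.

```python
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Three FC layers mapping the flattened feature x_i to a latent vector e_i."""
    def __init__(self, in_dim, hidden_dim=1024, latent_dim=64):  # sizes assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, latent_dim),  # latent representation e_i
        )

    def forward(self, x):
        return self.net(x)
```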


DACL method

- Multi-head binary classification

[Figure: multi-head binary classification]


The attention value $a_{ij}$ eventually saturates to values between 0 and 1.
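
A sketch of this stage: one binary head per deep-feature dimension reads the shared latent vector $e_i$, and a sigmoid keeps each $a_{ij}$ between 0 and 1. Collapsing all heads into a single `nn.Linear` is my simplification.

```python
import torch
import torch.nn as nn

class MultiHeadBinaryAttention(nn.Module):
    """One binary head per deep-feature dimension, sharing the latent input e_i."""
    def __init__(self, latent_dim, feat_dim):
        super().__init__()
        self.heads = nn.Linear(latent_dim, feat_dim)  # one logit per dimension

    def forward(self, e):
        # Sigmoid squashes each logit so a_ij saturates between 0 and 1.
        return torch.sigmoid(self.heads(e))          # (B, feat_dim)
```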


DACL method

- sparse center loss

[Figure: sparse center loss formulation]

As shown, the squared difference between the feature and its class center is multiplied element-wise by the attention weights, so each dimension $j$ of sample $i$ contributes $a_{ij}(x_{ij} - c_{y_i,j})^2$ to the loss.
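
A minimal module implementing that attention-weighted center loss. Keeping the centers in a non-trainable buffer reflects the moving-average update described in the training section below; the random initialization is an assumption.

```python
import torch
import torch.nn as nn

class SparseCenterLoss(nn.Module):
    """Center loss whose per-dimension squared terms are weighted by a_ij."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        # Centers are updated by a moving average, not by backprop.
        self.register_buffer("centers", torch.randn(num_classes, feat_dim))

    def forward(self, feats, attn, labels):
        diff = feats - self.centers[labels]          # x_i - c_{y_i}, shape (B, d)
        # Element-wise attention weighting of the squared differences.
        return 0.5 * (attn * diff.pow(2)).sum(dim=1).mean()
```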


Training DACL

  1. Sparse center loss is jointly optimized with softmax loss: $L = L_S + \lambda L_{SC}$

  2. Sparse center loss contributes to the gradients with respect to the deep features and their corresponding attention weights.

  3. Centers are updated using a moving average strategy (see the training-step sketch below).
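
Putting the three points together, here is a minimal training step wiring up the hypothetical modules sketched above; the values of $\lambda$ and the center update rate $\alpha$ are illustrative, not the paper's.

```python
import torch

ce_loss = torch.nn.CrossEntropyLoss()  # the softmax loss L_S
lam, alpha = 0.01, 0.5                 # illustrative lambda and center update rate

def train_step(model, ce_unit, attn_heads, sc_loss, optimizer, images, labels):
    # optimizer is assumed to cover model, ce_unit and attn_heads parameters.
    flat, pooled, logits = model(images)
    attn = attn_heads(ce_unit(flat))                 # attention weights a_i
    loss = ce_loss(logits, labels) + lam * sc_loss(pooled, attn, labels)

    optimizer.zero_grad()
    loss.backward()  # gradients flow to the deep features and attention weights
    optimizer.step()

    # Moving-average update of the class centers, as in the original center loss.
    with torch.no_grad():
        for c in labels.unique():
            delta = sc_loss.centers[c] - pooled[labels == c].mean(dim=0)
            sc_loss.centers[c] -= alpha * delta
    return loss.item()
```

For 224×224 inputs the flattened ResNet-18 feature has 512 × 7 × 7 = 25088 dimensions, so the pieces would be wired as, e.g., `ContextEncoder(25088)`, `MultiHeadBinaryAttention(64, 512)`, and `SparseCenterLoss(7, 512)`.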


Experiment

RAF-DB

[Figure: RAF-DB results]


AffectNet

[Figure: AffectNet results]

Attention weights visualization
