Unsupervised Domain Adaptation (UDA) for semantic segmentation is the task
of adapting a network trained on labeled source data so that it performs well
on unlabeled target data. Complex deep neural networks for this task must be
trained with a huge amount of labeled data, which is difficult and expensive to
acquire. A recently proposed workaround is to use synthetic data;
however, the differences between real-world and synthetic scenes limit
the performance. UDA techniques reduce this gap, making it possible to
obtain reliable performance on the target domain.
- In  a novel unsupervised domain adaptation strategy is
proposed to adapt a synthetic supervised training to real world
data. The proposed learning strategy exploits three components: a
standard supervised learning on synthetic data, an adversarial
learning strategy able to exploit both labeled synthetic data and
unlabeled real data and finally a self-teaching strategy working on
unlabeled data only. The last component is guided by the
segmentation confidence, estimated by the fully convolutional
discriminator of the adversarial learning module, helping to
further reduce the domain shift between synthetic and real data.
Furthermore, we weight this loss on the basis of the class
frequencies to enhance the performance on less common classes.
- The approach of  builds on . However, the self-teaching component has been greatly improved in this work. First, the output of the discriminator is used as a weight applied to the loss function of the self-teaching component at each location (in place of the hard threshold used in previous work ). Second, a novel region growing scheme is introduced to extend and better represent the shape of reliable regions (previous approaches tend to almost always discard edge regions and small objects). Finally, since the classes have different frequencies, the loss coming from unlabeled data is also weighted on the basis of the frequency of each class in the dataset, yielding a better balance of results across classes and avoiding the dramatic drop in performance on less common classes (which typically correspond to small objects and structures that are critical for an autonomous vehicle).
- Starting from the architecture of  and , in  we
further develop a novel UDA framework where
a standard supervised loss on labeled synthetic data is supported
by an adversarial module and a self-training strategy aiming at
aligning the two domain distributions. An improved adversarial module is
driven by a pair of fully convolutional discriminators
dealing with different domains: the first discriminates between
ground truth and generated maps, while the second between
segmentation maps coming from synthetic or real world data.
The self-training module exploits the confidence estimated by
the discriminators on unlabeled data to select the regions used
to reinforce the learning process. Furthermore, the confidence is
thresholded with an adaptive mechanism based on the per-class
- In  we propose a novel UDA strategy to address the domain shift between real-world and synthetic representations. An adversarial model, based on the cycle consistency framework, performs the mapping between the synthetic and real domains. The data is then fed to a MobileNet-v2 architecture that performs the semantic segmentation task. An additional pair of discriminators, working at the feature level of the MobileNet-v2, better aligns the features of the two domain distributions and further improves the performance. Finally, the consistency of the semantic maps is exploited. After an initial supervised training on synthetic data, the whole UDA architecture is trained end-to-end, considering all its components at once. Experimental results show that the proposed strategy is effective in adapting a segmentation network trained on synthetic data to real-world scenarios. The lightweight MobileNet-v2 architecture allows deployment on devices with limited computational resources, such as those employed in autonomous vehicles.
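The soft weighting described above for the self-teaching component can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes per-pixel softmax outputs, a per-pixel discriminator confidence in [0, 1], and a precomputed list of per-class weights (how these weights are derived from the class frequencies is left open here). The function name and its arguments are hypothetical.

```python
import math


def weighted_self_teaching_loss(probs, confidence, class_weights):
    """Confidence-weighted cross-entropy on pseudo-labels (illustrative sketch).

    probs:         per-pixel class probabilities, e.g. [[0.9, 0.1], ...]
    confidence:    per-pixel discriminator output in [0, 1] (assumed input)
    class_weights: per-class weight derived from class frequencies (assumed input)
    """
    if not probs:
        return 0.0
    total = 0.0
    for p, conf in zip(probs, confidence):
        # The pseudo-label is the class the network currently predicts.
        pseudo_label = max(range(len(p)), key=lambda c: p[c])
        # Cross-entropy of the prediction against its own pseudo-label.
        cross_entropy = -math.log(max(p[pseudo_label], 1e-12))
        # Soft weighting: the discriminator confidence scales the loss at
        # each location instead of switching it on or off with a threshold.
        total += conf * class_weights[pseudo_label] * cross_entropy
    return total / len(probs)


# Two unlabeled pixels, two classes; the second class is given a larger weight.
probs = [[0.9, 0.1], [0.2, 0.8]]
confidence = [1.0, 0.5]
class_weights = [1.0, 2.0]
loss = weighted_self_teaching_loss(probs, confidence, class_weights)
print(loss)
```

In this sketch, a hard threshold would correspond to replacing each confidence value with 1 when it exceeds the threshold and 0 otherwise, so that low-confidence pixels contribute nothing at all rather than a reduced amount.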
The aim of  is to give an overview of the recent advancements in the Unsupervised Domain Adaptation (UDA) of deep networks for semantic segmentation.
Motivated by the recent growth in interest towards this field, we build a comprehensive overview of the proposed methodologies and provide a clear categorization.
We start by introducing the problem, its formulation and the various scenarios that can be considered.
Then, we introduce the different levels at which adaptation strategies may be applied: namely, at the input (image) level, at the level of the internal feature representations and at the output level.
Furthermore, we present a detailed overview of the literature in the field, dividing previous methods based on the following (non mutually exclusive) categories:
adversarial learning, generative-based, analysis of the classifier discrepancies, self-teaching, entropy minimization, curriculum learning and multi-task learning.
Novel research directions are also briefly introduced to give a hint of interesting open problems in the field.
Finally, a comparison of the performance of the various methods in the widely used autonomous driving scenario is presented.
In  we propose a novel Unsupervised Domain Adaptation (UDA) strategy,
based on a feature clustering method that captures the different semantic modes of the feature distribution
and groups features of the same class into tight and well-separated clusters.
Furthermore, we introduce two novel learning objectives to enhance the discriminative clustering performance:
an orthogonality loss pushes individual representations that are far apart to be orthogonal, while a sparsity loss reduces the number of active feature channels for each class.
The joint effect of these modules is to regularize the structure of the feature space.
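To give a rough idea of what such feature-space regularizers look like, the sketch below computes a clustering term and an orthogonality term on a handful of feature vectors. It is a simplified illustration and not the formulation used in the paper: it assumes class prototypes obtained as per-class feature means, a squared Euclidean distance to the prototype as the clustering term, and a squared cosine similarity between prototypes of different classes as the orthogonality term. The sparsity term is not covered, and all function names are hypothetical.

```python
import math


def class_prototypes(features, labels):
    """Mean feature vector of each class (assumed prototype definition)."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def clustering_loss(features, labels, protos):
    """Mean squared distance of each feature to its class prototype."""
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(f, protos[y]))
    return total / len(features)


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def orthogonality_loss(protos):
    """Mean squared cosine similarity between prototypes of different classes."""
    classes = sorted(protos)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    if not pairs:
        return 0.0
    return sum(cosine(protos[a], protos[b]) ** 2 for a, b in pairs) / len(pairs)


# Four 2-D feature vectors belonging to two classes.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = [0, 0, 1, 1]
protos = class_prototypes(features, labels)
print(clustering_loss(features, labels, protos), orthogonality_loss(protos))
```

Minimizing the first term tightens each cluster around its prototype, while minimizing the second drives prototypes of different classes towards orthogonal directions; in a real training setup both would be added to the segmentation loss with suitable weights.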
M. Biasetton, U. Michieli, G. Agresti and P. Zanuttigh, "Unsupervised Domain Adaptation for Semantic Segmentation of Urban Scenes", Proceedings of the International Conference on Computer Vision and Pattern Recognition (CVPR), Workshop on Autonomous Driving (WAD), 2019.
U. Michieli, M. Biasetton, G. Agresti and P. Zanuttigh, "Adversarial Learning and Self-Teaching Techniques for Domain Adaptation in Semantic Segmentation", IEEE Transactions on Intelligent Vehicles (T-IV), 2020.
T. Spadotto, M. Toldo, U. Michieli and P. Zanuttigh, "Unsupervised Domain Adaptation with Multiple Domain Discriminators and Adaptive Self-Training", International Conference on Pattern Recognition (ICPR), 2020.
M. Toldo, U. Michieli, G. Agresti and P. Zanuttigh, "Unsupervised Domain Adaptation for Mobile Semantic Segmentation based on Cycle Consistency and Feature Alignment", Elsevier Image and Vision Computing (IMAVIS), 2020.
M. Toldo, A. Maracani, U. Michieli and P. Zanuttigh, "Unsupervised Domain Adaptation in Semantic Segmentation: A Review", Technologies, 2020.
M. Toldo, U. Michieli and P. Zanuttigh, "Unsupervised Domain Adaptation in Semantic Segmentation via Orthogonal and Clustered Embeddings", Winter Conference on Applications of Computer Vision (WACV), 2021.