Multispectral Transfer Network:
Unsupervised Depth Estimation for All-day Vision
AAAI 2018, New Orleans
Namil Kim*, Yukyung Choi*, Soonmin Hwang, In So Kweon
KAIST RCV Lab / All-day Vision Team
*Equal contributions
Problem definition
Why are we interested in depth?
“Crucial information” to understand the world around us
*From NVIDIA
3D understanding is necessary for autonomous decision making
Problem definition
How do we usually get “dense depth” at any time of the day?
RGB stereo (day / night): sensitive
3D LiDAR: sparse (0.16°; 4 points at ≤ 11.45 m, 2 points at ≥ 23.89 m)
Problem solution
How do we usually get “dense depth” at any time of the day?
RGB stereo / 3D LiDAR (day / night)
Thermal (LWIR): depth estimation from a single thermal image
Related works
Single-image-based depth estimation
- Supervised depth estimation [NIPS’14, CVPR’15, ICCV’15, NIPS’16, PAMI’16]
- Unsupervised depth estimation [ECCV’16, 3DV’16, CVPR’17]
- Semi-supervised depth estimation [CVPR’17]
Idea to all-day depth estimation
Day / Night: illumination change
RGB: O (day), X (night)
Unsupervised learning
Idea to all-day depth estimation
Day / Night: illumination change
RGB: O (day), X (night)
Thermal: robust to illumination change
Unsupervised learning
Idea to all-day depth estimation
Day / Night: illumination change
RGB: O (day), X (night)
#1 RGB-Thermal alignment
#2 Thermal-to-depth
Unsupervised learning
Idea to all-day depth estimation
Day / Night: illumination change
RGB: O (day), X (night)
RGB-Thermal alignment
Thermal-to-depth
Adaptation
Thermal: robust to illumination change
Unsupervised learning
Requirements #1
Multispectral (RGB-Thermal) dataset
- RGB stereo pair
- Alignment between thermal and RGB (left)
- 3D measurement
Yukyung Choi et al., KAIST Multispectral Recognition Dataset in Day and Night, TITS’18
Requirements #2
Multispectral (RGB-Thermal) Transfer Network
- Aim: thermal-to-depth prediction
- Data: thermal and aligned left RGB (+ right RGB, stereo pair)
- Model: unsupervised method
RGB / Thermal: alignment, O
Thermal-to-depth: unsupervised learning (U.S.L)
Proposed framework
What is Multispectral Transfer Network?
Supervised method / Unsupervised method / MTN method
Contributions
Key Ideas of Proposed MTN (Overview)
1) Efficient Multi-task Learning
Multi-task learning for depth estimation
Previous work: Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture, ICCV 2015.
- surface normal
- semantic labeling
- object pose annotation
* Most works target indoor scenes
(it is difficult to collect data for the auxiliary tasks outdoors)
Without annotated data: we propose an efficient multi-task methodology
Depth and Chromaticity
- No human-intensive data
- Relevance to the depth
- Contextual information
Key Ideas of Proposed MTN (1/4)
1) Efficient Multi-task Learning
Previous works:
Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture, ICCV 2015.
- surface normal
- semantic labeling
- object pose annotation
* Most works target indoor scenes
(it is difficult to collect data for the auxiliary tasks outdoors)
Our work: Chromaticity
- No human-intensive data
- Relevance to the depth
- Contextual information
Without annotated data: we propose an efficient multi-task methodology
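Chromaticity can be derived from the aligned RGB image alone, which is why it needs no human annotation. The sketch below is only an illustration: it assumes the normalized-rgb definition of chromaticity (the slides do not state which definition MTN uses), and the function names are hypothetical.

```python
def rgb_to_chromaticity(pixel, eps=1e-8):
    """Map one (R, G, B) pixel to normalized (r, g) chromaticity.

    Assumption: r = R / (R + G + B), g = G / (R + G + B).
    The third component is redundant because r + g + b = 1.
    """
    r, g, b = (float(v) for v in pixel)
    total = r + g + b
    if total < eps:
        # Black pixel: fall back to neutral chromaticity.
        return (1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total)


def chromaticity_map(image):
    """Apply rgb_to_chromaticity to every pixel of a row-major image."""
    return [[rgb_to_chromaticity(p) for p in row] for row in image]


if __name__ == "__main__":
    image = [[(10, 20, 70), (20, 40, 140)],
             [(0, 0, 0), (255, 255, 255)]]
    for row in chromaticity_map(image):
        print(row)
```

Under this definition the target is invariant to a uniform brightness scaling of the pixel, so the auxiliary task carries color context rather than intensity.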
Key Ideas of Proposed MTN (2/4)
2) Novel Module for Multi-task learning
Interleaver module: directly interleaves the chromaticity into the depth estimation
“Skip-connection meets Interleaver for the feature learning”
Multispectral Transfer Network (MTN): encoder-decoder with thermal input, disparity output and chromaticity output
Figure legend: Conv., DeConv., Interleaver, skip connection, forward flow
Key Ideas of Proposed MTN (2/4)
2) Novel Module for Multi-task learning
1. Global pooling / un-pooling + L2 normalization
- Enlarges the receptive field [ParseNet] and transforms the features
2. Gating mechanism
- Controls how strongly the other task affects the main task (especially in back-propagation)
3. Up-sampling and adding to the previous output
The Interleaver is placed in every skip-connected flow (full connections between layers)
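The three steps above can be sketched in NumPy as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the 1x1 convolutions, the concatenation of local and global features, the single-channel sigmoid gate, nearest-neighbour up-sampling, and all names (`interleaver`, `w_gate`, `w_feat`) are hypothetical choices made for the example.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """L2-normalize a (C, H, W) tensor across the channel axis."""
    return x / (np.sqrt((x ** 2).sum(axis=0, keepdims=True)) + eps)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)


def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) tensor."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def interleaver(skip, prev_out, w_gate, w_feat):
    """Assumed Interleaver unit on one skip connection.

    skip:     (C, H, W) encoder feature carried by the skip connection
    prev_out: (C_out, H/2, W/2) output of the previous (coarser) unit
    w_gate:   (1, 2C) weights of the gating 1x1 convolution
    w_feat:   (C_out, 2C) weights of the feature 1x1 convolution
    """
    # 1. Global pooling -> L2 norm -> un-pooling (broadcast back to H x W),
    #    in the spirit of ParseNet, to enlarge the receptive field.
    context = l2_normalize(skip.mean(axis=(1, 2), keepdims=True))
    context = np.broadcast_to(context, skip.shape)
    fused = np.concatenate([l2_normalize(skip), context], axis=0)

    # 2. Gating: a value in (0, 1) per pixel scales the feature that is
    #    passed on, and therefore also the gradient that flows back.
    gate = sigmoid(conv1x1(fused, w_gate))
    gated = gate * conv1x1(fused, w_feat)

    # 3. Up-sample the previous output and add it.
    return gated + upsample2x(prev_out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    skip = rng.normal(size=(4, 8, 8))
    prev_out = rng.normal(size=(3, 4, 4))
    out = interleaver(skip, prev_out,
                      rng.normal(size=(1, 8)), rng.normal(size=(3, 8)))
    print(out.shape)
```

In this sketch the gate multiplies the feature path, so a gate close to zero suppresses both the forward contribution and the gradient reaching the shared features, which is one way to read the "especially in back-propagation" remark.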
Key Ideas of Proposed MTN (2/4)
2) Novel Module for Multi-task learning
- No need to find an optimal split point or parameters. (c.f. (b), (c), (d))
- Reduces adverse effects from the built-in sharing mechanism. (c.f. (a), (b))
- Optimized with the same strategy as general multi-task learning, in an end-to-end manner. (c.f. (d))
- At inference, the Interleaver unit can be removed. (c.f. (d))
Previous multi-task learning: (a) fully shared architecture, (b) partially split architectures, (c) non-shared architecture, (d) connected architecture
Our multi-task learning
Key Ideas of Proposed MTN (3/4)
3) Photometric Correction
“Thermal Crossover”
Thermal-infrared images are not directly affected by changing lighting conditions.
However, they suffer indirectly from cyclic illumination.
Key Ideas of Proposed MTN (4/4)
4) Adaptive scaled sigmoid function
We propose the adaptive scaled sigmoid as the bilinear activation function to train the model stably.
According to the derivative, a large maximum disparity is not stable in the initial stages of training.
Starting from a smaller initial maximum disparity 𝛽0, we iteratively increase it by 𝛼 at each epoch
so that it covers the large disparity levels by the end of training.
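A minimal sketch of such an activation, assuming a linear schedule in which the maximum disparity at epoch t is 𝛽0 + 𝛼·t (the slide does not give the exact update rule). The default values of `beta0` and `alpha` and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def max_disparity(epoch, beta0, alpha):
    """Assumed schedule: start at beta0 and add alpha every epoch."""
    return beta0 + alpha * epoch


def adaptive_scaled_sigmoid(x, epoch, beta0=0.1, alpha=0.02):
    """Disparity activation bounded by the current maximum disparity."""
    return max_disparity(epoch, beta0, alpha) * sigmoid(x)


def adaptive_scaled_sigmoid_grad(x, epoch, beta0=0.1, alpha=0.02):
    """Derivative w.r.t. x: beta_t * s * (1 - s), proportional to beta_t."""
    s = sigmoid(x)
    return max_disparity(epoch, beta0, alpha) * s * (1.0 - s)


if __name__ == "__main__":
    for epoch in (0, 10, 20):
        print(epoch,
              adaptive_scaled_sigmoid(0.0, epoch),
              adaptive_scaled_sigmoid_grad(0.0, epoch))
```

Because the derivative is proportional to the current maximum disparity, starting with a small bound keeps early gradients small, and the bound grows only as training progresses.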
Results
Experimental results: Day
Qualitative comparison (figure): Color, Thermal, GT, Single Task, LsMTN, DsMTN, MTN-P, MTN, DIW [NIPS’16]
Binary error map (error > 3 pixels)
[Eigen, NIPS2014]
[DIW, NIPS2016]
Daytime, 1~50 m
Metric         STN        LsMTN      DsMTN      MTN-P      MTN        STN-RGB    Eigen-RGB  Eigen-T    DIW-RGB    DIW-T
Distance (lower is better)
RMS            7.7735     6.6967     6.3671     7.0058     6.0786     7.5876     10.1792    10.2660    6.4993     6.4427
Log RMS        0.2000     0.1801     0.1761     0.1951     0.1714     0.2094     0.2386     0.2384     0.1934     0.1967
Abs. Relative  0.1531     0.1325     0.1259     0.1413     0.1207     0.1570     0.1992     0.1976     0.1644     0.1697
Sq. Relative   2.2767     1.6322     1.4394     1.7251     1.3119     2.0618     4.0629     4.0835     1.8030     1.7543
Accuracy (higher is better)
δ<1.25         0.8060     0.8358     0.8407     0.8040     0.8451     0.7772     0.7551     0.7561     0.7956     0.7825
δ<1.25²        0.9337     0.9492     0.9544     0.9440     0.9557     0.9378     0.8965     0.8947     0.9482     0.9454
δ<1.25³        0.9776     0.9842     0.9855     0.9827     0.9868     0.9806     0.9612     0.9618     0.9842     0.9851
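For reference, the distance and accuracy rows can be computed as sketched below. This assumes the table follows the commonly used definitions from Eigen et al. (the slides do not spell them out); `depth_metrics` is a hypothetical helper name.

```python
import math


def depth_metrics(pred, gt):
    """Compute standard depth-evaluation metrics.

    pred, gt: equal-length sequences of depths; predictions must be
    positive, and ground-truth entries <= 0 are treated as invalid.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid ground-truth depths")

    rms = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    log_rms = math.sqrt(
        sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n

    # Threshold accuracy: fraction of pixels with max(p/g, g/p) < 1.25^k.
    ratios = [max(p / g, g / p) for p, g in pairs]
    acc = {k: sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3)}

    return {"rms": rms, "log_rms": log_rms, "abs_rel": abs_rel,
            "sq_rel": sq_rel, "delta1": acc[1], "delta2": acc[2],
            "delta3": acc[3]}


if __name__ == "__main__":
    print(depth_metrics([2.0, 4.0], [2.0, 5.0]))
```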
Experimental results: Night
Qualitative comparison (figure): Color, Thermal, GT, Single Task, MTN-P, MTN, DIW [NIPS’16]
Nighttime, 1~50 m
Metric         STN        LsMTN      DsMTN      MTN-P      MTN        STN-RGB    Eigen-RGB  Eigen-T    DIW-RGB    DIW-T
Ordinal accuracy (higher is better)
ξ<10           0.3233     0.3405     0.3745     0.3096     0.4666     0.2508     0.1728     0.2033     0.1404     0.3744
ξ<20           0.6237     0.6855     0.6820     0.6225     0.7026     0.3284     0.2442     0.6178     0.3176     0.7459
ξ<30           0.7317     0.7753     0.7797     0.7397     0.7757     0.3592     0.3064     0.7516     0.3805     0.8401
[Eigen, NIPS2014]
[DIW, NIPS2016]
Experimental Videos
Colors are mapped for visualization
This 3D information is from single monocular thermal image
Only the red part is used for inference
Conclusion
- Employ multi-task learning for depth estimation
- Novel architecture for multi-task learning: Interleaver, in every skip-connected layer
  1. Pooling mechanism + L2 Norm. (enlarge receptive field)
  2. Gated unit via convolution
  3. Up-sampling
- Photometric correction helps when dealing with thermal images.
- The adaptive scaled sigmoid function helps training converge stably.
Dataset & code can be downloaded from http://multispectral.kaist.ac.kr
Thank you
Q & A
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
Ad

[AAAI2018] Multispectral Transfer Network: Unsupervised Depth Estimation for All-day Vision

  • 1. Multispectral Transfer Network: Unsupervised Depth Estimation for All-day Vision AAAI 2018, New Orleans Namil Kim*, Yukyung Choi*, Soonmin Hwang, In So Kweon KAIST RCV Lab / All-day Vision Team *Equal contributions
  • 2. Problem definition: Why are we interested in depth? Depth is “crucial information” for understanding the world around us (image from NVIDIA). 3D understanding is necessary for autonomous decision making.
  • 3. Problem definition: How do we usually get “dense depth” at any time of the day? RGB stereo (day vs. night): sensitive. 3D LiDAR: sparse. [Figure labels: LiDAR 0.16°; ≤ 11.45 m, ≥ 23.89 m; 4 points, 2 points]
  • 4. Problem solution: Depth estimation from a single thermal (LWIR) image, alongside RGB stereo and 3D LiDAR as ways to get “dense depth” at any time of the day (day and night).
  • 5. Related works: single-image-based depth estimation. Supervised depth estimation [NIPS’14, CVPR’15, ICCV’15, NIPS’16, PAMI’16]; semi-supervised depth estimation [CVPR’17]; unsupervised depth estimation [ECCV’16, 3DV’16, CVPR’17].
  • 6. Idea for all-day depth estimation: under illumination change, unsupervised learning from RGB works in the day (O) but not at night (X).
  • 7. Idea for all-day depth estimation: RGB is affected by the day-to-night illumination change (O in the day, X at night), whereas thermal is robust to illumination change.
  • 8. Idea for all-day depth estimation: two ingredients are needed, (#1) alignment between RGB and thermal and (#2) thermal-to-depth prediction trained with unsupervised learning.
  • 9. Idea for all-day depth estimation: with RGB-thermal alignment, thermal-to-depth is trained by unsupervised learning and then adapted from day to night, relying on thermal being robust to illumination change.
  • 10. Requirements #1: Multispectral (RGB-Thermal) dataset with an RGB stereo pair, alignment between thermal and left RGB, and 3D measurements. Yukyung Choi et al., KAIST Multispectral Recognition Dataset in Day and Night, TITS’18.
  • 11. Requirements #2: Multispectral (RGB-Thermal) Transfer Network. Aim: thermal-to-depth prediction. Data: thermal and aligned left RGB (+ right RGB, forming a stereo pair). Model: unsupervised method. [Figure labels: RGB, Thermal, Alignment, O, U.S.L (unsupervised learning), Thermal-to-depth]
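The unsupervised setup on slide 11 can be sketched as follows, assuming the usual stereo-reconstruction formulation found in unsupervised depth estimation: a disparity map is predicted from the thermal image, the right RGB image is warped toward the left view with it, and the difference from the aligned left RGB image serves as the training signal. The function names, the 1-D scanline simplification, the L1 loss and the sign convention are illustrative assumptions, not details taken from the slides.

```python
import math

def warp_scanline(right_row, disparity_row):
    """Reconstruct a left-view scanline by sampling the right view at x - d(x).

    Linear interpolation is used between neighboring pixels; sampling
    positions are clamped to the image border (an assumption).
    """
    width = len(right_row)
    left_hat = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), width - 1.0)
        x0 = int(math.floor(src))
        x1 = min(x0 + 1, width - 1)
        w = src - x0
        left_hat.append((1.0 - w) * right_row[x0] + w * right_row[x1])
    return left_hat

def photometric_l1(left_row, left_hat):
    """Mean absolute difference between the real and reconstructed left view."""
    return sum(abs(a - b) for a, b in zip(left_row, left_hat)) / len(left_row)

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Standard rectified-stereo relation: depth = focal * baseline / disparity."""
    return focal_px * baseline_m / disparity_px

# Hypothetical scanline: the left view equals the right view shifted by 1 pixel.
right = [0.0, 1.0, 2.0, 3.0, 4.0]
left = [0.0, 0.0, 1.0, 2.0, 3.0]
loss_good = photometric_l1(left, warp_scanline(right, [1.0] * 5))
loss_bad = photometric_l1(left, warp_scanline(right, [0.0] * 5))
```

In this toy case the correct disparity reconstructs the left scanline exactly, so its loss is lower than that of a zero-disparity guess. No depth ground truth is involved, which is what makes the training unsupervised.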
  • 12. Proposed framework: What is the Multispectral Transfer Network? [Figure panels: supervised method, unsupervised method, MTN method]
  • 14. Key Ideas of Proposed MTN (Overview). 1) Efficient multi-task learning. Multi-task learning for depth estimation has paired depth with surface normals, semantic labeling or object pose annotation (e.g., “Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture”, ICCV 2015). Most of these works are indoor, because data for the auxiliary task is difficult to collect outdoors. Without annotated data, we propose an efficient multi-task methodology: depth and chromaticity. Criteria for the auxiliary task: no human-intensive data, relevance to depth, contextual information.
  • 15. Key Ideas of Proposed MTN (1/4). 1) Efficient multi-task learning. Without annotated data, we propose an efficient multi-task methodology. Previous works: surface normal, semantic labeling, object pose annotation (“Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture”, ICCV 2015); most are indoor because data for the auxiliary task is difficult to collect outdoors. Our work: chromaticity. Both are compared on three criteria: no human-intensive data, relevance to depth, contextual information.
  • 16. Key Ideas of Proposed MTN (2/4). 2) Novel module for multi-task learning. Interleaver module: directly interleaves the chromaticity into the depth estimation (“Skip-connection meets Interleaver for feature learning”). The Multispectral Transfer Network (MTN) is an encoder-decoder with a thermal input, a disparity output and a chromaticity output. [Figure legend: Conv., DeConv., Interleaver, skip connection, forward flow]
  • 17. Key Ideas of Proposed MTN (2/4). 2) Novel module for multi-task learning. The Interleaver consists of: 1. Global pooling / un-pooling + L2 normalization, which enlarges the receptive field [ParseNet] and transforms the features. 2. A gating mechanism, which controls how strongly the other task affects the main task (especially in back-propagation). 3. Up-sampling and adding to the previous output. It is placed in every skip-connected flow (full connections between layers).
  • 18. Key Ideas of Proposed MTN (2/4). 2) Novel module for multi-task learning. There is no need to find an optimal split point or parameters (c.f. (b), (c), (d)). It reduces adverse effects of the built-in sharing mechanism (c.f. (a), (b)). It is optimized end to end with the same strategy as general multi-task learning (c.f. (d)). At inference, the Interleaver unit can be removed (c.f. (d)). Previous multi-task learning: (a) fully shared architecture, (b) partial split architectures, (c) no shared architecture, (d) connected architecture; compared with our multi-task learning.
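The three Interleaver steps listed on slide 17 can be illustrated with a minimal sketch. This is one possible reading only: the slides do not give layer shapes or the exact gate, so the scalar gate (the conclusion slide mentions a gated unit via convolution), the nearest-neighbor up-sampling and every function name below are assumptions made for illustration. Feature maps are plain nested lists indexed as channel, row, column.

```python
import math

def global_pool_l2(feat):
    """Step 1a: global average pooling per channel, then L2 normalization across channels."""
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    norm = math.sqrt(sum(p * p for p in pooled)) or 1.0  # avoid division by zero
    return [p / norm for p in pooled]

def unpool(pooled, height, width):
    """Step 1b: un-pooling, i.e. broadcasting each pooled value back to H x W."""
    return [[[p] * width for _ in range(height)] for p in pooled]

def gate(feat, gate_logit):
    """Step 2: scale the features by a sigmoid gate (a scalar here, by assumption)."""
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return [[[g * v for v in row] for row in ch] for ch in feat]

def upsample2x(feat):
    """Step 3a: nearest-neighbor 2x up-sampling (the interpolation type is assumed)."""
    out = []
    for ch in feat:
        rows = []
        for row in ch:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out

def add(a, b):
    """Step 3b: element-wise addition of two feature maps of equal shape."""
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def interleaver(feat, gate_logit, previous_output):
    height, width = len(feat[0]), len(feat[0][0])
    context = unpool(global_pool_l2(feat), height, width)
    return add(upsample2x(gate(context, gate_logit)), previous_output)

# Hypothetical 2-channel, 2x2 feature map and a zero-valued 4x4 previous output.
feat = [[[3.0, 3.0], [3.0, 3.0]], [[4.0, 4.0], [4.0, 4.0]]]
prev = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
out = interleaver(feat, 0.0, prev)
```

A gate value close to zero leaves the previous output almost unchanged, which gives an intuition for how the gate limits the influence of one task on the other.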
  • 19. Key Ideas of Proposed MTN (3/4). 3) Photometric correction (“thermal crossover”). A thermal-infrared image is not directly affected by changing lighting conditions. However, it suffers indirectly from cyclic illumination.
  • 20. Key Ideas of Proposed MTN (4/4). 4) Adaptive scaled sigmoid function. We propose the adaptive scaled sigmoid as the bilinear activation function to train the model stably. Starting from a small initial maximum disparity β₀, we iteratively increase the value α at each epoch so that large disparity levels are covered by the end of training. Judging by the derivative, a large scale is not stable in the initial stages of training.
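A minimal sketch of one way to read slide 20: the disparity output is a sigmoid scaled by a maximum disparity that starts small and grows every epoch. The linear schedule, the treatment of α as a per-epoch increment, the cap `beta_max` and all numeric values are assumptions for illustration; the slides do not state the exact update rule.

```python
import math

def max_disparity(epoch, beta0, alpha, beta_max):
    """Assumed schedule: start at beta0, grow by alpha per epoch, stop at beta_max."""
    return min(beta0 + alpha * epoch, beta_max)

def scaled_sigmoid(x, beta):
    """Maps a raw network output x to a disparity in the open interval (0, beta)."""
    return beta / (1.0 + math.exp(-x))

def scaled_sigmoid_grad(x, beta):
    """Derivative with respect to x: beta * s * (1 - s), where s = sigmoid(x)."""
    s = 1.0 / (1.0 + math.exp(-x))
    return beta * s * (1.0 - s)

# Hypothetical values: beta0 = 10 px, alpha = 5 px per epoch, cap at 100 px.
early = max_disparity(0, 10.0, 5.0, 100.0)
late = max_disparity(30, 10.0, 5.0, 100.0)
grad_early = scaled_sigmoid_grad(0.0, early)
grad_late = scaled_sigmoid_grad(0.0, late)
```

Because the derivative is proportional to the scale, a small initial scale keeps the gradient magnitude small while the weights are still close to their initialization, and the full disparity range only becomes reachable later in training.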
  • 22. Experimental results: Day. [Figure panels: GT, Color, Thermal, Single Task, LsMTN, DsMTN, MTN-P, MTN, DIW [NIPS’16]; label “Without”; binary error map (error > 3 pixels)] Baselines: [Eigen, NIPS2014], [DIW, NIPS2016]. Daytime, 1~50 m. Distance metrics (RMS, Log RMS, Abs. Rel., Sq. Rel.): lower is better. Accuracy metrics (δ): higher is better.

      Method     RMS      Log RMS  Abs. Rel.  Sq. Rel.  δ<1.25  δ<1.25²  δ<1.25³
      STN        7.7735   0.2000   0.1531     2.2767    0.8060  0.9337   0.9776
      LsMTN      6.6967   0.1801   0.1325     1.6322    0.8358  0.9492   0.9842
      DsMTN      6.3671   0.1761   0.1259     1.4394    0.8407  0.9544   0.9855
      MTN-P      7.0058   0.1951   0.1413     1.7251    0.8040  0.9440   0.9827
      MTN        6.0786   0.1714   0.1207     1.3119    0.8451  0.9557   0.9868
      STN-RGB    7.5876   0.2094   0.1570     2.0618    0.7772  0.9378   0.9806
      Eigen-RGB  10.1792  0.2386   0.1992     4.0629    0.7551  0.8965   0.9612
      Eigen-T    10.2660  0.2384   0.1976     4.0835    0.7561  0.8947   0.9618
      DIW-RGB    6.4993   0.1934   0.1644     1.8030    0.7956  0.9482   0.9842
      DIW-T      6.4427   0.1967   0.1697     1.7543    0.7825  0.9454   0.9851
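The distance and accuracy metrics reported for the daytime results are the ones commonly used in monocular depth evaluation. The sketch below assumes those common definitions and assumes that the 1~50 m range is applied as a mask on the ground truth; the function name and sample values are hypothetical, and the deck's own evaluation code may differ in details.

```python
import math

def depth_metrics(pred, gt, min_depth=1.0, max_depth=50.0):
    """Compute RMS, log RMS, absolute/squared relative error and delta accuracies.

    pred and gt are flat sequences of positive depths in meters. Pixels whose
    ground truth falls outside [min_depth, max_depth] are ignored.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if min_depth <= g <= max_depth]
    n = len(pairs)
    rms = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    log_rms = math.sqrt(sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    # delta accuracy: share of pixels with max(pred/gt, gt/pred) below 1.25^k
    ratios = [max(p / g, g / p) for p, g in pairs]
    acc = {k: sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3)}
    return {"rms": rms, "log_rms": log_rms, "abs_rel": abs_rel,
            "sq_rel": sq_rel, "d1": acc[1], "d2": acc[2], "d3": acc[3]}

# Hypothetical predictions; the third pixel is outside the 1~50 m range.
metrics = depth_metrics([10.0, 20.0, 1.0], [10.0, 25.0, 80.0])
```

Lower values are better for the four distance metrics and higher values are better for the three accuracies, which matches the notes on the slide.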
  • 23. Experimental results: Night. [Figure panels: GT, Color, Thermal, MTN, Single Task, MTN-P, DIW [NIPS’16]; label “Without”] Baselines: [Eigen, NIPS2014], [DIW, NIPS2016]. Nighttime, 1~50 m. Ordinal accuracy (ξ): higher is better.

      Method     ξ<10    ξ<20    ξ<30
      STN        0.3233  0.6237  0.7317
      LsMTN      0.3405  0.6855  0.7753
      DsMTN      0.3745  0.6820  0.7797
      MTN-P      0.3096  0.6225  0.7397
      MTN        0.4666  0.7026  0.7757
      STN-RGB    0.2508  0.3284  0.3592
      Eigen-RGB  0.1728  0.2442  0.3064
      Eigen-T    0.2033  0.6178  0.7516
      DIW-RGB    0.1404  0.3176  0.3805
      DIW-T      0.3744  0.7459  0.8401
  • 24. Experimental videos. Colors are mapped for visualization. The 3D information comes from a single monocular thermal image. Only the red part is used for inference.
  • 25. Conclusion. We employ multi-task learning for depth estimation. Novel architecture for multi-task learning: the Interleaver, placed in every skip-connected layer, which consists of 1. a pooling mechanism + L2 normalization (enlarges the receptive field), 2. a gated unit via convolution, and 3. up-sampling. Photometric correction helps in dealing with thermal images. The adaptive sigmoid function helps training converge stably.
  • 26. Thank you. Q & A. You can download the dataset and code at http://multispectral.kaist.ac.kr