Convolutional Features for
Instance Search
Amaia Salvador
03/05/2016
2
Related Publications
E. Mohedano, A. Salvador, K. McGuinness, F. Marques, N. E. O'Connor and X. Giro,
Bags of Local Convolutional Features for Scalable Instance Search
Accepted at ICMR 2016
A. Salvador, X. Giro, F. Marques, S. Satoh,
Faster R-CNN Features for Instance Search
Accepted at DeepVision CVPRW 2016
Part I
E. Mohedano, A. Salvador, K. McGuinness, F. Marques, N. E. O'Connor and X. Giro,
Bags of Local Convolutional Features for Scalable Instance Search
Visual Image Retrieval
4
Image Database
Visual Query
“A dog”
Expected outcome:
Visual Instance Retrieval
5
Image Database
Visual Query
“This dog”
Expected outcome:
Visual Instance Retrieval
6
Image Representations
Query image: v = (v_1, …, v_n)
Image Database: v_1 = (v_{11}, …, v_{1n}), …, v_k = (v_{k1}, …, v_{kn})
Image Matching: Similarity Metric (e.g. cosine similarity)
Ranking List (similarity score per image): 0.98, 0.97, ..., 0.10, 0.01
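The matching step on this slide can be sketched as follows. This is a minimal illustration, assuming each image is already described by a fixed-length vector; the function name `rank_by_cosine` and the toy vectors are hypothetical and not taken from the talk.

```python
import numpy as np

def rank_by_cosine(query, database):
    """Return database indices sorted by decreasing cosine similarity, with their scores."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q                      # cosine similarity of every database row with the query
    order = np.argsort(-scores, kind="stable")
    return order, scores[order]

# Toy data (hypothetical): one query and three database vectors.
query = np.array([1.0, 0.0, 1.0])
database = np.array([
    [2.0, 0.0, 2.0],   # same direction as the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [1.0, 1.0, 1.0],   # partially aligned with the query
])
order, scores = rank_by_cosine(query, database)
```

Because both sides are L2-normalized, the dot product equals the cosine similarity, so the ranking list is simply the database sorted by that score.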
7
Image Representations
Local hand-crafted features (e.g. SIFT) in an N-dimensional feature space
Bag of Visual Words: high-dimensional, highly sparse
v_1 = (v_{11}, …, v_{1n}), …, v_k = (v_{k1}, …, v_{kn})
INVERTED FILE
word    Image ID
1       1, 12
2       1, 30, 102
3       10, 12
4       2, 3
6       10
...
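A sparse bag-of-words representation is what makes an inverted file practical: each visual word points to the images that contain it, so a query only touches images sharing at least one word. The sketch below is illustrative only; the helper names and the toy word assignments are assumptions, loosely modeled on the table above.

```python
from collections import defaultdict

def build_inverted_file(image_words):
    """Map each visual word to the sorted list of image IDs that contain it."""
    index = defaultdict(set)
    for image_id, words in image_words.items():
        for word in words:
            index[word].add(image_id)
    return {word: sorted(ids) for word, ids in index.items()}

def candidates(index, query_words):
    """Count shared visual words per image; most shared words first, ties by image ID."""
    votes = defaultdict(int)
    for word in set(query_words):
        for image_id in index.get(word, []):
            votes[image_id] += 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy assignments: image ID -> visual words found in that image.
image_words = {1: [1, 2], 12: [1, 3], 10: [3, 6], 30: [2]}
index = build_inverted_file(image_words)
ranked = candidates(index, [1, 3])
```

Counting shared words is the simplest possible scoring; a real system would typically weight words (for example with tf-idf) before comparing images.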
8
Image Representations
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In
Advances in neural information processing systems (pp. 1097-1105).
Convolutional Neural Networks
9
Image Representations
Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV 2014
Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: an astounding baseline for recognition. In
DeepVision CVPRW 2014
Convolutional Neural Networks FC layers as global feature representation
10
Image Representations
Babenko, A., & Lempitsky, V. (2015). Aggregating local deep features for image retrieval. ICCV 2015
Tolias, G., Sicre, R., & Jégou, H. (2015). Particular object retrieval with integral max-pooling of CNN activations. ICLR 2016
Kalantidis, Y., Mellina, C., & Osindero, S. (2015). Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv
preprint arXiv:1512.04065.
Convolutional Neural Networks
sum/max pooled conv features as global representation
11
Image Representations
Ng, J., Yang, F., & Davis, L. (2015). Exploiting local features from deep networks for image retrieval. In DeepVision CVPRW 2015
Convolutional Neural Networks
conv features encoded with VLAD as global representation
12
Motivation
Dataset Complexity
TRECVID Instance Search
464 hours of video content
13
Motivation: Image Representations
High-dimensional & Sparse
Bag of Visual Words
Compact & Dense
(e.g. sum/max pooling conv feats, FC feats)
Capacity?
High-dimensional & Dense
(e.g. VLAD encoding)
Scalability?
Methodology
15
Bag of Words Framework
16
Bag of Words Framework
Image resolution: 336x256
conv5_1 from VGG16 [1] (42x32)
25K centroids → 25K-D vector
[1] Simonyan K., Zisserman A., Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv 2014
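The encoding step of this framework can be sketched as a nearest-centroid assignment followed by a histogram. This is a rough sketch under stated assumptions: random arrays stand in for real conv5_1 activations and for centroids that would normally come from clustering, the codebook is shrunk from 25K to 100 entries to keep the example light, and `bow_encode` is a hypothetical name.

```python
import numpy as np

def bow_encode(local_features, centroids):
    """Assign each local feature to its nearest centroid and count the assignments."""
    # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2, shape (num_features, K).
    d2 = (
        (local_features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * local_features @ centroids.T
        + (centroids ** 2).sum(axis=1)
    )
    assignments = d2.argmin(axis=1)
    histogram = np.bincount(assignments, minlength=len(centroids))
    return assignments, histogram

rng = np.random.default_rng(0)
H, W, D, K = 32, 42, 512, 100             # K is 25,000 on the slide; kept small here
feature_map = rng.random((H, W, D))       # placeholder for a 42x32 conv5_1 feature map
centroids = rng.random((K, D))            # placeholder for a learned codebook
assignments, histogram = bow_encode(feature_map.reshape(-1, D), centroids)
assignment_map = assignments.reshape(H, W)  # one visual word per spatial location
```

The histogram is the K-dimensional bag-of-words vector for the image, while `assignment_map` keeps the spatial layout of the visual words, which is useful when only a region of the image should be described.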
17
Instance Retrieval
Query Representation
Global Search (GS)
Local Search (LS)
18
Spatial Reranking
Query image → image representation v = (v_1, …, v_n)
Image Database → image representations v_1 = (v_{11}, …, v_{1n}), …, v_k = (v_{k1}, …, v_{kn})
Image Matching with a Similarity Metric (cosine similarity) → Ranking List
Top M images are locally analyzed and reranked (M = 100)
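The reranking stage only reorders the head of the initial ranking. A minimal sketch, assuming some local scoring function is available for each candidate image; `local_score` and the toy values below are hypothetical placeholders, not the window-based analysis used in the talk.

```python
def rerank_top_m(ranking, local_score, m=100):
    """Reorder the first m entries of `ranking` by `local_score`; keep the tail unchanged."""
    head = sorted(ranking[:m], key=local_score, reverse=True)
    return head + list(ranking[m:])

# Hypothetical example: image IDs in their initial order and their local scores.
ranking = [5, 3, 8, 1]
local_scores = {5: 0.2, 3: 0.9, 8: 0.5, 1: 1.0}
reranked = rerank_top_m(ranking, local_scores.get, m=3)
```

Image 1 stays in last place despite its high local score, because it falls outside the top M and is never analyzed; this is the trade-off that keeps the local analysis cheap.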
19
Spatial Reranking
All window combinations with:
Query Image Target image in top M ranking
20
Query Expansion
Query image → image representation v = (v_1, …, v_n)
Image Database → image representations v_1 = (v_{11}, …, v_{1n}), …, v_k = (v_{k1}, …, v_{kn})
Image Matching with a Similarity Metric (cosine similarity) → Ranking List
Top N images are added to the query for a new search (N = 5)
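One common way to "add" the top N images to the query is to sum their normalized descriptors with the query descriptor and search again. The sketch below assumes that form of query expansion; the slide does not specify how the images are combined, and the function names and toy vectors are hypothetical.

```python
import numpy as np

def l2n(x):
    """L2-normalize a vector or each row of a matrix."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def search(query, database):
    """Return database indices sorted by decreasing cosine similarity."""
    scores = l2n(database) @ l2n(query)
    return np.argsort(-scores, kind="stable")

def query_expansion(query, database, top_n=5):
    """Run a first search, merge the top_n results into the query, and search again."""
    first_pass = search(query, database)
    top = database[first_pass[:top_n]]
    expanded = l2n(query) + l2n(top).sum(axis=0)   # assumed combination rule
    return search(expanded, database)

# Hypothetical toy database of four 3-D descriptors.
database = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])
query = np.array([1.0, 0.0, 0.0])
first = search(query, database)
second = query_expansion(query, database, top_n=2)
```

In this toy case the last image shares nothing with the original query, but it does overlap with the second retrieved image, so the expanded query moves it ahead of the third image.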
Experiments
22
Datasets
Paris Buildings 6k
Oxford Buildings 5k
TRECVID Instance Search 2013 (subset of 23k frames)
Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching. CVPR 2007
Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. CVPR 2008
Smeaton, A. F., Over, P., & Kraaij, W. Evaluation campaigns and TRECVid. ACM MM Multimedia information retrieval Workshop 2006
23
Results I: SoA Comparison
24
Results II: TRECVid INS
25
Qualitative Results
26
Conclusion
BoW encoding of convolutional features
• High-dimensional sparse representation suitable for fast retrieval
• Competitive results in two image retrieval benchmarks
• Well suited and more robust for scenarios where only a small number of features in
the target images are relevant to the query (INS).
Part II
A. Salvador, X. Giro, F. Marques, S. Satoh,
Faster R-CNN Features for Instance Search
28
Reminder: Spatial Reranking
Query Image Target image in top M ranking
29
Reminder: Spatial Reranking
Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders. Segmentation as Selective
Search for Object Recognition, ICCV 2011
Object Proposals
30
Image & Region Representations
CNN Architectures
Image Classification: CNN → “dog”
Object Detection: CNN → plant, table, dog
31
Image & Region Representations
Faster R-CNN
Conv layers → Conv5_3 → Region Proposal Network → RPN Proposals → RoI Pooling → FC6 → FC7 → FC8 → Class probabilities
Ren, S., He, K., Girshick, R., & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal
networks. NIPS 2015
32
Image & Region Representations
Faster R-CNN
Image representation
Region Representation
33
Image & Region Representations
Image representation: Conv5_3 → sum-pooling → D-dimensional vector
Region Representation (for reranking): RoI Pooling → max-pooling → D-dimensional vector
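Both descriptors can be approximated from a single conv feature map. The following is a simplified sketch, not the Faster R-CNN RoI pooling layer itself: it assumes a feature stride of 16 pixels for VGG16 conv5_3, takes a plain maximum over the feature-map cells covered by a box as a stand-in for RoI pooling followed by max-pooling, and adds L2 normalization as an extra assumption. All names are hypothetical.

```python
import numpy as np

STRIDE = 16  # assumed cumulative stride of VGG16 conv5_3

def _l2n(v):
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def image_descriptor(conv_map):
    """Sum-pool a (H, W, D) feature map into one D-dimensional image vector."""
    return _l2n(conv_map.sum(axis=(0, 1)))

def region_descriptor(conv_map, box, stride=STRIDE):
    """Max-pool the feature-map cells covered by box = (x0, y0, x1, y1) in image pixels."""
    h, w, _ = conv_map.shape
    x0, y0, x1, y1 = box
    c0 = min(max(int(np.floor(x0 / stride)), 0), w - 1)
    r0 = min(max(int(np.floor(y0 / stride)), 0), h - 1)
    c1 = min(max(int(np.ceil(x1 / stride)), c0 + 1), w)   # at least one column
    r1 = min(max(int(np.ceil(y1 / stride)), r0 + 1), h)   # at least one row
    return _l2n(conv_map[r0:r1, c0:c1].max(axis=(0, 1)))

# Hypothetical 4x4 feature map with D = 2 channels and two active cells.
conv_map = np.zeros((4, 4, 2))
conv_map[1, 2, 0] = 5.0
conv_map[3, 3, 1] = 7.0
image_vec = image_descriptor(conv_map)
region_vec = region_descriptor(conv_map, (32, 16, 48, 32))
```

The image vector mixes both activations, while the region vector only sees the cell under the box; that locality is what makes region descriptors useful for reranking.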
34
Fine tuning for query objects
Faster R-CNN
Train object detector for query instances using query images as training data
35
Fine tuning for query objects
FT #1: Train FC layers only
36
Fine tuning for query objects
FT #2: Train all weights after conv2
37
Spatial Reranking Strategies
Class-agnostic Spatial Reranking (CA-SR)
Class-specific Spatial Reranking (CS-SR)
Figure labels: Query Image, Database Image, FC6 → FC7 → FC8 → Class probabilities
38
Results
39
Query image, Top N retrieved images
40
Conclusion
Faster R-CNN for Instance Search
• Suitable to obtain image and region features in a single forward pass
• Fine tuning as an effective solution to boost retrieval performance (subject to
application time constraints)
41
Thank you for your attention!
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Universitat Politècnica de Catalunya
 
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Deep Learning Representations for All - Xavier Giro-i-Nieto - IRI Barcelona 2020
Universitat Politècnica de Catalunya
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Universitat Politècnica de Catalunya
 

Convolutional Features for Instance Search

  • 1. Convolutional Features for Instance Search Amaia Salvador 03/05/2016
  • 2. Related Publications. E. Mohedano, A. Salvador, K. McGuinness, F. Marques, N. E. O'Connor and X. Giro, Bags of Local Convolutional Features for Scalable Instance Search, accepted at ICMR 2016. A. Salvador, X. Giro, F. Marques, S. Satoh, Faster R-CNN Features for Instance Search, accepted at DeepVision CVPRW 2016.
  • 3. Part I. E. Mohedano, A. Salvador, K. McGuinness, F. Marques, N. E. O'Connor and X. Giro, Bags of Local Convolutional Features for Scalable Instance Search.
  • 4. Visual Image Retrieval. Image database; visual query: “A dog”. Expected outcome: [figure]
  • 5. Visual Instance Retrieval. Image database; visual query: “This dog”. Expected outcome: [figure]
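  • 6. Visual Instance Retrieval. Pipeline: image representations → image matching → ranking list. The query image is represented as $v = (v_1, \dots, v_n)$ and the database images as $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$. A similarity metric (e.g. cosine similarity) scores each database image against the query, and the ranking list orders the images by similarity score (e.g. 0.98, 0.97, ..., 0.10, 0.01).
  • 7. Image Representations: Bag of Visual Words. Local hand-crafted features (e.g. SIFT) in an N-dimensional feature space are quantized into visual words, giving high-dimensional, highly sparse vectors $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$. These are indexed in an inverted file (word → image IDs): 1 → 1, 12; 2 → 1, 30, 102; 3 → 10, 12; 4 → 2, 3; 6 → 10; ...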
  • 8. Image Representations: Convolutional Neural Networks. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
  • 9. Image Representations: Convolutional Neural Networks, FC layers as global feature representation. Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV 2014. Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: an astounding baseline for recognition. In DeepVision CVPRW 2014.
  • 10. Image Representations: Convolutional Neural Networks, sum/max pooled conv features as global representation. Babenko, A., & Lempitsky, V. (2015). Aggregating local deep features for image retrieval. ICCV 2015. Tolias, G., Sicre, R., & Jégou, H. (2015). Particular object retrieval with integral max-pooling of CNN activations. ICLR 2015. Kalantidis, Y., Mellina, C., & Osindero, S. (2015). Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv preprint arXiv:1512.04065.
  • 11. Image Representations: Convolutional Neural Networks, conv features encoded with VLAD as global representation. Ng, J., Yang, F., & Davis, L. (2015). Exploiting local features from deep networks for image retrieval. In DeepVision CVPRW 2015.
  • 12. Motivation: Dataset Complexity. TRECVID Instance Search: 464 hours of video content.
  • 13. Motivation: Image Representations. High-dimensional & sparse: Bag of Visual Words. Compact & dense (e.g. sum/max pooling conv feats, FC feats): capacity? High-dimensional & dense (e.g. VLAD encoding): scalability?
  • 15. Bag of Words Framework
  • 16. Bag of Words Framework. Image at 336x256 resolution → conv5_1 feature map from VGG16 [1] (42x32) → 25K centroids → 25K-D vector. [1] Simonyan K., Zisserman A., Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv 2014.
  • 17. Instance Retrieval: Query Representation. Global Search (GS); Local Search (LS).
  • 18. Spatial Reranking. The query image $v = (v_1, \dots, v_n)$ is matched against the database images $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$ with cosine similarity to produce a ranking list. The top M images are locally analyzed and reranked (M = 100).
  • 19. Spatial Reranking. All window combinations with: ... Query image; target image in top M ranking.
  • 20. Query Expansion. The query image $v = (v_1, \dots, v_n)$ is matched against the database images $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$ with cosine similarity to produce a ranking list. The top N images are added to the query for a new search (N = 5).
  • 22. Datasets: Paris Buildings 6k; Oxford Buildings 5k; TRECVID Instance Search 2013 (subset of 23k frames). Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching. CVPR 2007. Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. CVPR 2008. Smeaton, A. F., Over, P., & Kraaij, W. Evaluation campaigns and TRECVid. ACM MM Multimedia Information Retrieval Workshop 2006.
  • 23. Results I: State-of-the-art (SoA) comparison
  • 26. Conclusion. BoW encoding of convolutional features: a high-dimensional sparse representation suitable for fast retrieval; competitive results in two image retrieval benchmarks; well suited and more robust for scenarios where only a small number of features in the target images are relevant to the query (INS).
  • 27. Part II. A. Salvador, X. Giro, F. Marques, S. Satoh, Faster R-CNN Features for Instance Search.
  • 28. Reminder: Spatial Reranking. Query image; target image in top M ranking.
  • 29. Reminder: Spatial Reranking. Object proposals. Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders. Segmentation as Selective Search for Object Recognition, ICCV 2011.
  • 30. Image & Region Representations: CNN Architectures. Image classification: CNN → “dog”. Object detection: CNN → plant, table, dog.
  • 31. Image & Region Representations: Faster R-CNN. Diagram components: conv layers (Conv5_3), Region Proposal Network (RPN) producing RPN proposals, RoI Pooling, FC6, FC7, FC8, class probabilities. Ren, S., He, K., Girshick, R., & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015.
  • 32. Image & Region Representations: Faster R-CNN. Same diagram (conv layers, Conv5_3, RPN proposals, RoI Pooling, FC6, FC7, FC8, class probabilities), now marking where the image representation and the region representation are taken. Ren, S., He, K., Girshick, R., & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015.
  • 33. Image & Region Representations. Image representation and region representation (for reranking): activations from Conv5_3 and RoI Pooling are aggregated with sum-pooling and max-pooling into vectors of dimension D.
  • 34. Fine tuning for query objects. Train an object detector for the query instances, using the query images as training data. (Diagram: Faster R-CNN with conv layers, Conv5_3, Region Proposal Network, RPN proposals, RoI Pooling, FC6, FC7, FC8, class probabilities.)
  • 35. Fine tuning for query objects. FT #1: train FC layers only.
  • 36. Fine tuning for query objects. FT #2: train all weights after conv2.
  • 37. Spatial Reranking Strategies. Class-agnostic Spatial Reranking (CA-SR); Class-specific Spatial Reranking (CS-SR). (Diagram: query image, database image, FC6, FC7, FC8, class probabilities.)
  • 39. Query image; top N retrieved images [figure]
  • 40. Conclusion. Faster R-CNN for Instance Search: suitable to obtain image and region features in a single forward pass; fine tuning as an effective solution to boost retrieval performance (subject to application time constraints). (Diagram: Faster R-CNN with the image representation and region representation marked.)
  • 41. Thank you for your attention! E. Mohedano, A. Salvador, K. McGuinness, F. Marques, N. E. O'Connor and X. Giro, Bags of Local Convolutional Features for Scalable Instance Search, accepted at ICMR 2016. A. Salvador, X. Giro, F. Marques, S. Satoh, Faster R-CNN Features for Instance Search, accepted at DeepVision CVPRW 2016.
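
The Part I pipeline (slides 16 to 20) can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the authors' implementation. The function names `bow_encode`, `cosine_rank` and `expand_query` are hypothetical. The codebook has 4 centroids in 2-D instead of the 25K centroids on slide 16. Hard nearest-centroid assignment is assumed. Query expansion is assumed to mean averaging the query vector with the vectors of the top N results, since the slides only say the top N images are "added to the query". Spatial reranking is left out.

```python
import numpy as np

def bow_encode(local_feats, centroids):
    """Bag-of-words histogram: count local features per nearest centroid (assumed hard assignment)."""
    # Squared Euclidean distance between every local feature and every centroid.
    d2 = ((local_feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # visual word index for each local feature
    return np.bincount(words, minlength=len(centroids)).astype(float)

def cosine_rank(query_vec, db_vecs):
    """Return database indices sorted by decreasing cosine similarity, plus the scores."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    db = db_vecs / (np.linalg.norm(db_vecs, axis=1, keepdims=True) + 1e-12)
    scores = db @ q
    order = np.argsort(-scores, kind="stable")
    return order, scores

def expand_query(query_vec, db_vecs, order, top_n=5):
    """Assumed query expansion: average the query with its top_n ranked database vectors."""
    return (query_vec + db_vecs[order[:top_n]].sum(axis=0)) / (top_n + 1)

# Toy codebook and local features (stand-ins for conv5_1 activations at each spatial location).
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query_feats = np.array([[0.1, 0.0], [0.9, 0.1], [0.9, 0.9], [1.0, 1.0]])
query_bow = bow_encode(query_feats, centroids)

# Toy database of three already-encoded images.
db_bow = np.array([[1.0, 1.0, 0.0, 2.0],
                   [0.0, 0.0, 3.0, 0.0],
                   [2.0, 0.0, 0.0, 1.0]])

order, scores = cosine_rank(query_bow, db_bow)
expanded = expand_query(query_bow, db_bow, order, top_n=2)
new_order, _ = cosine_rank(expanded, db_bow)
print(order, new_order)
```

At database scale the dense matrix product would be replaced by an inverted-file lookup as on slide 7, which only visits images sharing at least one visual word with the query.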
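
For Part II, the pooled descriptors on slide 33 and the region-based reranking can be sketched in the same way. This is a simplified illustration, not the authors' code. The names `image_descriptor`, `region_descriptor` and `rerank` are hypothetical. Pairing sum-pooling with the image descriptor and max-pooling with the region descriptor is one reading of the slide. A region is cropped directly from the feature map instead of going through a real RoI Pooling layer, and boxes are assumed to be given in feature-map coordinates.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps avoids division by zero)."""
    return v / (np.linalg.norm(v) + eps)

def image_descriptor(conv_map):
    """Sum-pool a (D, H, W) conv feature map over all spatial locations."""
    return l2_normalize(conv_map.sum(axis=(1, 2)))

def region_descriptor(conv_map, box):
    """Max-pool the activations inside box = (x0, y0, x1, y1), end-exclusive, feature-map coords."""
    x0, y0, x1, y1 = box
    return l2_normalize(conv_map[:, y0:y1, x0:x1].max(axis=(1, 2)))

def rerank(query_desc, conv_maps, boxes_per_image):
    """Score each image by its best-matching region; return the new order and (score, box) pairs."""
    best = []
    for conv_map, boxes in zip(conv_maps, boxes_per_image):
        region_scores = [float(region_descriptor(conv_map, b) @ query_desc) for b in boxes]
        k = int(np.argmax(region_scores))
        best.append((region_scores[k], boxes[k]))
    order = sorted(range(len(best)), key=lambda i: -best[i][0])
    return order, best

# Toy feature maps with D = 2 channels on a 3x3 grid.
map_a = np.zeros((2, 3, 3))
map_a[0, 0, 0] = 3.0
map_a[1, 2, 2] = 4.0
map_b = np.zeros((2, 3, 3))
map_b[1, 1, 1] = 2.0

query_desc = np.array([1.0, 0.0])  # hypothetical query-region descriptor
order, best = rerank(
    query_desc,
    [map_b, map_a],
    [[(0, 0, 3, 3)], [(0, 0, 2, 2), (1, 1, 3, 3)]],
)
print(order, best)
```

The box that wins for each image doubles as a rough localization of the query object, which is what makes region-level scoring useful for the top M images.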