Convolutional Features for Instance Search

Amaia Salvador
03/05/2016

Related Publications
E. Mohedano, A. Salvador, K. McGuinness, F. Marques, N. E. O'Connor and X. Giro,
Bags of Local Convolutional Features for Scalable Instance Search
Accepted at ICMR 2016
A. Salvador, X. Giro, F. Marques, S. Satoh,
Faster R-CNN Features for Instance Search
Accepted at DeepVision CVPRW 2016

Part I

Visual Image Retrieval
Image Database
Visual Query
“A dog”
Expected outcome:

Visual Instance Retrieval
Image Database
Visual Query
“This dog”
Expected outcome:

Visual Instance Retrieval
Image Representations
Query image: $v = (v_1, \dots, v_n)$
Image Database: $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$
Image Matching: Similarity Metric (e.g. cosine similarity)
Ranking List: Image, Similarity score (0.98, 0.97, 0.10, 0.01, ...)
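The matching step on this slide (one vector per image, a similarity metric, a ranked list) can be sketched as follows. This is a minimal illustration assuming NumPy, non-zero vectors, and made-up toy values; the function name `rank_by_cosine` is hypothetical and not taken from the slides.

```python
import numpy as np

def rank_by_cosine(query, database):
    """Return database row indices sorted by decreasing cosine similarity,
    together with the sorted similarity scores."""
    # L2-normalize so that a dot product equals the cosine similarity.
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q
    order = np.argsort(-scores)  # highest score first
    return order, scores[order]

# Toy data (hypothetical values, not from the slides).
query = np.array([1.0, 0.0, 1.0])
database = np.array([
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially similar
])
order, scores = rank_by_cosine(query, database)
```

Because the vectors are normalized first, the image pointing in the same direction as the query ranks first regardless of its magnitude.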

Image Representations
Local hand-crafted features (e.g. SIFT) in an N-dimensional feature space
Bag of Visual Words: High-dimensional, Highly sparse
$v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$

INVERTED FILE
word    Image ID
1       1, 12,
2       1, 30, 102
3       10, 12
4       2, 3
6       10
...
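An inverted file like the one on this slide maps each visual word to the images that contain it, so a query only touches images sharing at least one word with it. The sketch below uses plain Python dictionaries; the function names and the toy assignments are hypothetical and only loosely mirror the table.

```python
from collections import defaultdict

def build_inverted_file(image_words):
    """Map each visual word ID to the sorted list of image IDs containing it."""
    inverted = defaultdict(set)
    for image_id, words in image_words.items():
        for word in words:
            inverted[word].add(image_id)
    return {word: sorted(ids) for word, ids in inverted.items()}

def candidates(inverted, query_words):
    """Return IDs of images that share at least one visual word with the query."""
    found = set()
    for word in set(query_words):
        found.update(inverted.get(word, []))
    return sorted(found)

# Hypothetical toy assignments: image ID -> visual words found in that image.
image_words = {1: [1, 2], 12: [1, 3], 10: [3, 6], 30: [2]}
inverted = build_inverted_file(image_words)
shortlist = candidates(inverted, [2, 6])
```

Images with no word in common with the query are never visited, which is what makes sparse representations attractive for large databases.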

Convolutional Neural Networks
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).

Convolutional Neural Networks
FC layers as global feature representation
Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV 2014
Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: an astounding baseline for recognition. In DeepVision CVPRW 2014

sum/max pooled conv features as global representation
Babenko, A., & Lempitsky, V. (2015). Aggregating local deep features for image retrieval. ICCV 2015
Tolias, G., Sicre, R., & Jégou, H. (2015). Particular object retrieval with integral max-pooling of CNN activations. ICLR 2015
Kalantidis, Y., Mellina, C., & Osindero, S. (2015). Cross-dimensional Weighting for Aggregated Deep Convolutional Features. arXiv preprint arXiv:1512.04065.
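Sum and max pooling of convolutional features into a global descriptor, as named on this slide, can be sketched in a few lines. The layout `(H, W, D)` for the feature map and the final L2 normalization are assumptions made for illustration; the cited papers add their own weighting and post-processing steps that are not reproduced here.

```python
import numpy as np

def pool_conv_features(feature_map, mode="sum"):
    """Collapse a (H, W, D) feature map into one L2-normalized D-dim vector."""
    if mode == "sum":
        pooled = feature_map.sum(axis=(0, 1))   # add activations over all locations
    elif mode == "max":
        pooled = feature_map.max(axis=(0, 1))   # keep the strongest activation per channel
    else:
        raise ValueError("mode must be 'sum' or 'max'")
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled

# Hypothetical 2x2 feature map with D = 2 channels.
fmap = np.zeros((2, 2, 2))
fmap[..., 0] = [[1.0, 0.0], [2.0, 0.0]]
fmap[..., 1] = [[1.0, 1.0], [2.0, 0.0]]
sum_vec = pool_conv_features(fmap, "sum")
max_vec = pool_conv_features(fmap, "max")
```

Either way the output has D dimensions regardless of image size, which is why these descriptors are compact and dense.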

conv features encoded with VLAD as global representation
Ng, J., Yang, F., & Davis, L. (2015). Exploiting local features from deep networks for image retrieval. In DeepVision CVPRW 2015
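For reference, a basic VLAD encoding accumulates, for every centroid, the residuals of the local features assigned to it. The sketch below is a generic textbook version under assumed shapes `(N, D)` for features and `(K, D)` for centroids; it is not the exact pipeline of the cited work, and `vlad_encode` is a hypothetical name.

```python
import numpy as np

def vlad_encode(features, centroids):
    """Encode (N, D) local features against (K, D) centroids as a K*D vector."""
    # Squared Euclidean distance between every feature and every centroid: (N, K).
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    vlad = np.zeros(centroids.shape, dtype=float)
    for k in range(centroids.shape[0]):
        members = features[nearest == k]
        if len(members):
            # Sum of residuals between the assigned features and their centroid.
            vlad[k] = (members - centroids[k]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Hypothetical toy data: two centroids, two local features.
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
features = np.array([[3.0, 0.0], [10.0, 14.0]])
code = vlad_encode(features, centroids)
```

The output has K*D dimensions and is generally dense, which matches the "High-dimensional & Dense" label used for VLAD later in the deck.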

Motivation
Dataset Complexity
TRECVID Instance Search
464 hours of video content

Motivation: Image Representations
High-dimensional & Sparse
Bag of Visual Words
Compact & Dense
(e.g. sum/max pooling conv feats, FC feats)
Capacity?
High-dimensional & Dense
(e.g. VLAD encoding)
Scalability?

Bag of Words Framework
Resolution (336x256)
conv5_1 from VGG16 [1] (42x32)
25K centroids, 25K-D vector
[1] Simonyan K., Zisserman A., Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv 2014
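The encoding step can be sketched as nearest-centroid assignment at every spatial location followed by a word histogram. If the 42x32 figure on the slide is the spatial size of the feature map, an image contributes 42 x 32 = 1344 local features, so at most 1344 of the 25K histogram bins can be non-zero. The code below is a toy illustration with hypothetical names and tiny sizes; its brute-force distance computation would not be practical with 25K centroids.

```python
import numpy as np

def assignment_map(feature_map, centroids):
    """Assign each location of a (H, W, D) feature map to its nearest centroid,
    returning an (H, W) map of visual word IDs."""
    h, w, d = feature_map.shape
    feats = feature_map.reshape(-1, d)
    dists = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1).reshape(h, w)

def bow_histogram(assignments, n_words):
    """Count how often each visual word occurs in an assignment map."""
    return np.bincount(assignments.ravel(), minlength=n_words)

# Hypothetical toy setup: 3 centroids in a 2-D feature space, 2x2 feature map.
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
fmap = np.array([
    [[0.1, 0.0], [0.9, 0.1]],
    [[0.0, 0.8], [1.1, 0.0]],
])
amap = assignment_map(fmap, centroids)
hist = bow_histogram(amap, n_words=3)
```

Keeping the assignment map, not just the histogram, preserves where each visual word occurs in the image.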

Instance Retrieval
Query Representation
Global Search (GS)
Local Search (LS)
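One plausible reading of the two query representations, stated here as an assumption and not confirmed by the slide text, is that Global Search builds the query histogram from the whole query image while Local Search uses only the visual words inside the query bounding box. Under that assumption, and with the box given in feature-map coordinates, a sketch looks like this (all names are hypothetical):

```python
import numpy as np

def query_histograms(assignments, box, n_words):
    """Return (global, local) BoW histograms for a query assignment map.

    box = (row0, col0, row1, col1), end-exclusive, in feature-map coordinates.
    """
    r0, c0, r1, c1 = box
    # Global: count visual words over the whole map.
    global_hist = np.bincount(assignments.ravel(), minlength=n_words)
    # Local: count only the visual words that fall inside the bounding box.
    local_hist = np.bincount(assignments[r0:r1, c0:c1].ravel(), minlength=n_words)
    return global_hist, local_hist

# Hypothetical 3x3 assignment map over a vocabulary of 4 visual words.
assignments = np.array([
    [0, 1, 1],
    [2, 3, 1],
    [2, 2, 0],
])
gs_hist, ls_hist = query_histograms(assignments, box=(0, 1, 2, 3), n_words=4)
```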

Spatial Reranking
Query image: $v = (v_1, \dots, v_n)$
Image Database: $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$
Similarity Metric (cosine similarity)
Top M images are locally analyzed and reranked (M = 100)

Spatial Reranking
All window combinations with:
Query Image Target image in top M ranking
...
...
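A brute-force version of window-based reranking can be sketched as follows: every rectangular window of a target assignment map is encoded as a BoW histogram, compared with the query histogram, and the best window score is used to reorder the top M images. The window constraints from the slide are not recoverable here, so this sketch simply enumerates all windows; the function names and toy data are hypothetical, and exhaustive enumeration is only practical on small maps.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_window(query_hist, assignments, n_words):
    """Score every rectangular window of an assignment map; return (score, box)."""
    h, w = assignments.shape
    best = (-1.0, None)
    for r0 in range(h):
        for r1 in range(r0 + 1, h + 1):
            for c0 in range(w):
                for c1 in range(c0 + 1, w + 1):
                    window = assignments[r0:r1, c0:c1].ravel()
                    hist = np.bincount(window, minlength=n_words)
                    score = cosine(query_hist, hist)
                    if score > best[0]:
                        best = (score, (r0, c0, r1, c1))
    return best

def spatial_rerank(query_hist, ranked_ids, maps, n_words, m=100):
    """Reorder the first m entries of ranked_ids by their best window score."""
    head = sorted(ranked_ids[:m],
                  key=lambda i: -best_window(query_hist, maps[i], n_words)[0])
    return head + list(ranked_ids[m:])

# Hypothetical toy data: the query contains only visual word 1.
query_hist = np.array([0, 2, 0])
maps = {"a": np.array([[0, 0], [0, 2]]), "b": np.array([[0, 1], [2, 1]])}
reranked = spatial_rerank(query_hist, ["a", "b"], maps, n_words=3, m=2)
score_b, box_b = best_window(query_hist, maps["b"], n_words=3)
```

The winning window also gives a rough localization of the query object in the target image.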

Query Expansion
Query image: $v = (v_1, \dots, v_n)$
Image Database: $v_1 = (v_{11}, \dots, v_{1n})$, ..., $v_k = (v_{k1}, \dots, v_{kn})$
Similarity Metric (cosine similarity)
Top N images are added to the query for a new search (N = 5)
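A common way to "add" the top N images to the query is to sum the normalized query vector with the normalized vectors of the top N results and search again with the result. The slide does not say how the vectors are combined, so the summation below is an assumption, and `expand_query` is a hypothetical name.

```python
import numpy as np

def expand_query(query, database, n=5):
    """Return an expanded, L2-normalized query and the indices of the top-n images."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    top = np.argsort(-(db @ q))[:n]           # n most similar images
    expanded = q + db[top].sum(axis=0)        # assumption: combine by summation
    return expanded / np.linalg.norm(expanded), top

# Hypothetical toy data with n = 2 instead of the N = 5 used on the slide.
query = np.array([1.0, 0.0])
database = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
expanded, top = expand_query(query, database, n=2)
```

The expanded vector is then used as the query for a second pass through the same similarity ranking.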

Datasets
Paris Buildings 6k
Oxford Buildings 5k
TRECVID Instance Search 2013 (subset of 23k frames)
Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A. Object retrieval with large vocabularies and fast spatial matching, CVPR 2007
Philbin, J., Chum, O., Isard, M., Sivic, J. and Zisserman, A. Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases. CVPR 2008
Smeaton, A. F., Over, P., & Kraaij, W. Evaluation campaigns and TRECVid. ACM MM Multimedia information retrieval Workshop 2006

Conclusion
BoW encoding of convolutional features
• High-dimensional sparse representation suitable for fast retrieval
• Competitive results in two image retrieval benchmarks
• Well suited and more robust for scenarios where only a small number of features in the target images are relevant to the query (INS).

Part II

Reminder: Spatial Reranking
Query Image Target image in top M ranking
...
...

Reminder: Spatial Reranking
Object Proposals
Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders. Segmentation as Selective Search for Object Recognition, ICCV 2011

Image & Region Representations
CNN Architectures
Image Classification: CNN, “dog”
Object Detection: CNN, plant, table, dog

Faster R-CNN
[Diagram labels: Conv layers, Conv5_3, Region Proposal Network, RPN Proposals, RoI Pooling, FC6, FC7, FC8, Class probabilities]
Ren, S., He, K., Girshick, R., & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. NIPS 2015

Faster R-CNN
[Diagram as on the previous slide, annotated with: Image representation, Region Representation]

Image representation: Conv5_3, sum-pooling, D-dimensional vector
Region Representation (for reranking): RoI Pooling, max-pooling, D-dimensional vector
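The two representations can be illustrated on a single feature map: sum-pooling over every location for the image descriptor, and max-pooling restricted to one region for the region descriptor. This is a simplification that crops the map directly instead of running a real RoI pooling layer; the L2 normalization, the function names, and the toy values are assumptions.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length, leaving all-zero vectors unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def image_representation(conv_map):
    """Sum-pool a (H, W, D) conv map over all locations into a D-dim vector."""
    return l2_normalize(conv_map.sum(axis=(0, 1)))

def region_representation(conv_map, box):
    """Max-pool the activations inside box = (r0, c0, r1, c1), end-exclusive."""
    r0, c0, r1, c1 = box
    return l2_normalize(conv_map[r0:r1, c0:c1].max(axis=(0, 1)))

# Hypothetical 3x3 map with D = 2 channels and two active locations.
conv_map = np.zeros((3, 3, 2))
conv_map[0, 0] = [3.0, 0.0]
conv_map[2, 2] = [0.0, 4.0]
image_vec = image_representation(conv_map)                    # whole image
region_vec = region_representation(conv_map, (2, 2, 3, 3))    # bottom-right cell only
```

Both descriptors come from the same convolutional activations, so a single forward pass is enough to obtain them.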

Fine tuning for query objects
Faster R-CNN (diagram as on the previous slides)
Train object detector for query instances using query images as training data

FT #1: Train FC layers only

FT #2: Train all weights after conv2

Spatial Reranking Strategies
[Diagram labels: Query Image, Database Image, FC6, FC7, FC8, Class probabilities]
Class-agnostic Spatial Reranking (CA-SR)
Class-specific Spatial Reranking (CS-SR)
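The slide names the two strategies without detailing them. One reading, presented here strictly as an assumption, is that the class-agnostic variant scores a database image by the best cosine similarity between the query region descriptor and any proposal descriptor, while the class-specific variant uses the class probability that a fine-tuned detector assigns to the query's class. Function names and inputs below are hypothetical.

```python
import numpy as np

def ca_sr_score(query_region, proposals):
    """Class-agnostic (assumed): best cosine similarity between the query
    region descriptor and any of the (P, D) proposal descriptors."""
    q = query_region / np.linalg.norm(query_region)
    p = proposals / np.linalg.norm(proposals, axis=1, keepdims=True)
    return float((p @ q).max())

def cs_sr_score(class_probs, query_class):
    """Class-specific (assumed): highest probability of the query's class
    over all proposals, given a (P, C) matrix of class probabilities."""
    return float(class_probs[:, query_class].max())

# Hypothetical toy inputs for one database image with two proposals.
query_region = np.array([1.0, 0.0])
proposals = np.array([[0.0, 1.0], [3.0, 4.0]])
class_probs = np.array([[0.9, 0.1], [0.3, 0.7]])
ca = ca_sr_score(query_region, proposals)
cs = cs_sr_score(class_probs, query_class=1)
```

In both cases the proposal that achieves the maximum also indicates where the object is likely located in the image.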

Query image, Top N retrieved images

Conclusion
Faster R-CNN for Instance Search
• Suitable to obtain image and region features in a single forward pass
• Fine tuning as an effective solution to boost retrieval performance (subject to application time constraints)

Thank you for your attention!
