deep learning
Algorithms and Applications
Bernardete Ribeiro, bribeiro@dei.uc.pt
University of Coimbra, Portugal
INIT/AERFAI Summer School on Machine Learning, Benicassim 22-26 June 2015
III - Deep Learning Algorithms
elements 3: deep neural networks
outline
∙ Learning in Deep Neural Networks
∙ Deep Learning: Evolution Timeline
∙ Deep Architectures
∙ Restricted Boltzmann Machines (RBMs)
∙ Deep Belief Networks (DBNs)
∙ Deep Models Overall Characteristics
learning in deep neural networks
1. No general learning algorithm (no-free-lunch theorem, Wolpert 1996)
2. Learning algorithms for specific tasks: perception, control, prediction, planning, reasoning, language understanding
3. Limitations of BP: local minima, optimization challenges for non-convex objective functions
4. Hinton's deep belief networks (DBNs) as stacks of RBMs
5. LeCun's energy-based learning for DBNs
deep learning: evolution timeline
1. Perceptron [Frank Rosenblatt, 1959]
2. Neocognitron [K. Fukushima, 1980]
3. Convolutional Neural Network (CNN) [LeCun, 1989]
4. Multi-level Hierarchy Networks [Jürgen Schmidhuber, 1992]
5. Deep Belief Networks (DBNs) as stack of RBMs [Geoffrey Hinton, 2006]
deep architectures
from brain-like computing to deep learning
∙ New empirical and theoretical results have brought deep architectures into the focus of Machine Learning (ML) research [Larochelle et al., 2007].
∙ Theoretical results suggest that deep architectures are fundamental for learning the kind of complicated, brain-like functions that can represent high-level abstractions (e.g. vision, speech, language) [Bengio, 2009].
deep concepts main idea
deep neural networks
∙ Convolutional Neural Networks (CNNs) [LeCun et al., 1989]
∙ Deep Belief Networks (DBNs) [Hinton et al., 2006]
∙ AutoEncoders (AEs) [Bengio et al., NIPS 2006]
∙ Sparse Autoencoders [Ranzato et al., NIPS 2006]
convolutional neural networks (cnns)
∙ A Convolutional Neural Network consists of two basic operations:
∙ convolution
∙ pooling
∙ Convolutional and pooling layers are arranged alternately until high-level features are obtained
∙ Several feature maps in each convolutional layer
∙ Weights within the same map are shared
(figure: a LeNet-style pipeline, input → C1 → S2 → C3 → S4 → NN, alternating convolutional (C) and subsampling/pooling (S) layers followed by a fully connected NN)¹
¹ I Arel, D Rose & T Karnowski, Deep Machine Learning—A New Frontier in Artificial Intelligence Research, IEEE, CIM, 2010
convolutional neural networks (cnns)
∙ Convolution: suppose the size of the layer is d × d and the size of the receptive fields is r × r; let γ and x denote, respectively, the values of the convolutional layer and of the previous layer:

γ_{ij} = g( Σ_{m=1}^{r} Σ_{n=1}^{r} x_{i+m−1,j+n−1} · w_{m,n} + b ),  i, j = 1, · · · , (d − r + 1)

where g is a nonlinear function.
∙ Pooling follows convolution to reduce the dimensionality of the features and to introduce translational invariance into the CNN.
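A minimal NumPy sketch of these two operations (the tanh nonlinearity, 5 × 5 receptive field and 2 × 2 pooling window are illustrative assumptions, not from the slides):

```python
import numpy as np

def conv_valid(x, w, b, g=np.tanh):
    """'Valid' convolution of a d x d layer with an r x r receptive field:
    gamma[i, j] = g(sum_{m,n} x[i+m, j+n] * w[m, n] + b), shared weights w."""
    d, r = x.shape[0], w.shape[0]
    gamma = np.empty((d - r + 1, d - r + 1))
    for i in range(d - r + 1):
        for j in range(d - r + 1):
            gamma[i, j] = np.sum(x[i:i + r, j:j + r] * w) + b
    return g(gamma)

def max_pool(gamma, p=2):
    """Non-overlapping p x p max pooling, reducing dimensionality and
    adding a degree of translational invariance."""
    d = gamma.shape[0] // p
    return gamma[:d * p, :d * p].reshape(d, p, d, p).max(axis=(1, 3))

x = np.random.rand(28, 28)           # e.g. one 28 x 28 input image
w, b = np.random.randn(5, 5), 0.0    # one feature map's shared weights and bias
features = max_pool(conv_valid(x, w, b))
```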
deep belief networks (dbns)
∙ Probabilistic generative models, contrasting with the discriminative nature of other NNs
∙ Generative models provide a joint probability distribution over data and labels
∙ Unsupervised greedy layer-wise pre-training followed by final fine-tuning
(figure: a DBN for a 28 × 28 pixel image, built as a stack of RBM layers of alternating visible and hidden units, with label units attached to the top-level units and a detection layer)²
² based on I Arel, D Rose & T Karnowski, Deep Machine Learning—A New Frontier in Artificial Intelligence Research, IEEE, CIM, 2010
autoencoders (aes)
∙ The auto-encoder has two components:
∙ the encoder f (mapping x to h) and
∙ the decoder g (mapping h to r)
∙ An auto-encoder is a neural network that tries to reconstruct its input at its output
(figure: the encoder f maps the input x to the code h; the decoder g maps h to the reconstruction r)³
³ based on Y Bengio, I Goodfellow and A Courville, Deep Learning, An MIT Press book (in preparation), www.iro.umontreal.ca/~bengioy/dbook
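A minimal sketch of the two components in NumPy (the layer sizes, tied weights W/Wᵀ and sigmoid nonlinearity are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, size=(784, 100))  # 784 inputs -> 100 hidden (assumed sizes)
b, c = np.zeros(100), np.zeros(784)      # code / reconstruction biases

def f(x):                 # encoder f: x -> h
    return sigmoid(x @ W + b)

def g(h):                 # decoder g: h -> r (tied weights)
    return sigmoid(h @ W.T + c)

x = rng.random(784)       # e.g. a flattened 28 x 28 image
r = g(f(x))               # reconstruction of the input
loss = np.mean((x - r) ** 2)   # reconstruction error to be minimized
```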
deep architectures versus shallow architectures
∙ Deep architectures can be exponentially more efficient than shallow architectures [Roux and Bengio, 2010].
∙ Functions that can be compactly represented by a Neural Network (NN) of depth d may require an exponential number of computational elements in a network of depth d − 1 [Bengio, 2009].
∙ Since the number of computational elements depends on the number of training samples available, using shallow architectures may result in models that generalize poorly [Bengio, 2009].
∙ As a result, deep architecture models tend to outperform shallow models such as SVMs [Larochelle et al., 2007].
Restricted Boltzmann Machines
Deep Belief Networks
restricted boltzmann machines
restricted boltzmann machines (rbms)
(figure: the bipartite RBM graph, with visible units v1 · · · vI and hidden units h1 · · · hJ, each layer with a bias unit; the upward pass acts as the encoder, the downward pass as the decoder)
restricted boltzmann machines (rbms)
∙ Unsupervised
∙ finds complex regularities in the training data
∙ Bipartite graph
∙ visible and hidden layers
∙ Binary stochastic units
∙ on/off with some probability
∙ One iteration:
∙ update the hidden units
∙ reconstruct the visible units
∙ Maximum likelihood of the training data
restricted boltzmann machines (rbms)
∙ Training goal: the most probable reproduction of the input
∙ unsupervised data
∙ find the latent factors of the data set
∙ Adjust the weights to maximize the probability of the input data
restricted boltzmann machines (rbms)
Given an observed state, the energy of the joint configuration of the visible and hidden units (v, h) is given by:

E(v, h) = − Σ_{i=1}^{I} c_i v_i − Σ_{j=1}^{J} b_j h_j − Σ_{j=1}^{J} Σ_{i=1}^{I} W_{ji} v_i h_j ,  (1)

where W is the matrix of weights, and b and c are the biases of the hidden and visible layers, respectively.
restricted boltzmann machines (rbms)
The Restricted Boltzmann Machine (RBM) assigns a probability to each configuration (v, h), using:

p(v, h) = e^{−E(v,h)} / Z ,  (2)

where Z is a normalization constant called the partition function, obtained by summing over all possible (v, h) configurations [Bengio, 2009, Hinton, 2010, Carreira-Perpiñán and Hinton, 2005]:

Z = Σ_{v,h} e^{−E(v,h)} .  (3)
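The energy and (unnormalized) probability are straightforward to compute; below is a toy-sized sketch in NumPy, where the sizes I = 4, J = 3 are assumptions and Z is enumerated brute-force (this is intractable for realistic models):

```python
import numpy as np
from itertools import product

def energy(v, h, W, b, c):
    """E(v, h) = -c.v - b.h - h.W.v, with W[j, i] linking h_j and v_i (Eq. 1)."""
    return -(c @ v) - (b @ h) - (h @ W @ v)

rng = np.random.default_rng(1)
I, J = 4, 3                                   # tiny model so Z is computable
W = rng.normal(0, 0.1, size=(J, I))
b, c = np.zeros(J), np.zeros(I)

configs = lambda n: [np.array(s, dtype=float) for s in product((0, 1), repeat=n)]
Z = sum(np.exp(-energy(v, h, W, b, c))        # Eq. (3)
        for v in configs(I) for h in configs(J))

v, h = configs(I)[5], configs(J)[2]           # one arbitrary configuration
p = np.exp(-energy(v, h, W, b, c)) / Z        # Eq. (2)
```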
restricted boltzmann machines (rbms)
Since there are no connections between any two units within the same layer, given a particular input configuration, v, all the hidden units are conditionally independent of each other and the probability of h given v becomes:

p(h | v) = Π_j p(h_j | v) ,  (4)

where

p(h_j = 1 | v) = σ(b_j + Σ_{i=1}^{I} v_i W_{ji}) .  (5)
restricted boltzmann machines (rbms)
Similarly, given a specific hidden state, h, the probability of v given h is obtained by (6):

p(v | h) = Π_i p(v_i | h) ,  (6)

where:

p(v_i = 1 | h) = σ(c_i + Σ_{j=1}^{J} h_j W_{ji}) .  (7)
restricted boltzmann machines (rbms)
Given a random training vector v, the state of a given hidden unit j is set to 1 with probability:

p(h_j = 1 | v) = σ(b_j + Σ_i v_i W_{ji})

Similarly:

p(v_i = 1 | h) = σ(c_i + Σ_j h_j W_{ji})

where σ(x) is the sigmoid squashing function 1 / (1 + e^{−x}).
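Continuing the toy sketch above (same W, b, c), the two conditionals can be sampled directly; the helper names are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_h(v, W, b, rng):
    """h_j = 1 with probability sigmoid(b_j + sum_i v_i W_ji); returns sample, probs."""
    p = sigmoid(b + W @ v)
    return (rng.random(p.shape) < p).astype(float), p

def sample_v(h, W, c, rng):
    """v_i = 1 with probability sigmoid(c_i + sum_j h_j W_ji); returns sample, probs."""
    p = sigmoid(c + W.T @ h)
    return (rng.random(p.shape) < p).astype(float), p
```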
restricted boltzmann machines (rbms)
The marginal probability assigned to a visible vector, v, is given by (8):

p(v) = Σ_h p(v, h) = (1/Z) Σ_h e^{−E(v,h)} .  (8)

Hence, given a specific training vector v, its probability can be raised by adjusting the weights and biases so as to lower the energy of that particular vector while raising the energy of all the others.
restricted boltzmann machines (rbms)
To this end, we can perform a stochastic gradient ascent procedure on the log-likelihood of the training data vectors, using (9):

∂ log p(v) / ∂θ = − Σ_h p(h | v) ∂E(v, h)/∂θ + Σ_{v,h} p(v, h) ∂E(v, h)/∂θ ,  (9)

where the first term is the positive phase and the second the negative phase.
training an rbm
The learning rule for performing stochastic steepest ascent in the log probability of the training data:

∂ log p(v) / ∂W_{ji} = ⟨v_i h_j⟩_0 − ⟨v_i h_j⟩_∞ ,  (10)

where ⟨·⟩_0 denotes expectations under the data distribution (p_0 = p(h | v)) and ⟨·⟩_∞ denotes expectations under the model distribution p_∞(v, h) = p(v, h) [Roux and Bengio, 2008].
mcmc using alternating gibbs sampling
(figure, built up over five slides: the chain starts at v(0) = x; h(0) is sampled from p(h | v(0)), yielding the statistics ⟨v_i h_j⟩_0; then v(1) is sampled from p(v | h(0)), h(1) from p(h | v(1)), and so on through v(2), h(2), · · · , up to v(∞), h(∞), which yields ⟨v_i h_j⟩_∞)

p(h_j = 1 | v) = σ(b_j + Σ_{i=1}^{I} v_i W_{ji})
p(v_i = 1 | h) = σ(c_i + Σ_{j=1}^{J} h_j W_{ji})
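A sketch of the alternating chain, reusing sample_h and sample_v from above (k is the number of full v/h updates):

```python
def gibbs_chain(x, W, b, c, k, rng):
    """Alternating Gibbs sampling starting from v(0) = x."""
    v = x
    h, ph = sample_h(v, W, b, rng)       # h(0) ~ p(h | v(0)), gives <v h>_0
    for _ in range(k):
        v, _ = sample_v(h, W, c, rng)    # v(n) ~ p(v | h(n-1))
        h, ph = sample_h(v, W, b, rng)   # h(n) ~ p(h | v(n))
    return v, h, ph                      # as k -> infinity, samples the model
```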
contrastive divergence algorithm
contrastive divergence (cd–k)
∙ Running the Gibbs chain to equilibrium is impractical; to solve this problem, Hinton proposed the Contrastive Divergence algorithm.
∙ CD–k replaces ⟨·⟩_∞ by ⟨·⟩_k for small values of k:

ΔW_{ji} = η(⟨v_i h_j⟩_0 − ⟨v_i h_j⟩_k)  (11)
contrastive divergence (cd–k)
∙ v(0) ← x
∙ Compute the binary (feature) states of the hidden units, h(0), using v(0)
∙ for n ← 1 to k
∙ Compute the “reconstruction” states of the visible units, v(n), using h(n−1)
∙ Compute the “reconstruction” states of the hidden units, h(n), using v(n)
∙ end for
∙ Update the weights and biases according to:

ΔW_{ji} = η(⟨v_i h_j⟩_0 − ⟨v_i h_j⟩_k)  (12)
Δb_j = η(⟨h_j⟩_0 − ⟨h_j⟩_k)  (13)
Δc_i = η(⟨v_i⟩_0 − ⟨v_i⟩_k)  (14)
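A sketch of one CD–k update for a single training vector, reusing sample_h and sample_v from above (following common practice, the statistics use hidden probabilities rather than binary samples; k ≥ 1 is assumed):

```python
import numpy as np

def cd_k_step(x, W, b, c, k, eta, rng):
    """One CD-k update: positive statistics from v(0), negative from v(k), h(k)."""
    v0 = x
    h0, ph0 = sample_h(v0, W, b, rng)         # h(0) from v(0)
    vk, hk, phk = v0, h0, ph0
    for _ in range(k):
        vk, _ = sample_v(hk, W, c, rng)       # "reconstruction" v(n)
        hk, phk = sample_h(vk, W, b, rng)     # "reconstruction" h(n)
    W += eta * (np.outer(ph0, v0) - np.outer(phk, vk))   # Eq. (12)
    b += eta * (ph0 - phk)                                # Eq. (13)
    c += eta * (v0 - vk)                                  # Eq. (14)
    return W, b, c
```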
deep belief networks (dbns)
deep belief networks (dbns)
(figure, built up over three slides: a DBN grown layer by layer; first x and h1, linked by p(h1 | x) upward and p(x | h1) downward; then h2 is added, with p(h2 | h1) and p(h1 | h2); then h3, with p(h3 | h2) and p(h2 | h3))
deep belief networks (dbns)
∙ Start with a training vector on the visible units
∙ Update all the hidden units in parallel
∙ Update all the visible units in parallel to get a “reconstruction”
∙ Update the hidden units again
pre-training and fine tuning
(figure: RBM pre-training followed by fine-tuning with BP. A stack of RBMs is trained greedily, data → 500 hidden units → 300 hidden units → 100 hidden units → 10 hidden units, each RBM trained on the layer below; the resulting DBN model is then fine-tuned with BP, updating the weights until, e.g., error < 0.001)
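A sketch of the greedy stage, reusing cd_k_step and sigmoid from above; the layer sizes mirror the figure, while the epochs, learning rate and initialization scale are assumptions:

```python
import numpy as np

def pretrain_dbn(data, layer_sizes=(500, 300, 100, 10), k=1, eta=0.1,
                 epochs=10, seed=0):
    """Train one RBM per layer with CD-k; each RBM's hidden probabilities
    become the 'data' for the next RBM in the stack."""
    rng = np.random.default_rng(seed)
    v_data, rbms = data, []
    n_in = data.shape[1]
    for n_hid in layer_sizes:
        W = rng.normal(0, 0.01, size=(n_hid, n_in))
        b, c = np.zeros(n_hid), np.zeros(n_in)
        for _ in range(epochs):
            for x in v_data:
                W, b, c = cd_k_step(x, W, b, c, k, eta, rng)
        rbms.append((W, b, c))
        v_data = sigmoid(v_data @ W.T + b)   # activities for the next layer
        n_in = n_hid
    return rbms                              # weights are then fine-tuned with BP
```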
practical considerations
weights initialization
deep belief networks (dbns) - adaptive learning rate

η_{ji} = u · η_{ji}^{(old)}  if (⟨v_i h_j⟩_0 − ⟨v_i h_j⟩_k)(⟨v_i h_j⟩_0^{(old)} − ⟨v_i h_j⟩_k^{(old)}) > 0
η_{ji} = d · η_{ji}^{(old)}  if (⟨v_i h_j⟩_0 − ⟨v_i h_j⟩_k)(⟨v_i h_j⟩_0^{(old)} − ⟨v_i h_j⟩_k^{(old)}) < 0

⁴ Lopes et al., Towards adaptive learning with improved convergence of DBNs on GPUs, Pattern Recognition, 2014
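A per-weight sketch of this rule (the growth/shrink factors u > 1 and d < 1 are assumptions; the slide does not fix their values):

```python
import numpy as np

def adapt_eta(eta, stat, stat_old, u=1.2, d=0.8):
    """Grow eta_ji when the current and previous gradient estimates
    <v h>_0 - <v h>_k agree in sign; shrink it when they disagree."""
    agree = np.sign(stat) == np.sign(stat_old)
    return np.where(agree, eta * u, eta * d)
```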
adaptive step size
(figure: three panels for α = 0.1, α = 0.4 and α = 0.7, each plotting reconstruction RMSE (0.10 to 0.45) over 1000 training epochs, comparing the adaptive step size with fixed rates γ = 0.1, 0.4 and 0.7. Average reconstruction error (RMSE).)
convergence results (α = 0.1)
(figure: training images and their reconstructions after 50, 100, 250, 500, 750 and 1000 epochs, comparing the adaptive step size with a fixed (optimized) learning rate η = 0.4)
deep models characteristics
∙ Biological plausibility
∙ DBNs are effective in a wide range of ML problems.
∙ Creating a Deep Belief Network (DBN) model is a time-consuming and computationally expensive task that involves training several Restricted Boltzmann Machines (RBMs), demanding considerable effort.
∙ The adaptive step-size procedure for tuning the learning rate has been incorporated into the learning model with excellent results.
∙ Graphics Processing Units (GPUs) can significantly reduce the convergence time of the data-intensive tasks in DBNs.
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127.
Carreira-Perpiñán, M. A. and Hinton, G. E. (2005). On contrastive divergence learning. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), pages 33–40.
Hinton, G. E. (2010). A practical guide to training restricted Boltzmann machines. Technical report, Department of Computer Science, University of Toronto.
Larochelle, H., Erhan, D., Courville, A., Bergstra, J., and Bengio, Y. (2007). An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning (ICML 2007), pages 473–480. ACM.
Roux, N. L. and Bengio, Y. (2008). Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 20(6):1631–1649.
Roux, N. L. and Bengio, Y. (2010). Deep belief networks are compact universal approximators. Neural Computation, 22(8):2192–2207.
Questions?