Introduction to Deep Neural Network
Liwei Ren, Ph.D
San Jose, California, November 2016
Agenda
• What a DNN is
• How a DNN works
• Why a DNN works
• Those DNNs in action
• Where the challenges are
• Successful stories
• Security problems
• Summary
• Quiz
• What else
What is a DNN?
• DNN and AI in the secular world
What is a DNN?
• DNN in the technical world
What is a DNN?
• Categorizing the DNNs:
What is a DNN?
• Three technical elements
• Architecture: the graph, weights/biases, activation functions
• Activity Rule: weights/biases, activation functions
• Learning Rule: a typical one is backpropagation algorithm
• Three masters in this area:
What is a DNN?
• Given a practical problem, we have two approaches to solve it.
What is a DNN?
• An example: image recognition
What is a DNN?
• In the mathematical world
– A DNN is a mathematical function f: D → S, where D ⊆ R^n and S ⊆ R^m, which is constructed by a directed-graph-based architecture.
– A DNN is also a composition of functions from a network of primitive functions.
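To make the composition view concrete, here is a minimal sketch (not from the slides; all sizes and names are illustrative) of a two-layer network written as a composition f = f2 ∘ f1 of primitive per-layer functions:

import numpy as np

def phi(v):
    return 1.0 / (1.0 + np.exp(-v))  # sigmoid activation

def layer(W, b):
    # Each primitive function maps x to phi(W x + b).
    return lambda x: phi(W @ x + b)

rng = np.random.default_rng(0)
f1 = layer(rng.normal(size=(4, 3)), rng.normal(size=4))  # R^3 -> R^4
f2 = layer(rng.normal(size=(2, 4)), rng.normal(size=2))  # R^4 -> R^2

f = lambda x: f2(f1(x))  # the DNN function f: R^3 -> R^2
print(f(np.array([0.5, -1.0, 2.0])))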
What is a DNN?
• We denote a feed-forward DNN function by O = f(I), which is determined by a few parameters G, Φ, W, B
• Hyper-parameters:
– G is the directed graph which represents the structure
– Φ represents one or multiple activation functions for activating the nodes
• Parameters:
– W is the vector of weights relevant to the edges
– B is the vector of biases relevant to the nodes
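As a hedged illustration of this parameterization (layer widths standing in for the graph G, one chosen φ for Φ, and lists of matrices and vectors for W and B; all values illustrative), the feed-forward evaluation O = f(I) might look like:

import numpy as np

sizes = [3, 4, 2]   # hyper-parameter: layer widths stand in for the graph G
phi = np.tanh       # hyper-parameter: one activation function for all nodes

rng = np.random.default_rng(1)
W = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]  # weights on edges
B = [rng.normal(size=m) for m in sizes[1:]]                           # biases at nodes

def f(I):
    # Feed-forward DNN function O = f(I) determined by <G, phi, W, B>.
    a = I
    for Wk, bk in zip(W, B):
        a = phi(Wk @ a + bk)
    return a

print(f(np.array([1.0, 0.0, -1.0])))  # O in R^2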
What is a DNN?
• Activation at a node:
What is a DNN?
• Activation function:
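As a standard sketch (values illustrative): a node computes a = φ(∑ᵢ wᵢxᵢ + b), a weighted sum of its inputs plus a bias, passed through φ; common choices of φ include the sigmoid, tanh and ReLU:

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))  # bounded, monotonically increasing

def relu(v):
    return np.maximum(0.0, v)        # piecewise linear

x = np.array([0.2, -0.7, 1.5])  # inputs arriving at the node
w = np.array([0.1, 0.4, -0.3])  # weights on the incoming edges
b = 0.05                        # bias at the node
for phi in (sigmoid, np.tanh, relu):
    print(phi.__name__, phi(w @ x + b))  # activation a = phi(w.x + b)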
What is a DNN?
• G = (V, E) is a graph and Φ is a set of activation functions.
• <G, Φ> constructs a family of functions F:
– F(G, Φ) = { f | f is a function constructed by <G, Φ, W> where W ∈ R^N }
• N = the total number of weights at all nodes of the output layer and hidden layers.
• Each f(I) can be denoted by f(I, W).
What is a DNN?
• Mathematically, a DNN-based supervised machine learning technology can be described as follows:
– Given g ∈ { h | h: D → S where D ⊆ R^n and S ⊆ R^m } and δ > 0, find f ∈ F(G, Φ) such that ‖f − g‖ < δ.
• Essentially, it is to identify a W ∈ R^N such that ‖f(∗, W) − g‖ < δ
• However, in practice, g is not explicitly expressed. It usually appears in a sequence of samples:
– { <I(j), T(j)> | T(j) = g(I(j)), j = 1, 2, …, M }
• where I(j) is an input vector and T(j) is its corresponding target vector.
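For instance, with a hypothetical target g (purely illustrative; in practice the pairs come from labeled data rather than a formula), the sample sequence can be generated as follows:

import numpy as np

def g(I):
    # Hypothetical target function g: R^2 -> R^1, never given to us explicitly.
    return np.array([np.sin(I[0]) + 0.5 * I[1]])

rng = np.random.default_rng(2)
M = 100
samples = [(I, g(I)) for I in rng.uniform(-1.0, 1.0, size=(M, 2))]  # <I(j), T(j)>
I1, T1 = samples[0]
print(I1, T1)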
How Does a DNN Work?
• Since the function g is not explicitly expressed, we are not able to calculate ‖g − f(∗, W)‖
• Instead, we evaluate the error function E(W) = (1/2M) ∑_j ‖T(j) − f(I(j), W)‖²
• We expect to determine W such that E(W) < δ
• How to identify W ∈ R^N so that E(W) < δ? Let's solve the nonlinear optimization problem min{ E(W) | W ∈ R^N }, i.e.:
min{ (1/2M) ∑_j ‖T(j) − f(I(j), W)‖² | W ∈ R^N }   (P1)
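A minimal sketch of the batch-mode objective (P1), using a toy one-node "network" f(I, W) so that the block is self-contained (all values illustrative):

import numpy as np

def f(I, W):
    # Toy one-node network: f(I, W) = sigmoid(W[0]*I[0] + W[1]*I[1] + W[2]).
    return np.array([1.0 / (1.0 + np.exp(-(W[0] * I[0] + W[1] * I[1] + W[2])))])

def E(W, samples):
    # E(W) = (1/2M) * sum_j ||T(j) - f(I(j), W)||^2, the (P1) objective.
    M = len(samples)
    return sum(np.sum((T - f(I, W)) ** 2) for I, T in samples) / (2.0 * M)

rng = np.random.default_rng(3)
samples = [(I, np.array([0.5])) for I in rng.uniform(-1.0, 1.0, size=(10, 2))]
print(E(np.zeros(3), samples))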
How Does a DNN Work?
• (P1) is for batch-mode training; however, it is too expensive.
• In order to reduce the computational cost, a sequential mode is introduced.
• Picking <I, T> ∈ { <I(1), T(1)>, <I(2), T(2)>, …, <I(M), T(M)> } sequentially, let the output of the network be O = f(I, W) for any W:
• Error function E(W) = ‖T − f(I, W)‖²/2 = ∑_j (T_j − O_j)²/2
• Each O_j can be considered as a function of W. We denote it as O_j(W).
• We have the optimization problem for training with sequential mode:
– min{ ∑_j (T_j − O_j(W))²/2 | W ∈ R^N }   (P2)
How Does a DNN Work?
• One may ask whether we get the same solution for both batch mode and sequential mode.
• BTW
– batch mode = offline mode
– sequential mode = online mode
• We focus on online mode in this talk
How Does a DNN Work?
• How to solve the unconstrained nonlinear optimization problem (P2)?
• The general approach of unconstrained nonlinear optimization is to find local minima of E(W) by using the iterative process of Gradient Descent.
• ∂E = (∂E/∂W1, ∂E/∂W2, …, ∂E/∂WN)
• The iterations:
– ΔWj = −γ ∂E/∂Wj for j = 1, …, N
– Updating W in each step by
• Wj^(k+1) = Wj^(k) − γ ∂E(W^(k))/∂Wj for j = 1, …, N   (A1)
• until E(W^(k+1)) < δ or E(W^(k+1)) cannot be reduced any more
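A minimal sketch of iteration (A1), with a numerical gradient standing in for backpropagation (which the next slides derive); γ, δ and the stopping rule are illustrative:

import numpy as np

def gradient_descent(E, W, gamma=0.1, delta=1e-6, max_iter=10000, h=1e-6):
    # Iterate W_j <- W_j - gamma * dE/dW_j until E(W) < delta or no progress.
    for _ in range(max_iter):
        grad = np.zeros_like(W)
        for j in range(len(W)):  # central-difference estimate of dE/dW_j
            e = np.zeros_like(W)
            e[j] = h
            grad[j] = (E(W + e) - E(W - e)) / (2 * h)
        W_next = W - gamma * grad
        if E(W_next) < delta or E(W_next) >= E(W):  # converged or stuck
            return W_next
        W = W_next
    return W

E = lambda W: 0.5 * ((W[0] - 1.0) ** 2 + (W[1] + 2.0) ** 2)  # toy error surface
print(gradient_descent(E, np.array([0.0, 0.0])))  # approaches (1, -2)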
How Does a DNN Work?
• The algorithm of Gradient Descent:
How Does a DNN Work?
• From the perspective of mathematics, the process of Gradient Descent is straightforward.
• However, from the perspective of scientific computing, it is quite challenging to calculate the values of all ∂E/∂Wj for j = 1, …, N:
– The complexity of expressing each ∂E/∂Wj where j = 1, …, N.
– There are (k+1) layers of function compositions for a DNN of k hidden layers.
How Does a DNN Work?
• For example, we have a very simple network as follows with the activation function φ(v) = 1/(1 + e^(−v)).
• E(W) = [T − f(I, W)]²/2 = [T − φ(w1 φ(w3 I + w2) + w0)]²/2, so we have:
– ∂E/∂w0 = −[T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0)
– ∂E/∂w1 = −[T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) φ(w3 I + w2)
– ∂E/∂w2 = −w1 [T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) φ′(w3 I + w2)
– ∂E/∂w3 = −I w1 [T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) φ′(w3 I + w2)
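These formulas can be checked mechanically; the sketch below (illustrative values) evaluates ∂E/∂w0 … ∂E/∂w3 from the formulas above and compares each against a numerical difference:

import numpy as np

phi = lambda v: 1.0 / (1.0 + np.exp(-v))
dphi = lambda v: phi(v) * (1.0 - phi(v))  # phi'(v) for the sigmoid

I, T = 0.7, 0.3                       # illustrative input and target
w = np.array([0.1, 0.2, -0.3, 0.4])   # w0, w1, w2, w3

def E(w):
    w0, w1, w2, w3 = w
    return (T - phi(w1 * phi(w3 * I + w2) + w0)) ** 2 / 2

w0, w1, w2, w3 = w
H = phi(w3 * I + w2)
net = w1 * H + w0
err = T - phi(net)
grad = np.array([-err * dphi(net),                               # dE/dw0
                 -err * dphi(net) * H,                           # dE/dw1
                 -err * dphi(net) * dphi(w3 * I + w2) * w1,      # dE/dw2
                 -err * dphi(net) * dphi(w3 * I + w2) * w1 * I]) # dE/dw3

h = 1e-6
for j in range(4):  # numerical check of each partial derivative
    e = np.zeros(4)
    e[j] = h
    print(grad[j], (E(w + e) - E(w - e)) / (2 * h))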
How Does a DNN Work?
• Let's imagine a network of N inputs, M outputs and K hidden layers, each of which has L nodes.
– It is a daunting task to express each ∂E/∂wj explicitly. The last simple example already shows this.
• The backpropagation (BP) algorithm was proposed as a rescue:
– Main idea: the weights of the (k−1)-th hidden layer can be expressed by those of the k-th layer recursively.
– We can start with the output layer, which is considered as the (K+1)-th layer.
How Does a DNN Work?
• BP algorithm has the following major steps:
1. Feed-forward computation
2. Back-propagation to the output layer
3. Back-propagation to the hidden layers
4. Weight updates
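As a hedged sketch of how the four steps map onto code (a fully-connected layered FNN with sigmoid activations, online mode, one sample per update; all sizes illustrative):

import numpy as np

phi = lambda v: 1.0 / (1.0 + np.exp(-v))

def bp_step(W, B, I, T, gamma=0.5):
    # 1. Feed-forward computation: keep every layer's output H[k].
    H = [I]
    for Wk, bk in zip(W, B):
        H.append(phi(Wk @ H[-1] + bk))
    # 2. Back-propagation to the output layer; for the sigmoid, phi'(O) = O(1-O).
    delta = H[-1] * (1.0 - H[-1]) * (T - H[-1])
    deltas = [delta]
    # 3. Back-propagation to the hidden layers, from the last one down.
    for Wk, Hk in zip(W[:0:-1], H[-2:0:-1]):
        delta = Hk * (1.0 - Hk) * (Wk.T @ delta)
        deltas.append(delta)
    deltas.reverse()
    # 4. Weight updates: w+ = w + gamma * delta * (input of that layer).
    for Wk, bk, dk, Hk in zip(W, B, deltas, H[:-1]):
        Wk += gamma * np.outer(dk, Hk)
        bk += gamma * dk
    return W, B

rng = np.random.default_rng(4)
W = [rng.normal(size=(4, 3)), rng.normal(size=(1, 4))]  # 3 inputs, 4 hidden, 1 output
B = [np.zeros(4), np.zeros(1)]
W, B = bp_step(W, B, I=np.array([0.5, -0.2, 0.8]), T=np.array([1.0]))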
How Does a DNN Work?
• A general DNN can be drawn as follows:
How Does a DNN Work?
• How to express the weights of the (k−1)-th hidden layer by the weights of the k-th layer recursively?
How Does a DNN Work?
• Let us experience BP with our small network.
– E(W) = [T − f(I, W)]²/2 = [T − φ(w1 φ(w3 I + w2) + w0)]²/2.
• ∂E/∂w0 = −φ′(O)(T − O)
• ∂E/∂w1 = −φ′(O)(T − O) φ(w3 I + w2)
• ∂E/∂w2 = −φ′(O)(T − O) φ′(H) w1 · 1
• ∂E/∂w3 = −φ′(O)(T − O) φ′(H) w1 · I
– Let H0^(1) = 1, H1^(1) = H = φ(w3 I + w2), H1^(0) = I; we verify the following:
• δ1^(2) = φ′(O)(T − O)
• w0⁺ = w0 + γ δ1^(2) H0^(1), w1⁺ = w1 + γ δ1^(2) H1^(1)
• δ1^(1) = φ′(H1^(1)) δ1^(2) w1
• w2⁺ = w2 + γ δ1^(1) H0^(0), w3⁺ = w3 + γ δ1^(1) H1^(0)
• where w0 = w0,1^(2), w1 = w1,1^(2), w2 = w0,1^(1), w3 = w1,1^(1)
Why Does a DNN Work?
• It is amazing! However, why does it work?
• For a FNN, it is to ask whether the following approximation problem has a solution:
– Given g ∈ { h | h: D → S where D ⊆ R^n and S ⊆ R^m } and δ > 0, find a W ∈ R^N such that ‖f(∗, W) − g‖ < δ.
• Universal approximation theorem (S):
– Let φ(·) be a bounded and monotonically-increasing continuous function. Let I_m denote the m-dimensional unit hypercube [0,1]^m. The space of continuous functions on I_m is denoted by C(I_m). Then, given any function f ∈ C(I_m) and ε > 0, there exist an integer N, real constants v_i, b_i ∈ R and real vectors w_i ∈ R^m, where i = 1, …, N, such that
|F(x) − f(x)| < ε
for all x ∈ I_m, where F(x) = ∑_{i=1}^{N} v_i φ(w_i^T x + b_i) is an approximation to the function f which is independent of φ.
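A hedged illustration of the theorem's form F(x) = ∑ᵢ vᵢ φ(wᵢᵀx + bᵢ): with hand-picked constants (illustrative, not a trained network), two steep sigmoids already build a "bump" on [0,1], the basic piece from which continuous functions can be approximated:

import numpy as np

phi = lambda v: 1.0 / (1.0 + np.exp(-v))

# F(x) = sum_i v_i * phi(w_i * x + b_i) with N = 2: a step up near x = 0.4
# minus a step up near x = 0.6 yields a bump that is ~1 on [0.4, 0.6].
v = np.array([1.0, -1.0])
w = np.array([100.0, 100.0])
b = np.array([-40.0, -60.0])

F = lambda x: float(np.sum(v * phi(w * x + b)))
for x in (0.2, 0.5, 0.8):
    print(x, round(F(x), 3))  # ~0, ~1, ~0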
Why Does a DNN Work?
• Its corresponding network, with only one hidden layer:
– NOTE: this is not even the general case for one hidden layer. It is a special case. WHY?
– However, it is powerful and encouraging from the mathematical perspective.
Why Does a DNN Work?
The general networks have a corresponding general version of the Universal Approximation Theorem:
Why Does a DNN Work?
• Universal approximation theorem (G):
– Let φ(·) be a bounded and monotonically-increasing continuous function. Let S be a compact space in R^m. Let C(S) = { g | g: S ⊂ R^m → R^n is continuous }. Then, given any function f ∈ C(S) and ε > 0, there exists an FNN as shown above which constructs the network function F such that
‖F(x) − f(x)‖ < ε
where F is an approximation to the function f which is independent of φ.
• It seems both shallow and deep neural networks can construct an approximation to a given function.
– Which is better?
– Or which is more efficient in terms of using fewer nodes?
Why Does a DNN Work?
• Mathematical foundation of neural networks:
Those DNNs in action
• DNN has three elements
• Architecture: the graph, weights/biases, activation functions
• Activity Rule: weights/biases, activation functions
• Learning Rule: a typical one is backpropagation algorithm
• The architecture basically determines the capability of a specific DNN
– Different architectures are suitable for different applications.
– The most general architecture of an ANN is a DAG (directed acyclic graph).
Those DNNs in action
• There are a few well-known categories of DNNs.
What Are the Challenges?
• Given a specific problem, there are a few questions
before one starts the journey with DNNs:
– Do you understand the problem that you need to solve?
– Do you really want to solve this problem with DNN, why?
• Do you have an alternative yet effective solution?
– Do you know how to describe the problem in DNN terms mathematically?
– Do you know how to implement a DNN, beyond a few APIs and
sizzling hype?
– How to collect sufficient data for training?
– How to solve the problem efficiently and cost-effectively?
What Are the Challenges?
• 1st Challenge:
– A full mesh network suffers from the curse of dimensionality.
What Are the Challenges?
• Many tasks of an FNN do not need a full mesh network.
• For example, if we can present the input vector as a grid, nearest-neighborhood models can be used when constructing an effective FNN, which greatly reduces the connections (see the sketch after the examples below):
– Image recognition
– GO (圍棋) : a game that two players play on a 19x19 grid of lines.
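A back-of-the-envelope sketch (sizes illustrative) of why grid-structured inputs help: wiring each hidden node to a small neighborhood of the grid, instead of to every input, cuts the weight count dramatically:

# Illustrative weight counts for one hidden layer over a 19x19 grid input
# (e.g., a GO board): full mesh vs. nearest-neighborhood connections.
n = 19 * 19             # input nodes on the grid
hidden = 19 * 19        # one hidden node per grid position
k = 3 * 3               # each hidden node sees only a 3x3 neighborhood

full_mesh = n * hidden  # every input connected to every hidden node
local = hidden * k      # nearest-neighborhood wiring only
shared = k              # CNN-style weight sharing: one 3x3 kernel reused

print(full_mesh, local, shared)  # 130321 vs 3249 vs 9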
What Are the Challenges?
• The 2nd challenge is how to describe a technical problem in terms of a DNN, i.e., mathematical modeling. There are generally two approaches:
– Applying a well-studied DNN architecture to describe the problem. Deep understanding of the specific network is usually required!
• Two general DNN architectures are well-known:
– FNN: feedforward neural network. Its special architecture CNN (convolutional neural network) is widely used in many applications such as image recognition and GO.
– RNN: recurrent neural network. Its special architecture LSTM (long short-term memory) has been applied successfully in speech recognition, language translation, etc.
• For example, if we want to try an FNN, how do we describe the problem in terms of <input vector, output vector> with fixed dimensions?
– Creating a novel DNN architecture from the ground up if none of the existing models fits your problem. Deep understanding of DNN theory and algorithms is required.
What Are the Challenges?
• Handwritten digit recognition:
– Modeling this problem is straightforward (a sketch follows):
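A hedged sketch of the modeling (MNIST-style sizes assumed, values illustrative): flatten the grayscale image into the input vector I and one-hot encode the digit into the target vector T, giving a fixed <input vector, output vector> pair:

import numpy as np

# A 28x28 grayscale digit image becomes <I, T> with fixed dimensions:
# I in R^784 (flattened pixels scaled to [0,1]), T in R^10 (one-hot digit).
image = np.random.default_rng(5).integers(0, 256, size=(28, 28))  # stand-in pixels
label = 7

I = image.reshape(-1) / 255.0  # input vector, dimension 784
T = np.zeros(10)
T[label] = 1.0                 # target vector, dimension 10
print(I.shape, T)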
What Are the Challenges?
• Image Recognition is also straightforward
What Are the Challenges?
• However, due to the curse of dimensionality, we can use a special FNN:
– Convolutional neural network (CNN)
What Are the Challenges?
• How to construct a DNN to describe language translation?
– LSTM networks are used
• How to construct a DNN to describe the problem of malware classification?
• How to construct a DNN to describe network traffic for security purposes?
What Are the Challenges?
• The 3rd challenge is how to collect sufficient training data. To achieve the required accuracy, sufficient training data is necessary. WHY?
What Are the Challenges?
• The 4th challenge is how to identify various talents for providing a DNN solution to specific problems:
– Who knows how to use existing DL APIs such as TensorFlow?
– Who understands various DNN architectures in depth, so that he/she knows how to evaluate and identify a suitable DNN architecture to solve the problem?
– Who understands the theory and algorithms of DNNs in depth, so that he/she can create and design a novel DNN from the ground up?
Successful Stories
• ImageNet: 1M+ images, 1000+ categories, CNN
Successful Stories
• Unsupervised learning neural networks… YouTube and the Cat.
Successful Stories
• AlphaGo, a significant milestone in AI history
– More significant than Deep Blue
• Both Policy Network and Value Network are CNNs.
Successful Stories
• Google Neural Machine Translation… LSTM (Long Short-Term Memory) network
Successful Stories
• Microsoft Speech Recognition… LSTM and TDNN (Time Delay Neural Networks)
Security Problems
• Not disclosed for the public version.
Summary
• What a DNN is
• How a DNN works
• Why a DNN works
• The categories of DNNs
• Some challenges
• Well-known stories
• Security problems
Quiz
• Why do we choose the activation function to be a nonlinear function?
• Why deep? Why are deep networks better than shallow networks?
• What is the difference between online and batch
mode training?
• Will online and batch mode training converge to the
same solution?
• Why do we need the backpropagation algorithm?
• Why do we apply convolutional neural networks to image recognition?
Quiz
• If we solve a problem with an FNN,
– how many layers deep should we go?
– How many nodes are good for each layer?
– How to estimate and optimize the cost?
• Is it guaranteed that the backpropagation algorithm converges to a solution?
• Why do we need sufficient data for training in order to achieve
certain accuracy?
• Can a DNN do tasks beyond extending human capabilities or automating extensive manual tasks?
– To prove a mathematical theorem ... or to introduce an interesting
concept… or to appreciate a poem… or to love…
Quiz
• AlphaGo is trained for a 19x19 lattice. If we play the GO game on a 20x20 board, can AlphaGo handle it?
• ImageNet is trained for 1000 categories. If we add a 1001st category, what should we do?
• People do consider a specific DNN as a black box. Why?
• More questions from you …
What Else?
• What to share next from me? Why do you care?
– Various DNNs: principles, examples, analysis and experiments…
• ImageNet, AlphaGo, GNMT, etc.
– My Ph.D work and its relevance to DNN
– A Little History of AI and Artificial Neural Networks
– Various Schools of the AI Discipline
– Strong AI vs. Weak AI
What Else?
• What to share next from me? Why do you care?
– Questions when thinking about AI:
• Are we able to understand how we learn?
• Are we going in the right directions mathematically and scientifically?
• Are there simple principles for cognition like what Newton and Einstein
established for understanding our universe?
• What do we lack between now and the coming of so-called Strong AI?
What Else?
• What to share next from me? Why do you care?
• Questions about who we are.
– Are we created?
– Are we the AI of the creator?
• My little theory about the Universe
62
Ad

More Related Content

What's hot (20)

Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
Jörgen Sandig
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
Massimiliano Ruocco
 
Multilayer perceptron
Multilayer perceptronMultilayer perceptron
Multilayer perceptron
omaraldabash
 
Artificial nueral network slideshare
Artificial nueral network slideshareArtificial nueral network slideshare
Artificial nueral network slideshare
Red Innovators
 
Neural networks.ppt
Neural networks.pptNeural networks.ppt
Neural networks.ppt
SrinivashR3
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
Yan Xu
 
Deep learning
Deep learningDeep learning
Deep learning
Rostom Mamadji
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Preferred Networks
 
Intro to Deep learning - Autoencoders
Intro to Deep learning - Autoencoders Intro to Deep learning - Autoencoders
Intro to Deep learning - Autoencoders
Akash Goel
 
Machine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and TechniquesMachine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and Techniques
Rui Pedro Paiva
 
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Indraneel Pole
 
Presentation Fuzzy
Presentation FuzzyPresentation Fuzzy
Presentation Fuzzy
Ali
 
Multi tasking learning
Multi tasking learningMulti tasking learning
Multi tasking learning
ShreyusPuthiyapurail
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
Alessio Tonioni
 
Poisoning attacks on Federated Learning based IoT Intrusion Detection System
Poisoning attacks on Federated Learning based IoT Intrusion Detection SystemPoisoning attacks on Federated Learning based IoT Intrusion Detection System
Poisoning attacks on Federated Learning based IoT Intrusion Detection System
Sai Kiran Kadam
 
Notes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew NgNotes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew Ng
dataHacker. rs
 
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
Edureka!
 
Optics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureOptics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structure
Rajesh Piryani
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
David Voyles
 
Useful Techniques in Artificial Intelligence
Useful Techniques in Artificial IntelligenceUseful Techniques in Artificial Intelligence
Useful Techniques in Artificial Intelligence
Ila Group
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
Jörgen Sandig
 
Multilayer perceptron
Multilayer perceptronMultilayer perceptron
Multilayer perceptron
omaraldabash
 
Artificial nueral network slideshare
Artificial nueral network slideshareArtificial nueral network slideshare
Artificial nueral network slideshare
Red Innovators
 
Neural networks.ppt
Neural networks.pptNeural networks.ppt
Neural networks.ppt
SrinivashR3
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
Yan Xu
 
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Is...
Preferred Networks
 
Intro to Deep learning - Autoencoders
Intro to Deep learning - Autoencoders Intro to Deep learning - Autoencoders
Intro to Deep learning - Autoencoders
Akash Goel
 
Machine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and TechniquesMachine Learning: Applications, Process and Techniques
Machine Learning: Applications, Process and Techniques
Rui Pedro Paiva
 
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Restricted Boltzmann Machine - A comprehensive study with a focus on Deep Bel...
Indraneel Pole
 
Presentation Fuzzy
Presentation FuzzyPresentation Fuzzy
Presentation Fuzzy
Ali
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
Alessio Tonioni
 
Poisoning attacks on Federated Learning based IoT Intrusion Detection System
Poisoning attacks on Federated Learning based IoT Intrusion Detection SystemPoisoning attacks on Federated Learning based IoT Intrusion Detection System
Poisoning attacks on Federated Learning based IoT Intrusion Detection System
Sai Kiran Kadam
 
Notes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew NgNotes from Coursera Deep Learning courses by Andrew Ng
Notes from Coursera Deep Learning courses by Andrew Ng
dataHacker. rs
 
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
What is Deep Learning | Deep Learning Simplified | Deep Learning Tutorial | E...
Edureka!
 
Optics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structureOptics ordering points to identify the clustering structure
Optics ordering points to identify the clustering structure
Rajesh Piryani
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
David Voyles
 
Useful Techniques in Artificial Intelligence
Useful Techniques in Artificial IntelligenceUseful Techniques in Artificial Intelligence
Useful Techniques in Artificial Intelligence
Ila Group
 

Viewers also liked (20)

企业安全市场综述
企业安全市场综述 企业安全市场综述
企业安全市场综述
Liwei Ren任力偉
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
Ahmed_hashmi
 
neural network
neural networkneural network
neural network
STUDENT
 
Introduction Of Artificial neural network
Introduction Of Artificial neural networkIntroduction Of Artificial neural network
Introduction Of Artificial neural network
Nagarajan
 
Artificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKSArtificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKS
REHMAT ULLAH
 
聊一聊大明朝的火器
聊一聊大明朝的火器聊一聊大明朝的火器
聊一聊大明朝的火器
Liwei Ren任力偉
 
Artificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainArtificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google Brain
Rawan Al-Omari
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networks
stellajoseph
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
DEEPASHRI HK
 
硅谷的那点事儿
硅谷的那点事儿硅谷的那点事儿
硅谷的那点事儿
Liwei Ren任力偉
 
From Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptxFrom Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptx
Chun-Hao Chang
 
Introduction to Artificial Neural Network
Introduction to Artificial Neural Network Introduction to Artificial Neural Network
Introduction to Artificial Neural Network
Qingkai Kong
 
Neural network 20161210_jintaekseo
Neural network 20161210_jintaekseoNeural network 20161210_jintaekseo
Neural network 20161210_jintaekseo
JinTaek Seo
 
Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측
Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측
Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측
FAST CAMPUS
 
Árboles de Decisión en Weka
Árboles de Decisión en WekaÁrboles de Decisión en Weka
Árboles de Decisión en Weka
Lorena Quiñónez
 
Neural Networks and Deep Learning
Neural Networks and Deep LearningNeural Networks and Deep Learning
Neural Networks and Deep Learning
Anastasiia Kornilova
 
Perceptron Simple y Regla Aprendizaje
Perceptron  Simple y  Regla  AprendizajePerceptron  Simple y  Regla  Aprendizaje
Perceptron Simple y Regla Aprendizaje
Roberth Figueroa-Diaz
 
Neural Network as a function
Neural Network as a functionNeural Network as a function
Neural Network as a function
Taisuke Oe
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
Gilles Louppe
 
Neural network
Neural networkNeural network
Neural network
KRISH na TimeTraveller
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
Ahmed_hashmi
 
neural network
neural networkneural network
neural network
STUDENT
 
Introduction Of Artificial neural network
Introduction Of Artificial neural networkIntroduction Of Artificial neural network
Introduction Of Artificial neural network
Nagarajan
 
Artificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKSArtificial intelligence NEURAL NETWORKS
Artificial intelligence NEURAL NETWORKS
REHMAT ULLAH
 
Artificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainArtificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google Brain
Rawan Al-Omari
 
Artificial neural networks
Artificial neural networksArtificial neural networks
Artificial neural networks
stellajoseph
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
DEEPASHRI HK
 
From Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptxFrom Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptx
Chun-Hao Chang
 
Introduction to Artificial Neural Network
Introduction to Artificial Neural Network Introduction to Artificial Neural Network
Introduction to Artificial Neural Network
Qingkai Kong
 
Neural network 20161210_jintaekseo
Neural network 20161210_jintaekseoNeural network 20161210_jintaekseo
Neural network 20161210_jintaekseo
JinTaek Seo
 
Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측
Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측
Boosted decision tree를 활용한 lending club의 채무자 원리금 상환 여부 예측
FAST CAMPUS
 
Árboles de Decisión en Weka
Árboles de Decisión en WekaÁrboles de Decisión en Weka
Árboles de Decisión en Weka
Lorena Quiñónez
 
Perceptron Simple y Regla Aprendizaje
Perceptron  Simple y  Regla  AprendizajePerceptron  Simple y  Regla  Aprendizaje
Perceptron Simple y Regla Aprendizaje
Roberth Figueroa-Diaz
 
Neural Network as a function
Neural Network as a functionNeural Network as a function
Neural Network as a function
Taisuke Oe
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
Gilles Louppe
 
Ad

Similar to Introduction to Deep Neural Network (20)

UofT_ML_lecture.pptx
UofT_ML_lecture.pptxUofT_ML_lecture.pptx
UofT_ML_lecture.pptx
abcdefghijklmn19
 
Introduction to Neural Netwoks
Introduction to Neural Netwoks Introduction to Neural Netwoks
Introduction to Neural Netwoks
Abdallah Bashir
 
Learning Deep Learning
Learning Deep LearningLearning Deep Learning
Learning Deep Learning
simaokasonse
 
Lecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & AlgorithmsLecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & Algorithms
haseebanjum2611
 
How to calculate complexity in Data Structure
How to calculate complexity in Data StructureHow to calculate complexity in Data Structure
How to calculate complexity in Data Structure
debasisdas225831
 
Time complexity.ppt
Time complexity.pptTime complexity.ppt
Time complexity.ppt
YekoyeTigabuYeko
 
Time complexity.pptr56435 erfgegr t 45t 35
Time complexity.pptr56435 erfgegr t 45t 35Time complexity.pptr56435 erfgegr t 45t 35
Time complexity.pptr56435 erfgegr t 45t 35
DickyNsjg1
 
how to calclute time complexity of algortihm
how to calclute time complexity of algortihmhow to calclute time complexity of algortihm
how to calclute time complexity of algortihm
Sajid Marwat
 
Find all hazards in this circuit. Redesign the circuit as a three-le.pdf
Find all hazards in this circuit.  Redesign the circuit as a three-le.pdfFind all hazards in this circuit.  Redesign the circuit as a three-le.pdf
Find all hazards in this circuit. Redesign the circuit as a three-le.pdf
Arrowdeepak
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdf
abay golla
 
Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4
Fabian Pedregosa
 
Digital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systemsDigital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systems
Chandrashekhar Padole
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
台灣資料科學年會
 
Scala and Deep Learning
Scala and Deep LearningScala and Deep Learning
Scala and Deep Learning
Oswald Campesato
 
RNN and sequence-to-sequence processing
RNN and sequence-to-sequence processingRNN and sequence-to-sequence processing
RNN and sequence-to-sequence processing
Dongang (Sean) Wang
 
Activation function
Activation functionActivation function
Activation function
RakshithGowdakodihal
 
Mm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsMm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithms
Eellekwameowusu
 
Cdc18 dg lee
Cdc18 dg leeCdc18 dg lee
Cdc18 dg lee
whatthehellisit
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdf
nyomans1
 
Recursion in Java
Recursion in JavaRecursion in Java
Recursion in Java
Fulvio Corno
 
Introduction to Neural Netwoks
Introduction to Neural Netwoks Introduction to Neural Netwoks
Introduction to Neural Netwoks
Abdallah Bashir
 
Learning Deep Learning
Learning Deep LearningLearning Deep Learning
Learning Deep Learning
simaokasonse
 
Lecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & AlgorithmsLecture 1 and 2 of Data Structures & Algorithms
Lecture 1 and 2 of Data Structures & Algorithms
haseebanjum2611
 
How to calculate complexity in Data Structure
How to calculate complexity in Data StructureHow to calculate complexity in Data Structure
How to calculate complexity in Data Structure
debasisdas225831
 
Time complexity.pptr56435 erfgegr t 45t 35
Time complexity.pptr56435 erfgegr t 45t 35Time complexity.pptr56435 erfgegr t 45t 35
Time complexity.pptr56435 erfgegr t 45t 35
DickyNsjg1
 
how to calclute time complexity of algortihm
how to calclute time complexity of algortihmhow to calclute time complexity of algortihm
how to calclute time complexity of algortihm
Sajid Marwat
 
Find all hazards in this circuit. Redesign the circuit as a three-le.pdf
Find all hazards in this circuit.  Redesign the circuit as a three-le.pdfFind all hazards in this circuit.  Redesign the circuit as a three-le.pdf
Find all hazards in this circuit. Redesign the circuit as a three-le.pdf
Arrowdeepak
 
Chapter One.pdf
Chapter One.pdfChapter One.pdf
Chapter One.pdf
abay golla
 
Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4
Fabian Pedregosa
 
Digital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systemsDigital Signal Processing Tutorial:Chapt 1 signal and systems
Digital Signal Processing Tutorial:Chapt 1 signal and systems
Chandrashekhar Padole
 
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
[DSC 2016] 系列活動:李宏毅 / 一天搞懂深度學習
台灣資料科學年會
 
RNN and sequence-to-sequence processing
RNN and sequence-to-sequence processingRNN and sequence-to-sequence processing
RNN and sequence-to-sequence processing
Dongang (Sean) Wang
 
Mm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsMm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithms
Eellekwameowusu
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdf
nyomans1
 
Ad

More from Liwei Ren任力偉 (20)

信息安全领域里的创新和机遇
信息安全领域里的创新和机遇信息安全领域里的创新和机遇
信息安全领域里的创新和机遇
Liwei Ren任力偉
 
防火牆們的故事
防火牆們的故事防火牆們的故事
防火牆們的故事
Liwei Ren任力偉
 
移动互联网时代下创新的思维
移动互联网时代下创新的思维移动互联网时代下创新的思维
移动互联网时代下创新的思维
Liwei Ren任力偉
 
非齐次特征值问题解存在性研究
非齐次特征值问题解存在性研究非齐次特征值问题解存在性研究
非齐次特征值问题解存在性研究
Liwei Ren任力偉
 
世纪猜想
世纪猜想世纪猜想
世纪猜想
Liwei Ren任力偉
 
Arm the World with SPN based Security
Arm the World with SPN based SecurityArm the World with SPN based Security
Arm the World with SPN based Security
Liwei Ren任力偉
 
Extending Boyer-Moore Algorithm to an Abstract String Matching Problem
Extending Boyer-Moore Algorithm to an Abstract String Matching ProblemExtending Boyer-Moore Algorithm to an Abstract String Matching Problem
Extending Boyer-Moore Algorithm to an Abstract String Matching Problem
Liwei Ren任力偉
 
Near Duplicate Document Detection: Mathematical Modeling and Algorithms
Near Duplicate Document Detection: Mathematical Modeling and AlgorithmsNear Duplicate Document Detection: Mathematical Modeling and Algorithms
Near Duplicate Document Detection: Mathematical Modeling and Algorithms
Liwei Ren任力偉
 
Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...
Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...
Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...
Liwei Ren任力偉
 
Phase locking in chains of multiple-coupled oscillators
Phase locking in chains of multiple-coupled oscillatorsPhase locking in chains of multiple-coupled oscillators
Phase locking in chains of multiple-coupled oscillators
Liwei Ren任力偉
 
On existence of the solution of inhomogeneous eigenvalue problem
On existence of the solution of inhomogeneous eigenvalue problemOn existence of the solution of inhomogeneous eigenvalue problem
On existence of the solution of inhomogeneous eigenvalue problem
Liwei Ren任力偉
 
Math stories
Math storiesMath stories
Math stories
Liwei Ren任力偉
 
Binary Similarity : Theory, Algorithms and Tool Evaluation
Binary Similarity :  Theory, Algorithms and  Tool EvaluationBinary Similarity :  Theory, Algorithms and  Tool Evaluation
Binary Similarity : Theory, Algorithms and Tool Evaluation
Liwei Ren任力偉
 
IoT Security: Problems, Challenges and Solutions
IoT Security: Problems, Challenges and SolutionsIoT Security: Problems, Challenges and Solutions
IoT Security: Problems, Challenges and Solutions
Liwei Ren任力偉
 
Taxonomy of Differential Compression
Taxonomy of Differential CompressionTaxonomy of Differential Compression
Taxonomy of Differential Compression
Liwei Ren任力偉
 
Bytewise Approximate Match: Theory, Algorithms and Applications
Bytewise Approximate Match:  Theory, Algorithms and ApplicationsBytewise Approximate Match:  Theory, Algorithms and Applications
Bytewise Approximate Match: Theory, Algorithms and Applications
Liwei Ren任力偉
 
Overview of Data Loss Prevention (DLP) Technology
Overview of Data Loss Prevention (DLP) TechnologyOverview of Data Loss Prevention (DLP) Technology
Overview of Data Loss Prevention (DLP) Technology
Liwei Ren任力偉
 
DLP Systems: Models, Architecture and Algorithms
DLP Systems: Models, Architecture and AlgorithmsDLP Systems: Models, Architecture and Algorithms
DLP Systems: Models, Architecture and Algorithms
Liwei Ren任力偉
 
Mathematical Modeling for Practical Problems
Mathematical Modeling for Practical ProblemsMathematical Modeling for Practical Problems
Mathematical Modeling for Practical Problems
Liwei Ren任力偉
 
Securing Your Data for Your Journey to the Cloud
Securing Your Data for Your Journey to the CloudSecuring Your Data for Your Journey to the Cloud
Securing Your Data for Your Journey to the Cloud
Liwei Ren任力偉
 
信息安全领域里的创新和机遇
信息安全领域里的创新和机遇信息安全领域里的创新和机遇
信息安全领域里的创新和机遇
Liwei Ren任力偉
 
移动互联网时代下创新的思维
移动互联网时代下创新的思维移动互联网时代下创新的思维
移动互联网时代下创新的思维
Liwei Ren任力偉
 
非齐次特征值问题解存在性研究
非齐次特征值问题解存在性研究非齐次特征值问题解存在性研究
非齐次特征值问题解存在性研究
Liwei Ren任力偉
 
Arm the World with SPN based Security
Arm the World with SPN based SecurityArm the World with SPN based Security
Arm the World with SPN based Security
Liwei Ren任力偉
 
Extending Boyer-Moore Algorithm to an Abstract String Matching Problem
Extending Boyer-Moore Algorithm to an Abstract String Matching ProblemExtending Boyer-Moore Algorithm to an Abstract String Matching Problem
Extending Boyer-Moore Algorithm to an Abstract String Matching Problem
Liwei Ren任力偉
 
Near Duplicate Document Detection: Mathematical Modeling and Algorithms
Near Duplicate Document Detection: Mathematical Modeling and AlgorithmsNear Duplicate Document Detection: Mathematical Modeling and Algorithms
Near Duplicate Document Detection: Mathematical Modeling and Algorithms
Liwei Ren任力偉
 
Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...
Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...
Monotonicity of Phaselocked Solutions in Chains and Arrays of Nearest-Neighbo...
Liwei Ren任力偉
 
Phase locking in chains of multiple-coupled oscillators
Phase locking in chains of multiple-coupled oscillatorsPhase locking in chains of multiple-coupled oscillators
Phase locking in chains of multiple-coupled oscillators
Liwei Ren任力偉
 
On existence of the solution of inhomogeneous eigenvalue problem
On existence of the solution of inhomogeneous eigenvalue problemOn existence of the solution of inhomogeneous eigenvalue problem
On existence of the solution of inhomogeneous eigenvalue problem
Liwei Ren任力偉
 
Binary Similarity : Theory, Algorithms and Tool Evaluation
Binary Similarity :  Theory, Algorithms and  Tool EvaluationBinary Similarity :  Theory, Algorithms and  Tool Evaluation
Binary Similarity : Theory, Algorithms and Tool Evaluation
Liwei Ren任力偉
 
IoT Security: Problems, Challenges and Solutions
IoT Security: Problems, Challenges and SolutionsIoT Security: Problems, Challenges and Solutions
IoT Security: Problems, Challenges and Solutions
Liwei Ren任力偉
 
Taxonomy of Differential Compression
Taxonomy of Differential CompressionTaxonomy of Differential Compression
Taxonomy of Differential Compression
Liwei Ren任力偉
 
Bytewise Approximate Match: Theory, Algorithms and Applications
Bytewise Approximate Match:  Theory, Algorithms and ApplicationsBytewise Approximate Match:  Theory, Algorithms and Applications
Bytewise Approximate Match: Theory, Algorithms and Applications
Liwei Ren任力偉
 
Overview of Data Loss Prevention (DLP) Technology
Overview of Data Loss Prevention (DLP) TechnologyOverview of Data Loss Prevention (DLP) Technology
Overview of Data Loss Prevention (DLP) Technology
Liwei Ren任力偉
 
DLP Systems: Models, Architecture and Algorithms
DLP Systems: Models, Architecture and AlgorithmsDLP Systems: Models, Architecture and Algorithms
DLP Systems: Models, Architecture and Algorithms
Liwei Ren任力偉
 
Mathematical Modeling for Practical Problems
Mathematical Modeling for Practical ProblemsMathematical Modeling for Practical Problems
Mathematical Modeling for Practical Problems
Liwei Ren任力偉
 
Securing Your Data for Your Journey to the Cloud
Securing Your Data for Your Journey to the CloudSecuring Your Data for Your Journey to the Cloud
Securing Your Data for Your Journey to the Cloud
Liwei Ren任力偉
 

Recently uploaded (20)

Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
How to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabberHow to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabber
eGrabber
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
Build With AI - In Person Session Slides.pdf
Build With AI - In Person Session Slides.pdfBuild With AI - In Person Session Slides.pdf
Build With AI - In Person Session Slides.pdf
Google Developer Group - Harare
 
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Maarten Verwaest
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
AI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamsonAI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamson
UXPA Boston
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
CSUC - Consorci de Serveis Universitaris de Catalunya
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Slack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teamsSlack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 
Agentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community MeetupAgentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community Meetup
Manoj Batra (1600 + Connections)
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
Ivano Malavolta
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
How to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabberHow to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabber
eGrabber
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Limecraft Webinar - 2025.3 release, featuring Content Delivery, Graphic Conte...
Maarten Verwaest
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
AI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamsonAI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamson
UXPA Boston
 
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier VroomAI x Accessibility UXPA by Stew Smith and Olivier Vroom
AI x Accessibility UXPA by Stew Smith and Olivier Vroom
UXPA Boston
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel?
Christian Folini
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Slack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teamsSlack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
On-Device or Remote? On the Energy Efficiency of Fetching LLM-Generated Conte...
Ivano Malavolta
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 

Introduction to Deep Neural Network

  • 1. Copyright 2011 Trend Micro Inc. 1 Introduction to Deep Neural Network Liwei Ren, Ph.D San Jose, California, Nov, 2016
  • 2. Copyright 2011 Trend Micro Inc. Agenda • What a DNN is • How a DNN works • Why a DNN works • Those DNNs in action • Where the challenges are • Successful stories • Security problems • Summary • Quiz • What else 2
  • 3. Copyright 2011 Trend Micro Inc. What is a DNN? • DNN and AI in the secular world 3
  • 4. Copyright 2011 Trend Micro Inc. What is a DNN? • DNN and AI in the secular world 4
  • 5. Copyright 2011 Trend Micro Inc. What is a DNN? • DNN and AI in the secular world 5
  • 6. Copyright 2011 Trend Micro Inc. What is a DNN? • DNN in the technical world 6
  • 7. Copyright 2011 Trend Micro Inc. What is a DNN? • DNN in the technical world 7
  • 8. Copyright 2011 Trend Micro Inc. What is a DNN? • DNN in the technical world 8
  • 9. Copyright 2011 Trend Micro Inc. What is a DNN? • Categorizing the DNNs : 9
  • 10. Copyright 2011 Trend Micro Inc. What is a DNN? • Three technical elements • Architecture: the graph, weights/biases, activation functions • Activity Rule: weights/biases, activation functions • Learning Rule: a typical one is backpropagation algorithm • Three masters in this area: 10
  • 11. Copyright 2011 Trend Micro Inc. What is a DNN? • Given a practical problem , we have two approaches to solve it. 11
  • 12. Copyright 2011 Trend Micro Inc. What is a DNN? • An example: image recognition 12
  • 13. Copyright 2011 Trend Micro Inc. What is a DNN? • An example: image recognition 13
  • 14. Copyright 2011 Trend Micro Inc. What is a DNN? • In the mathematical world – A DNN is a mathematical function f: D  S, where D ⊆ Rn and S ⊆ Rm, which is constructed by a directed graph based architecture. – A DNN is also a composition of functions from a network of primitive functions. 14
  • 15. Copyright 2011 Trend Micro Inc. What is a DNN? • We denote the a feed-forward DNN function by O= f(I) which is determined by a few parameters G, Φ ,W,B • Hyper-parameters: – G is the directed graph which presents the structure – Φ presents one or multiple activation functions for activating the nodes • Parameters: – W is the vector of weights relevant to the edges – B is the vector of biases relevant to the nodes 15
  • 16. Copyright 2011 Trend Micro Inc. What is a DNN? • Activation at a node: 16
  • 17. Copyright 2011 Trend Micro Inc. What is a DNN? • Activation function: 17
  • 18. Copyright 2011 Trend Micro Inc. What is a DNN? • G=(V,E) is a graph and Φ is a set of activation functions. • <G,Φ> constructs a family of functions F: – F(G,Φ) = { f | f is a function constructed by <G, Φ ,W> where WϵRN } • N= total number of weights at all nodes of output layer and hidden layers. • Each f(I) can be denoted by f(I ,W). 18
  • 19. Copyright 2011 Trend Micro Inc. What is a DNN? • Mathematically, a DNN based supervised machine learning technology can be described as follows : – Given g ϵ { h | h:D  S where D ⊆ Rn and S ⊆ Rm} and δ>0 , find f ϵ F(G,Φ) such that 𝑓 − 𝑔 < δ. • Essentially, it is to identify a W ϵ RN such that 𝑓(∗, 𝑊) − 𝑔 < δ • However, in practice, g is not explicitly expressed . It usually appears in a sequence of samples: – { <I(j),T(j)> | T(j) =g(I(j)), j=1, 2, …,M} • where I(j) is an input vector and T(j) is its corresponding target vector. 19
  • 20. Copyright 2011 Trend Micro Inc. How Does a DNN work ? • The function g is not explicitly expressed, we are not able to calculate g − f(∗, W) • Instead, we evaluate the error function E(W)= 1 2𝑀 ∑||T(j) - f(I(j),W)||2 • We expect to determine W such that E(W) < δ • How to identify W ϵ RN so that E(W) < δ ? Lets solve the nonlinear optimization problem min{E(W)| W ϵ RN} , i.e.: min{ 1 2𝑀 ∑|| T(j) - f(I(j),W) ||2 | W ϵ RN } (P1) 20
  • 21. Copyright 2011 Trend Micro Inc. How Does a DNN work ? • (P1) is for batch mode training, however ,it is too expensive. • In order to reduce the computational cost, a sequential mode is introduced. • Picking <I,T> ϵ {<I(1),T(1) >, <I(2),T(2)> ,…, <I(M),T(M)>} sequentially, let the output of the network as O= f(I,W) for any W: • Error function E(W)= ||T- f(I,W)||2 /2 = ∑(Tj-Oj)2 /2 • Each Oj can be considered as a function of W. We denote it as Oj(W). • We have the optimization problem for training with sequential mode: – min{ ∑(Tj-Oj(W) )2 /2 | W ϵ RN} (P2) 21
  • 22. Copyright 2011 Trend Micro Inc. How Does a DNN work ? • One may ask whether we get the same solution for both batch mode and sequential mode ? • BTW – batch mode = offline mode – sequential mode = online mode • We focus on online mode in this talk 22
  • 23. Copyright 2011 Trend Micro Inc. How Does a DNN work? • How to solve the unconstrained nonlinear optimization problem (P2)? • The general approach of unconstrained nonlinear optimization is to find local minima of E(W) using the iterative process of Gradient Descent. • ∇E = (∂E/∂W_1, ∂E/∂W_2, …, ∂E/∂W_N) • The iterations: – ΔW_j = −γ ∂E/∂W_j for j = 1, …, N – Updating W in each step by • W_j^(k+1) = W_j^(k) − γ ∂E(W^(k))/∂W_j for j = 1, …, N (A1) • until E(W^(k+1)) < δ or E(W^(k+1)) cannot be reduced anymore 23
  • 24. Copyright 2011 Trend Micro Inc. How Does a DNN work ? • The algorithm of Gradient Descent: 24
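A minimal sketch of the online-mode iteration (A1). Since backpropagation has not been introduced yet, the gradient here is approximated by finite differences, which already hints at the computational cost discussed on the next slide; f, γ, δ, and the epoch count are illustrative placeholders.

```python
import numpy as np

def sample_error(f, W, I, T):
    """E(W) = ||T - f(I, W)||^2 / 2 for a single sample (sequential mode)."""
    return 0.5 * np.sum((T - f(I, W))**2)

def numerical_gradient(f, W, I, T, eps=1e-6):
    """Approximate dE/dW_j by central differences -- expensive: 2N evaluations of f."""
    grad = np.zeros_like(W)
    for j in range(len(W)):
        Wp, Wm = W.copy(), W.copy()
        Wp[j] += eps
        Wm[j] -= eps
        grad[j] = (sample_error(f, Wp, I, T) - sample_error(f, Wm, I, T)) / (2 * eps)
    return grad

def online_gradient_descent(f, W, samples, gamma=0.5, delta=1e-3, epochs=1000):
    """W_j(k+1) = W_j(k) - gamma * dE/dW_j, cycling through the samples sequentially."""
    for _ in range(epochs):
        for I, T in samples:
            W = W - gamma * numerical_gradient(f, W, I, T)
        if all(sample_error(f, W, I, T) < delta for I, T in samples):
            break
    return W
```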
  • 25. Copyright 2011 Trend Micro Inc. How Does a DNN work? • From the perspective of mathematics, the process of Gradient Descent is straightforward. • However, from the perspective of scientific computing, it is quite challenging to calculate the values of all ∂E/∂W_j for j = 1, …, N: – The complexity of expressing each ∂E/∂W_j for j = 1, …, N. – There are (k+1) layers of function composition for a DNN with k hidden layers. 25
  • 26. Copyright 2011 Trend Micro Inc. How Does a DNN work? • For example, we have a very simple network as follows with the activation function φ(v) = 1/(1 + e^(−v)). • E(W) = [T − f(I,W)]² / 2 = [T − φ(w1 φ(w3 I + w2) + w0)]² / 2, and we have: – ∂E/∂w0 = −[T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) – ∂E/∂w1 = −[T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) φ(w3 I + w2) – ∂E/∂w2 = −w1 [T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) φ′(w3 I + w2) – ∂E/∂w3 = −I w1 [T − φ(w1 φ(w3 I + w2) + w0)] φ′(w1 φ(w3 I + w2) + w0) φ′(w3 I + w2) 26
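These four derivatives can be verified numerically. A sketch under the same assumptions (one input, one hidden node, one output, sigmoid φ); the sample values are arbitrary:

```python
import numpy as np

phi = lambda v: 1.0 / (1.0 + np.exp(-v))
dphi = lambda v: phi(v) * (1.0 - phi(v))          # derivative of the sigmoid

def f(I, w):
    w0, w1, w2, w3 = w
    return phi(w1 * phi(w3 * I + w2) + w0)

def analytic_grad(I, T, w):
    w0, w1, w2, w3 = w
    h_in = w3 * I + w2                            # hidden pre-activation
    H = phi(h_in)
    o_in = w1 * H + w0                            # output pre-activation
    err = T - phi(o_in)
    return np.array([
        -err * dphi(o_in),                        # dE/dw0
        -err * dphi(o_in) * H,                    # dE/dw1
        -err * dphi(o_in) * w1 * dphi(h_in),      # dE/dw2
        -err * dphi(o_in) * w1 * dphi(h_in) * I,  # dE/dw3
    ])

# Central-difference check against the analytic formulas above.
I, T = 0.7, 0.3
w = np.array([0.1, -0.4, 0.25, 0.6])
E = lambda w: 0.5 * (T - f(I, w))**2
eps = 1e-6
num = np.array([(E(w + eps*np.eye(4)[j]) - E(w - eps*np.eye(4)[j])) / (2*eps)
                for j in range(4)])
print(np.allclose(analytic_grad(I, T, w), num, atol=1e-7))   # True
```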
  • 27. Copyright 2011 Trend Micro Inc. How Does a DNN work? • Let's imagine a network of N inputs, M outputs, and K hidden layers, each of which has L nodes. – It is a daunting task to express each ∂E/∂w_j explicitly. The last simple example already shows this. • The backpropagation (BP) algorithm was proposed as a rescue: – Main idea: the weights of the (k−1)-th hidden layer can be expressed by those of the k-th layer recursively. – We can start with the output layer, which is considered the (K+1)-th layer. 27
  • 28. Copyright 2011 Trend Micro Inc. How Does a DNN work ? • BP algorithm has the following major steps: 1. Feed-forward computation 2. Back-propagation to the output layer 3. Back-propagation to the hidden layers 4. Weight updates 28
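A hedged sketch of these four steps for a fully connected layered network with sigmoid activations; the layer sizes, learning rate γ, and variable names are illustrative, not prescribed by the slides:

```python
import numpy as np

phi = lambda v: 1.0 / (1.0 + np.exp(-v))

def backprop_step(Ws, bs, I, T, gamma=0.5):
    """One BP update: forward pass, output delta, hidden deltas, weight updates."""
    # 1. Feed-forward computation: store every layer's activation.
    acts = [I]
    for W, b in zip(Ws, bs):
        acts.append(phi(W @ acts[-1] + b))
    # 2. Back-propagation to the output layer.
    O = acts[-1]
    delta = (O - T) * O * (1.0 - O)               # phi'(v) = phi(v)(1 - phi(v))
    deltas = [delta]
    # 3. Back-propagation to the hidden layers: the k-th layer's delta
    #    yields the (k-1)-th layer's delta recursively.
    for W, a in zip(reversed(Ws[1:]), reversed(acts[1:-1])):
        delta = (W.T @ delta) * a * (1.0 - a)
        deltas.insert(0, delta)
    # 4. Weight updates.
    for W, b, a, d in zip(Ws, bs, acts[:-1], deltas):
        W -= gamma * np.outer(d, a)
        b -= gamma * d

# Example: 2 inputs -> 3 hidden nodes -> 1 output.
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
bs = [np.zeros(3), np.zeros(1)]
backprop_step(Ws, bs, I=np.array([0.5, -0.2]), T=np.array([1.0]))
```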
  • 29. Copyright 2011 Trend Micro Inc. How Does a DNN work ? 29
  • 30. Copyright 2011 Trend Micro Inc. How Does a DNN work ? • A general DNN can be drawn as follows 30
  • 31. Copyright 2011 Trend Micro Inc. How Does a DNN work? • How do we express the weights of the (k−1)-th hidden layer by the weights of the k-th layer recursively? 31
  • 32. Copyright 2011 Trend Micro Inc. How Does a DNN work? • Let us walk through BP with our small network. – E(W) = [T − f(I,W)]² / 2 = [T − φ(w1 φ(w3 I + w2) + w0)]² / 2. • ∂E/∂w0 = −φ′(O)(T − O) • ∂E/∂w1 = −φ′(O)(T − O) H • ∂E/∂w2 = −φ′(O)(T − O) φ′(H) w1 · 1 • ∂E/∂w3 = −φ′(O)(T − O) φ′(H) w1 · I – Let H_0^(0) = H_0^(1) = 1, H_1^(1) = H = φ(w3 I + w2), H_1^(0) = I; we verify the following: • δ_1^(2) = φ′(O)(T − O) • w0^+ = w0 + γ δ_1^(2) H_0^(1), w1^+ = w1 + γ δ_1^(2) H_1^(1) • δ_1^(1) = φ′(H_1^(1)) δ_1^(2) w1 • w2^+ = w2 + γ δ_1^(1) H_0^(0), w3^+ = w3 + γ δ_1^(1) H_1^(0) • where w0 = w_{0,1}^(2), w1 = w_{1,1}^(2), w2 = w_{0,1}^(1), w3 = w_{1,1}^(1) 32
  • 33. Copyright 2011 Trend Micro Inc. Why Does a DNN Work? • It is amazing! However, why does it work? • For a FNN, it is to ask whether the following approximation problem has a solution: – Given g ∈ { h | h: D → S where D ⊆ R^n and S ⊆ R^m } and δ > 0, find a W ∈ R^N such that ||f(∗, W) − g|| < δ. • Universal approximation theorem (S): – Let φ(·) be a bounded and monotonically increasing continuous function. Let I_m denote the m-dimensional unit hypercube [0,1]^m. The space of continuous functions on I_m is denoted by C(I_m). Then, given any function f ∈ C(I_m) and ε > 0, there exist an integer N, real constants v_i, b_i ∈ R, and real vectors w_i ∈ R^m, where i = 1, …, N, such that |F(x) − f(x)| < ε for all x ∈ I_m, where F(x) = Σ_{i=1}^{N} v_i φ(w_i^T x + b_i) is an approximation to the function f which is independent of φ. 33
  • 34. Copyright 2011 Trend Micro Inc. Why Does a DNN Work? • Its corresponding network, with only one hidden layer – NOTE: this is not even a general case of one hidden layer. It is a special case. WHY? – However, it is powerful and encouraging from the mathematical perspective. 34
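The theorem is existential rather than constructive, but a small experiment shows a single hidden layer at work. A sketch that fits F(x) = Σ v_i φ(w_i x + b_i) to sin(2πx) on [0,1] by gradient descent; N, γ, and the iteration count are arbitrary choices, not values given by the theorem:

```python
import numpy as np

phi = lambda v: 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(1)
N = 20                                    # number of hidden nodes
w = rng.normal(size=N); b = rng.normal(size=N); v = rng.normal(size=N) * 0.1

x = np.linspace(0.0, 1.0, 200)
target = np.sin(2 * np.pi * x)            # the function g to approximate

gamma = 0.1
for _ in range(20000):
    H = phi(np.outer(x, w) + b)           # (200, N) hidden activations
    F = H @ v                             # F(x) = sum_i v_i * phi(w_i x + b_i)
    err = F - target
    # Gradients of the (halved) mean squared error w.r.t. v, w, b.
    dF = err / len(x)
    v -= gamma * (H.T @ dF)
    dH = np.outer(dF, v) * H * (1.0 - H)
    w -= gamma * (x @ dH)
    b -= gamma * dH.sum(axis=0)

print(np.max(np.abs(F - target)))         # the max error shrinks as training proceeds
```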
  • 35. Copyright 2011 Trend Micro Inc. Why Does a DNN Work? • The general networks have a corresponding general version of the Universal Approximation Theorem: 35
  • 36. Copyright 2011 Trend Micro Inc. Why Does a DNN Work? • Universal approximation theorem (G): – Let φ(·) be a bounded and monotonically increasing continuous function. Let S be a compact space in R^m. Let C(S) = { g | g: S ⊂ R^m → R^n is continuous }. Then, given any function f ∈ C(S) and ε > 0, there exists a FNN as shown above which constructs the network function F such that ||F(x) − f(x)|| < ε, where F is an approximation to the function f which is independent of φ. • It seems both shallow and deep neural networks can construct an approximation to a given function. – Which is better? – Or which is more efficient, in terms of using fewer nodes? 36
  • 37. Copyright 2011 Trend Micro Inc. Why Does a DNN Work? • Mathematical foundation of neural networks: 37
  • 38. Copyright 2011 Trend Micro Inc. Those DNNs in action • A DNN has three elements • Architecture: the graph, weights/biases, activation functions • Activity Rule: weights/biases, activation functions • Learning Rule: a typical one is the backpropagation algorithm • The architecture basically determines the capability of a specific DNN – Different architectures are suitable for different applications. – The most general architecture of an ANN is a DAG (directed acyclic graph). 38
  • 39. Copyright 2011 Trend Micro Inc. Those DNNs in action • There are a few well-known categories of DNNs. 39
  • 40. Copyright 2011 Trend Micro Inc. What Are the Challenges? • Given a specific problem, there are a few questions to answer before one starts the journey with DNNs: – Do you understand the problem that you need to solve? – Do you really want to solve this problem with a DNN, and why? • Do you have an alternative yet effective solution? – Do you know how to describe the problem in DNN terms mathematically? – Do you know how to implement a DNN, beyond a few APIs and sizzling hype? – How to collect sufficient data for training? – How to solve the problem efficiently and cost-effectively? 40
  • 41. Copyright 2011 Trend Micro Inc. What Are the Challenges? • 1st Challenge: – a full mesh (fully connected) network suffers from the curse of dimensionality. 41
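A back-of-the-envelope count shows why; the image size and layer width below are illustrative:

```python
# A fully connected ("full mesh") first layer on a modest image:
inputs = 1000 * 1000             # a 1-megapixel grayscale image
hidden = 1000 * 1000             # one hidden layer of comparable width
weights = inputs * hidden        # 10^12 weights for a single layer
print(f"{weights:.0e} weights")  # 1e+12 -- infeasible to train or store
```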
  • 42. Copyright 2011 Trend Micro Inc. What Are the Challenges? • Many FNN tasks do not need a full mesh network. • For example, if we can present the input vector as a grid, nearest-neighborhood models can be used to construct an effective FNN with far fewer connections: – Image recognition – GO (圍棋): a game that two players play on a 19x19 grid of lines. 42
  • 43. Copyright 2011 Trend Micro Inc. What Are the Challenges? • The 2nd challenge is how to describe a technical problem in terms of a DNN, i.e., mathematical modeling. There are generally two approaches: – Applying a well-studied DNN architecture to describe the problem. Deep understanding of the specific network is usually required! • Two general DNN architectures are well known – FNN: feedforward neural network. Its special architecture CNN (convolutional neural network) is widely used in many applications such as image recognition and GO. – RNN: recurrent neural network. Its special architecture is LSTM (long short-term memory), which has been applied successfully in speech recognition and language translation. • For example, if we want to try a FNN, how do we describe the problem in terms of <input vector, output vector> with fixed dimensions? – Creating a novel DNN architecture from the ground up if none of the existing models fits your problem. Deep understanding of DNN theory and algorithms is required. 43
  • 44. Copyright 2011 Trend Micro Inc. What Are the Challenges? • Handwritten digit recognition: – Modeling this problem is straightforward, as the sketch below shows 44
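A sketch of the modeling, assuming MNIST-style 28x28 grayscale digits: the input vector is the flattened pixel grid, and the target vector is a one-hot encoding of the digit.

```python
import numpy as np

def encode_sample(image28x28, digit):
    """<I, T> for a FNN: I in R^784 (pixels), T in R^10 (one-hot label)."""
    I = image28x28.reshape(784) / 255.0   # flatten and scale to [0, 1]
    T = np.zeros(10)
    T[digit] = 1.0
    return I, T

I, T = encode_sample(np.zeros((28, 28)), digit=7)
print(I.shape, T)   # (784,) [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
```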
  • 45. Copyright 2011 Trend Micro Inc. What Are the Challenges? • Modeling image recognition is also straightforward 45
  • 46. Copyright 2011 Trend Micro Inc. What Are the Challenges? • However, due to the curse of dimensionality, we use a special FNN: – the convolutional neural network (CNN) 46
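A hedged sketch of the key CNN idea, local connectivity with shared weights: one small kernel is reused at every grid position, so the parameter count is the kernel size rather than (number of inputs) × (number of outputs).

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: the same kernel weights are shared everywhere."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.random.default_rng(2).random((28, 28))
kernel = np.ones((3, 3)) / 9.0        # 9 shared weights, vs 784*676 for a full mesh layer
print(conv2d(image, kernel).shape)    # (26, 26)
```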
  • 47. Copyright 2011 Trend Micro Inc. What Are the Challenges? • How to construct a DNN to describe language translation? – Existing systems use LSTM networks • How to construct a DNN to describe the problem of malware classification? • How to construct a DNN to describe network traffic for security purposes? 47
  • 48. Copyright 2011 Trend Micro Inc. What Are the Challenges? • The 3rd challenge is how to collect sufficient training data. To achieve the required accuracy, sufficient training data is necessary. WHY? 48
  • 49. Copyright 2011 Trend Micro Inc. What Are the Challenges? • The 4th challenge is how to identify the various talents needed to deliver a DNN solution for a specific problem: – Someone who knows how to use existing DL APIs such as TensorFlow. – Someone who understands various DNN architectures in depth, so that he/she can evaluate and identify a suitable DNN architecture for the problem. – Someone who understands the theory and algorithms of DNNs in depth, so that he/she can create and design a novel DNN from the ground up. 49
  • 50. Copyright 2011 Trend Micro Inc. Successful Stories • ImageNet: 1M+ images, 1000+ categories, CNN 50
  • 51. Copyright 2011 Trend Micro Inc. Successful Stories • Unsupervised learning neural networks… YouTube and the Cat. 51
  • 52. Copyright 2011 Trend Micro Inc. Successful Stories • AlphaGo, a significant milestone in AI history – More significant than Deep Blue • Both the Policy Network and the Value Network are CNNs. 52
  • 53. Copyright 2011 Trend Micro Inc. Successful Stories • Google Neural Machine Translation… LSTM (Long Short-Term Memory) networks 53
  • 54. Copyright 2011 Trend Micro Inc. Successful Stories • Microsoft Speech Recognition… LSTM and TDNN (Time Delay Neural Networks) 54
  • 55. Copyright 2011 Trend Micro Inc. Security Problems • Not disclosed for the public version. 55
  • 56. Copyright 2011 Trend Micro Inc. Summary • What a DNN is • How a DNN works • Why a DNN works • The categories of DNNs • Some challenges • Well-known stories • Security problems 56
  • 57. Copyright 2011 Trend Micro Inc. Quiz • Why do we choose the activation function to be a nonlinear function? • Why deep? Why are deep networks better than shallow networks? • What is the difference between online and batch-mode training? • Will online and batch-mode training converge to the same solution? • Why do we need the backpropagation algorithm? • Why do we apply convolutional neural networks to image recognition? 57
  • 58. Copyright 2011 Trend Micro Inc. Quiz • If we solve a problem with a FNN, – how many layers deep should we go? – How many nodes are good for each layer? – How to estimate and optimize the cost? • Is it guaranteed that the backpropagation algorithm converges to a solution? • Why do we need sufficient data for training in order to achieve a certain accuracy? • Can a DNN do tasks beyond extending human capabilities or automating extensive manual tasks? – To prove a mathematical theorem... or to introduce an interesting concept… or to appreciate a poem… or to love… 58
  • 59. Copyright 2011 Trend Micro Inc. Quiz • AlphaGo is trained for a 19x19 lattice. If we play GO on a 20x20 board, can AlphaGo handle it? • ImageNet is trained for 1000 categories. If we add a 1001st category, what should we do? • People often consider a DNN a black box. Why? • More questions from you… 59
  • 60. Copyright 2011 Trend Micro Inc. What Else? • What to share next from me? Why do you care? – Various DNNs: principles, examples, analysis and experiments… • ImageNet, AlphaGo, GNMT, etc. – My Ph.D. work and its relevance to DNNs – A little history of AI and artificial neural networks – Various schools of the AI discipline – Strong AI vs. Weak AI 60
  • 61. Copyright 2011 Trend Micro Inc. What Else? • What to share next from me? Why do you care? – Questions when thinking about AI: • Are we able to understand how we learn? • Are we going in the right directions, mathematically and scientifically? • Are there simple principles for cognition, like what Newton and Einstein established for understanding our universe? • What do we lack between now and the coming of so-called Strong AI? 61
  • 62. Copyright 2011 Trend Micro Inc. What Else? • What to share next from me? Why do you care? • Questions about who we are. – Are we created? – Are we the AI of the creator? • My little theory about the Universe 62