Reducing the Dimensionality of Data
with Neural Networks
@St_Hakky
Geoffrey E. Hinton; R. R. Salakhutdinov (2006-07-28). “Reducing the
Dimensionality of Data with Neural Networks”. Science 313 (5786)
Dimensionality Reduction
• Dimensionality Reduction facilitates…
• Classification
• Visualization
• Communication
• Storage of high-dimensional data
Principal Components Analysis
• PCA(Principal Components Analysis)
• A simple and widely used method
• Finds the directions of greatest variance in the data set
• Represents each data point by its coordinates along each
of these directions
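The two PCA steps above (find the directions of greatest variance, then represent each point by its coordinates along them) can be sketched with NumPy's SVD. The toy data, the function names and the choice of two components are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (one per row)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the directions of greatest variance, in decreasing order.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_encode(X, mean, components):
    """Coordinates of each data point along the principal directions."""
    return (X - mean) @ components.T

def pca_decode(codes, mean, components):
    """Linear reconstruction of the data from the low-dimensional codes."""
    return codes @ components + mean

rng = np.random.default_rng(0)
# Toy data (assumption): 200 points in 5-D that lie close to a 2-D plane.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

mean, components = pca_fit(X, n_components=2)
codes = pca_encode(X, mean, components)
X_hat = pca_decode(codes, mean, components)
mse = float(np.mean((X - X_hat) ** 2))
print(codes.shape, mse)
```

Both the encoding and the decoding here are linear maps, which is the restriction the autoencoder removes.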
“Encoder” and “Decoder” Network
• This paper describes a nonlinear generalization of PCA (the autoencoder), which uses:
  • an adaptive, multilayer “encoder” network to transform the high-dimensional data into a low-dimensional code
  • a similar “decoder” network to recover the data from the code
AutoEncoder
[Figure: autoencoder diagram. Input → Encoder → Code → Decoder → Output. The input layer receives the input data, the hidden layer performs the dimensionality reduction, and the output layer produces the reconstructed data.]
How to train the AutoEncoder
・ Training starts with random weights in the two networks.
・ The two networks are trained together by minimizing the discrepancy between the original data and its reconstruction.
・ Gradients are obtained by using the chain rule to backpropagate the error, first through the decoder network and then through the encoder network.
[Figure: the same autoencoder diagram (input layer with the input data, hidden layer performing the dimensionality reduction, output layer with the reconstructed data).]
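The training recipe above can be sketched for the smallest possible case: one logistic code layer and a linear output layer, trained by plain gradient descent on the squared reconstruction error. The layer sizes, learning rate, toy data and variable names are assumptions made for illustration; the networks in the paper are much deeper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.random((256, 8))  # toy data in [0, 1] (assumption), stands in for real inputs

n_in, n_code = 8, 3
# Start with small random weights in the encoder and the decoder.
W_enc = 0.1 * rng.normal(size=(n_in, n_code))
b_enc = np.zeros(n_code)
W_dec = 0.1 * rng.normal(size=(n_code, n_in))
b_dec = np.zeros(n_in)

def reconstruction_loss(X):
    """Squared reconstruction error, summed over inputs and averaged over cases."""
    code = sigmoid(X @ W_enc + b_enc)
    X_hat = code @ W_dec + b_dec
    return float(np.mean(np.sum((X_hat - X) ** 2, axis=1)))

lr = 0.1
loss_before = reconstruction_loss(X)
for _ in range(500):
    # Forward pass: encoder, then decoder.
    code = sigmoid(X @ W_enc + b_enc)
    X_hat = code @ W_dec + b_dec
    # Backward pass: the chain rule carries the error from the decoder
    # back into the encoder.
    d_out = 2.0 * (X_hat - X) / len(X)
    grad_W_dec = code.T @ d_out
    grad_b_dec = d_out.sum(axis=0)
    d_code = (d_out @ W_dec.T) * code * (1.0 - code)
    grad_W_enc = X.T @ d_code
    grad_b_enc = d_code.sum(axis=0)
    # Gradient descent step on all parameters.
    W_dec -= lr * grad_W_dec
    b_dec -= lr * grad_b_dec
    W_enc -= lr * grad_W_enc
    b_enc -= lr * grad_b_enc
loss_after = reconstruction_loss(X)
print(loss_before, loss_after)
```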
It is difficult to optimize multilayer autoencoders
• It is difficult to optimize the weights in nonlinear autoencoders that have multiple hidden layers (2-4).
• With large initial weights:
  • autoencoders typically find poor local minima
• With small initial weights:
  • the gradients in the early layers are tiny, making it infeasible to train autoencoders with many hidden layers
• If the initial weights are close to a good solution, gradient descent works well. However, finding such initial weights is very difficult.
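The small-weight case can be checked numerically. The sketch below builds a stack of logistic layers with small random weights, runs one backward pass, and reports the gradient norm of each weight matrix. The depth, layer width, weight scale and random target are assumptions chosen for illustration, and the large-weight case (poor local minima) is not demonstrated here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_gradient_norms(weight_scale, sizes=(20, 20, 20, 20, 20, 20), seed=0):
    """Gradient norm of every weight matrix, ordered from first to last layer."""
    rng = np.random.default_rng(seed)
    weights = [weight_scale * rng.normal(size=(m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]
    x = rng.random((64, sizes[0]))
    target = rng.random((64, sizes[-1]))

    # Forward pass, keeping the activations of every layer.
    activations = [x]
    for W in weights:
        activations.append(sigmoid(activations[-1] @ W))

    # Backward pass for a squared error at the output.
    out = activations[-1]
    delta = (out - target) * out * (1.0 - out)
    norms = []
    for i in reversed(range(len(weights))):
        norms.append(float(np.linalg.norm(activations[i].T @ delta)))
        if i > 0:
            a = activations[i]
            delta = (delta @ weights[i].T) * a * (1.0 - a)
    return norms[::-1]

norms = layer_gradient_norms(weight_scale=0.01)
for layer, norm in enumerate(norms, start=1):
    print(f"layer {layer}: gradient norm {norm:.3e}")
```

Each step backward multiplies the error signal by a small weight matrix and by the logistic derivative (at most 0.25), so the norms shrink rapidly toward the first layer.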
Pretraining
• This paper introduces a “pretraining” procedure for binary data, generalizes it to real-valued data, and shows that it works well for a variety of data sets.
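Restricted Boltzmann Machine (RBM)
The input data correspond to the “visible” units of the RBM, and the feature detectors correspond to the “hidden” units.
A joint configuration (𝑣, ℎ) of the visible and hidden units has an energy given by Eq. (1) of the paper.
Notation: 𝑣𝑖 and ℎ𝑗 are the states of visible unit 𝑖 and hidden unit 𝑗, 𝑏𝑖 and 𝑏𝑗 are their biases, and 𝑤𝑖𝑗 is the weight between them.
The network assigns a probability to every possible data vector via this energy function.
[Figure: an RBM with a layer of visible units 𝑣𝑖 and a layer of hidden units ℎ𝑗, connected by weights 𝑤𝑖𝑗.]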
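The slide refers to the energy function without reproducing it. For reference, the standard RBM energy as given in the cited paper, written with the symbols listed on this slide, is shown below. The expression for the probability of a visible vector is the usual Boltzmann form, and the partition function Z is notation added here.

```latex
E(\mathbf{v}, \mathbf{h})
  = -\sum_{i \in \text{visible}} b_i v_i
    \;-\; \sum_{j \in \text{hidden}} b_j h_j
    \;-\; \sum_{i,j} v_i h_j w_{ij}

p(\mathbf{v}) = \frac{1}{Z} \sum_{\mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})},
\qquad
Z = \sum_{\mathbf{v}, \mathbf{h}} e^{-E(\mathbf{v}, \mathbf{h})}
```

Lower energy means higher probability, so learning lowers the energy of training data and raises the energy of the model's own reconstructions.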
Pretraining consists of learning a stack of RBMs
・ The first layer of feature detectors then becomes the visible units for learning the next RBM.
・ This layer-by-layer learning can be repeated as many times as desired.
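The layer-by-layer procedure can be sketched as follows. Each RBM is trained with one step of contrastive divergence (CD-1), and its hidden activation probabilities become the "data" for the next RBM. The function names, hyperparameters, toy binary data and full-batch updates are assumptions made for brevity. Every layer here is logistic, whereas the experiments in the slides use a linear code layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, epochs=20, lr=0.1, seed=0):
    """Train a binary RBM with one step of contrastive divergence (CD-1)."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    n = len(data)
    for _ in range(epochs):
        # Positive phase: hidden units driven by the data.
        p_h = sigmoid(data @ W + b_hid)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # Negative phase: reconstruct the visible units, then the hidden units.
        p_v = sigmoid(h @ W.T + b_vis)
        p_h_recon = sigmoid(p_v @ W + b_hid)
        # Update: data correlations minus reconstruction correlations.
        W += lr * (data.T @ p_h - p_v.T @ p_h_recon) / n
        b_vis += lr * (data - p_v).mean(axis=0)
        b_hid += lr * (p_h - p_h_recon).mean(axis=0)
    return W, b_vis, b_hid

def pretrain_stack(data, layer_sizes):
    """Greedy pretraining: each RBM's feature activities feed the next RBM."""
    rbms = []
    layer_input = data
    for n_hidden in layer_sizes:
        W, b_vis, b_hid = train_rbm(layer_input, n_hidden)
        rbms.append((W, b_vis, b_hid))
        layer_input = sigmoid(layer_input @ W + b_hid)
    return rbms

rng = np.random.default_rng(1)
toy_data = (rng.random((100, 16)) < 0.3).astype(float)  # toy binary data (assumption)
stack = pretrain_stack(toy_data, layer_sizes=[8, 4, 2])
print([W.shape for W, _, _ in stack])
```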
Experiment (2-A)
Network used (layer sizes):
Encoder: 28 × 28 → 400 → 200 → 100 → 50 → 25 → 6
Decoder: 6 → 25 → 50 → 100 → 200 → 400 → 28 × 28
The six units in the code layer were linear and all the other units were logistic.
The network was trained on 20,000 images and tested on 10,000 new images.
The autoencoder discovered how to convert each 784-pixel image into six real numbers that allow almost perfect reconstruction.
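A forward pass through a network of this shape can be sketched as below, with a linear code layer and logistic units everywhere else. The weights are random and untrained, so the sketch only illustrates the architecture. Starting the decoder from the transposed encoder weights follows the "unrolling" step described in the cited paper, which the slides do not state. All names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 28 * 28 = 784 input pixels down to a 6-unit code, as listed on the slide.
ENCODER_SIZES = [784, 400, 200, 100, 50, 25, 6]

def init_encoder(sizes, seed=0):
    """Random (untrained) weights and zero biases for each encoder layer."""
    rng = np.random.default_rng(seed)
    return [(0.1 * rng.normal(size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def encode(x, layers):
    """Logistic hidden layers followed by a linear code layer."""
    for i, (W, b) in enumerate(layers):
        z = x @ W + b
        x = z if i == len(layers) - 1 else sigmoid(z)
    return x

def decode(code, layers, decoder_biases):
    """Decoder that reuses the transposed encoder weights; all units logistic."""
    x = code
    for (W, _), b in zip(reversed(layers), decoder_biases):
        x = sigmoid(x @ W.T + b)
    return x

rng = np.random.default_rng(1)
images = rng.random((5, 784))  # stand-in for five 28 x 28 images (assumption)
layers = init_encoder(ENCODER_SIZES)
decoder_biases = [np.zeros(n) for n in reversed(ENCODER_SIZES[:-1])]

codes = encode(images, layers)
reconstructions = decode(codes, layers, decoder_biases)
print(codes.shape, reconstructions.shape)
```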
Experiment (2-A): Observed Results
Figure rows, top to bottom:
(1) Random samples of curves from the test data set
(2) Reconstructions produced by the six-dimensional deep autoencoder
(3) Reconstructions by logistic PCA using six components
(4) Reconstructions by logistic PCA using 18 components
(5) Reconstructions by standard PCA using 18 components
The average squared error per image for the last four rows is 1.44, 7.64, 2.45, and 5.90.
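The slides do not define the metric. One plausible reading of "average squared error per image" is the squared pixel error summed within each image and then averaged over images. The sketch below implements that reading as an assumption and uses a tiny hand-made example rather than the paper's data.

```python
import numpy as np

def average_squared_error_per_image(images, reconstructions):
    """Sum squared pixel errors within each image, then average over images."""
    per_image = np.sum((images - reconstructions) ** 2, axis=1)
    return float(np.mean(per_image))

# Two "images" of two pixels each (toy values, not from the paper).
images = np.array([[0.0, 0.0], [1.0, 1.0]])
reconstructions = np.array([[0.0, 1.0], [1.0, 1.0]])
error = average_squared_error_per_image(images, reconstructions)
print(error)
```

In this toy case the per-image errors are 1 and 0, so the average is 0.5.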
Experiment (2-B)
Network used (layer sizes):
Encoder: 784 → 1000 → 500 → 250 → 30
Decoder: 30 → 250 → 500 → 1000 → 784
The 30 units in the code layer were linear and all the other units were logistic.
The network was trained on 60,000 images and tested on 10,000 new images.
Experiment (2-B): MNIST
Figure rows, top to bottom:
(1) A random test image from each class
(2) Reconstructions by the 30-dimensional autoencoder
(3) Reconstructions by 30-dimensional logistic PCA
(4) Reconstructions by standard PCA
The average squared errors for the last three rows are 3.00, 8.01, and 13.87.
Experiment(2-B)
A two-dimensional autoencoder produced a better visualization of the data than
did the first two principal components.
(A) The two-dimensional codes for 500
digits of each class produced by taking
the first two principal components of
all 60,000 training images.
(B) The two-dimensional codes found by a 784-1000-500-250-2 autoencoder.
Experiment (2-C)
Data: Olivetti face data set
Network used (layer sizes):
Encoder: 625 → 2000 → 1000 → 500 → 30
Decoder: 30 → 500 → 1000 → 2000 → 625
The 30 units in the code layer were linear and all the other units were logistic.
Experiment (2-C): Observed Results
The autoencoder clearly outperformed PCA.
Figure rows, top to bottom:
(1) Random samples from the test data set
(2) Reconstructions by the 30-dimensional autoencoder
(3) Reconstructions by 30-dimensional PCA
The average squared errors for rows (2) and (3) are 126 and 135.
Conclusion
• It has been obvious since the 1980s that backpropagation through deep autoencoders would be very effective for nonlinear dimensionality reduction, provided that:
  • Computers were fast enough
  • Data sets were big enough
  • The initial weights were close enough to a good solution
Conclusion
• Autoencoders give mappings in both directions
between the data and code spaces.
• They can be applied to very large data sets.
• The reason is that both the pretraining and the fine-tuning scale linearly in time and space with the number of training cases.
Transforming health care with ai poweredTransforming health care with ai powered
Transforming health care with ai powered
gowthamarvj
 
Dynamics 365 Business Rules Dynamics Dynamics
Dynamics 365 Business Rules Dynamics DynamicsDynamics 365 Business Rules Dynamics Dynamics
Dynamics 365 Business Rules Dynamics Dynamics
heyoubro69
 
Process Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital TransformationsProcess Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital Transformations
Process mining Evangelist
 
AWS RDS Presentation to make concepts easy.pptx
AWS RDS Presentation to make concepts easy.pptxAWS RDS Presentation to make concepts easy.pptx
AWS RDS Presentation to make concepts easy.pptx
bharatkumarbhojwani
 
50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd
emir73065
 
Time series for yotube_1_data anlysis.pdf
Time series for yotube_1_data anlysis.pdfTime series for yotube_1_data anlysis.pdf
Time series for yotube_1_data anlysis.pdf
asmaamahmoudsaeed
 
indonesia-gen-z-report-2024 Gen Z (born between 1997 and 2012) is currently t...
indonesia-gen-z-report-2024 Gen Z (born between 1997 and 2012) is currently t...indonesia-gen-z-report-2024 Gen Z (born between 1997 and 2012) is currently t...
indonesia-gen-z-report-2024 Gen Z (born between 1997 and 2012) is currently t...
disnakertransjabarda
 
2024 Digital Equity Accelerator Report.pdf
2024 Digital Equity Accelerator Report.pdf2024 Digital Equity Accelerator Report.pdf
2024 Digital Equity Accelerator Report.pdf
dominikamizerska1
 

Reducing the dimensionality of data with neural networks

  • 1. Reducing the Dimensionality of Data with Neural Networks
      @St_Hakky
      Geoffrey E. Hinton; R. R. Salakhutdinov (2006-07-28). “Reducing the Dimensionality of Data with Neural Networks”. Science 313 (5786).
  • 2. Dimensionality Reduction
      • Dimensionality reduction facilitates the following for high-dimensional data:
          • Classification
          • Visualization
          • Communication
          • Storage
  • 3. Principal Components Analysis (PCA)
      • A simple and widely used method
      • Finds the directions of greatest variance in the data set
      • Represents each data point by its coordinates along each of these directions
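The two PCA steps on slide 3 (find the directions of greatest variance, then represent each point by its coordinates along them) can be sketched with NumPy. This is a minimal illustration, not code from the paper or the slides: the function names `pca_fit`, `pca_encode`, and `pca_decode` and the toy data are assumptions made for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (one per row)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are orthonormal directions,
    # sorted by decreasing variance of the data along them.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_encode(X, mean, components):
    """Coordinates of each data point along the principal directions."""
    return (X - mean) @ components.T

def pca_decode(codes, mean, components):
    """Linear reconstruction of the data from the low-dimensional codes."""
    return codes @ components + mean

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 points in 5-D that vary mostly along 2 directions.
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(200, 5))

mean, components = pca_fit(X, n_components=2)
codes = pca_encode(X, mean, components)
X_hat = pca_decode(codes, mean, components)
reconstruction_error = np.mean(np.sum((X - X_hat) ** 2, axis=1))
```

Because both the encoding and the decoding are linear maps, PCA serves as the linear baseline that the autoencoder on the following slides generalizes.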
  • 4. “Encoder” and “Decoder” Network
      • This paper describes a nonlinear generalization of PCA (the autoencoder). It uses:
          • an adaptive, multilayer “encoder” network to transform the high-dimensional data into a low-dimensional code
          • a similar “decoder” network to recover the data from the code
  • 6. AutoEncoder
      [Figure: input data enters the input layer, a hidden layer performs the dimensionality reduction, and the output layer produces the reconstructed data.]
  • 7. How to train the AutoEncoder
      • Training starts with random weights in the two networks.
      • The two networks are trained together by minimizing the discrepancy between the original data and its reconstruction.
      • Gradients are obtained by using the chain rule to backpropagate the error, first through the decoder network and then through the encoder network.
      [Figure: input data, input layer, hidden layer (dimensionality reduction), output layer, reconstructed data.]
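The training procedure on slide 7 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's setup: it uses a single logistic hidden (code) layer with a linear output, full-batch gradient descent, random toy data, and an arbitrary learning rate, all chosen for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = rng.random((256, 8))              # hypothetical toy data in [0, 1]
n_in, n_code = X.shape[1], 3

# Start with small random weights in the encoder and decoder networks.
W_enc = 0.1 * rng.normal(size=(n_in, n_code))
b_enc = np.zeros(n_code)
W_dec = 0.1 * rng.normal(size=(n_code, n_in))
b_dec = np.zeros(n_in)

learning_rate = 0.1                   # assumed value, not from the paper
losses = []
for step in range(500):
    # Forward pass: encode to the low-dimensional code, then decode.
    code = sigmoid(X @ W_enc + b_enc)
    X_hat = code @ W_dec + b_dec
    error = X_hat - X
    # Discrepancy between the data and its reconstruction.
    losses.append(np.mean(np.sum(error ** 2, axis=1)))

    # Backward pass (chain rule): decoder gradients first, then encoder.
    d_X_hat = 2.0 * error / len(X)
    grad_W_dec = code.T @ d_X_hat
    grad_b_dec = d_X_hat.sum(axis=0)
    d_code = d_X_hat @ W_dec.T
    d_pre = d_code * code * (1.0 - code)   # derivative of the logistic unit
    grad_W_enc = X.T @ d_pre
    grad_b_enc = d_pre.sum(axis=0)

    # Gradient descent update.
    W_dec -= learning_rate * grad_W_dec
    b_dec -= learning_rate * grad_b_dec
    W_enc -= learning_rate * grad_W_enc
    b_enc -= learning_rate * grad_b_enc
```

With one hidden layer this kind of training is unproblematic; the difficulty described on the next slide appears when several hidden layers are stacked.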
  • 8. It is difficult to optimize multilayer autoencoders
      • It is difficult to optimize the weights in nonlinear autoencoders that have multiple hidden layers (refs. 2-4 in the paper).
      • With large initial weights, autoencoders typically find poor local minima.
      • With small initial weights, the gradients in the early layers are tiny, making it infeasible to train autoencoders with many hidden layers.
      • If the initial weights are close to a good solution, gradient descent works well. However, finding such initial weights is very difficult.
  • 9. Pretraining
      • This paper introduces a “pretraining” procedure for binary data, generalizes it to real-valued data, and shows that it works well for a variety of data sets.
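  • 10. Restricted Boltzmann Machine (RBM)
      • The input data correspond to the “visible” units of the RBM, and the feature detectors correspond to the “hidden” units.
      • A joint configuration (v, h) of the visible and hidden units has an energy given by Eq. (1) of the paper.
      • Notation: v_i and h_j are the states of visible unit i and hidden unit j, b_i and b_j are their biases, and w_ij is the weight between them.
      • The network assigns a probability to every possible data vector via this energy function.
      [Figure: a layer of visible units connected to a layer of hidden units.]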
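For reference, the energy that slide 10 cites as (1) has the following form in Hinton and Salakhutdinov (2006), written here with the slide's symbols. The second line is the standard Boltzmann-distribution relation between energy and probability; it is added for context and does not appear on the slide, and Z denotes the normalizing constant.

```latex
E(v, h) = -\sum_{i} b_i v_i \;-\; \sum_{j} b_j h_j \;-\; \sum_{i,j} v_i h_j w_{ij}

p(v, h) = \frac{e^{-E(v, h)}}{Z}, \qquad Z = \sum_{v', h'} e^{-E(v', h')}
```

Here i ranges over the visible units and j over the hidden units, so configurations with lower energy receive higher probability.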
  • 11. Pretraining consists of learning a stack of RBMs
      • The first layer of feature detectors then becomes the visible units for learning the next RBM.
      • This layer-by-layer learning can be repeated as many times as desired.
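The layer-by-layer procedure on slide 11 can be sketched as below. The slide does not name the learning rule for each RBM, so this sketch assumes one-step contrastive divergence (CD-1) with full-batch updates; the function names `train_rbm` and `pretrain_stack`, the hyperparameters, and the toy binary data are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, rng, epochs=20, learning_rate=0.1):
    """Train one binary RBM with CD-1 (assumed rule, full-batch updates)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b_visible = np.zeros(n_visible)
    b_hidden = np.zeros(n_hidden)
    n = len(data)
    for _ in range(epochs):
        # Positive phase: hidden units driven by the data.
        h_prob = sigmoid(data @ W + b_hidden)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct the visible units, then the hidden units.
        v_recon = sigmoid(h_sample @ W.T + b_visible)
        h_recon = sigmoid(v_recon @ W + b_hidden)
        # Update from the difference between data and reconstruction statistics.
        W += learning_rate * (data.T @ h_prob - v_recon.T @ h_recon) / n
        b_visible += learning_rate * (data - v_recon).mean(axis=0)
        b_hidden += learning_rate * (h_prob - h_recon).mean(axis=0)
    return W, b_visible, b_hidden

def pretrain_stack(data, layer_sizes, rng):
    """Greedily train one RBM per layer, feeding features to the next RBM."""
    stack = []
    layer_input = data
    for n_hidden in layer_sizes:
        W, b_visible, b_hidden = train_rbm(layer_input, n_hidden, rng)
        stack.append((W, b_visible, b_hidden))
        # The feature activations become the visible data of the next RBM.
        layer_input = sigmoid(layer_input @ W + b_hidden)
    return stack

rng = np.random.default_rng(0)
toy_data = (rng.random((100, 16)) < 0.3).astype(float)   # hypothetical binary data
stack = pretrain_stack(toy_data, layer_sizes=[8, 4], rng=rng)
weight_shapes = [W.shape for W, _, _ in stack]
```

The weights in `stack` would then serve as the initial weights of the deep autoencoder before fine-tuning with backpropagation.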
  • 12. Experiment (2-A)
      • Data: 28 × 28 (784-pixel) images of curves
      • Network used: encoder 28 × 28 → 400 → 200 → 100 → 50 → 25 → 6; decoder 6 → 25 → 50 → 100 → 200 → 400 → 28 × 28
      • Layer functions: the six units in the code layer were linear and all the other units were logistic.
      • Training: the network was trained on 20,000 images and tested on 10,000 new images.
      • Observed results: the autoencoder discovered how to convert each 784-pixel image into six real numbers that allow almost perfect reconstruction.
  • 13. Experiment (2-A)
      • (1) Random samples of curves from the test data set
      • (2) Reconstructions produced by the six-dimensional deep autoencoder
      • (3) Reconstructions by logistic PCA using six components
      • (4) Reconstructions by logistic PCA using 18 components
      • (5) Reconstructions by standard PCA using 18 components
      • The average squared error per image for the last four rows is 1.44, 7.64, 2.45, and 5.90.
  • 14. Experiment (2-B)
      • Data: MNIST digit images (784 pixels each)
      • Network used: encoder 784 → 1000 → 500 → 250 → 30; decoder 30 → 250 → 500 → 1000 → 784
      • Layer functions: the 30 units in the code layer were linear and all the other units were logistic.
      • Training: the network was trained on 60,000 images and tested on 10,000 new images.
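The symmetric encoder and decoder on slide 14 can be written down by mirroring the encoder's layer sizes around the code layer. The sketch below is only an illustration of that bookkeeping; the helper names `unrolled_layer_sizes` and `weight_shapes` are hypothetical, and biases are left out of the count.

```python
def unrolled_layer_sizes(encoder_sizes):
    """Encoder layer sizes followed by their mirror image (code layer shared)."""
    return encoder_sizes + encoder_sizes[-2::-1]

def weight_shapes(sizes):
    """Shape of the weight matrix between each pair of adjacent layers."""
    return list(zip(sizes[:-1], sizes[1:]))

# Experiment (2-B): 784-1000-500-250-30 encoder.
mnist_sizes = unrolled_layer_sizes([784, 1000, 500, 250, 30])
mnist_shapes = weight_shapes(mnist_sizes)
n_weights = sum(rows * cols for rows, cols in mnist_shapes)
```

The same helper covers the other experiments by changing the encoder list, for example `[784, 400, 200, 100, 50, 25, 6]` for experiment (2-A).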
  • 15. Experiment (2-B): MNIST
      • (1) A random test image from each class
      • (2) Reconstructions by the 30-dimensional autoencoder
      • (3) Reconstructions by 30-dimensional logistic PCA
      • (4) Reconstructions by standard PCA
      • The average squared errors for the last three rows are 3.00, 8.01, and 13.87.
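The slides do not define the "average squared error" they report. One plausible reading, assumed here, is the squared error summed over the pixels of each image and then averaged over the images. The function name and the tiny example arrays below are hypothetical.

```python
import numpy as np

def average_squared_error_per_image(images, reconstructions):
    """Sum squared pixel errors within each image, then average over images."""
    images = np.asarray(images, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    # Flatten every image to one row so the pixel sum runs along axis 1.
    diff = (images - reconstructions).reshape(len(images), -1)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# Two hypothetical 28 x 28 images with one and two wrong pixels respectively.
images = np.zeros((2, 28, 28))
reconstructions = images.copy()
reconstructions[0, 0, 0] = 1.0
reconstructions[1, 0, :2] = 1.0
example_error = average_squared_error_per_image(images, reconstructions)
```

Under this reading, the figures on the slide compare methods on the same test images, so only their relative sizes matter for the comparison.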
  • 16. Experiment (2-B)
      • A two-dimensional autoencoder produced a better visualization of the data than did the first two principal components.
      • (A) The two-dimensional codes for 500 digits of each class produced by taking the first two principal components of all 60,000 training images.
      • (B) The two-dimensional codes found by a 784-1000-500-250-2 autoencoder.
  • 17. Experiment (2-C)
      • Data: Olivetti face data set
      • Network used: encoder 625 → 2000 → 1000 → 500 → 30; decoder 30 → 500 → 1000 → 2000 → 625
      • Layer functions: the 30 units in the code layer were linear and all the other units were logistic.
      • Observed results: the autoencoder clearly outperformed PCA.
  • 18. Experiment (2-C)
      • (1) Random samples from the test data set
      • (2) Reconstructions by the 30-dimensional autoencoder
      • (3) Reconstructions by 30-dimensional PCA
      • The average squared errors for the last two rows are 126 and 135.
  • 19. Conclusion
      • It has been obvious since the 1980s that backpropagation through deep autoencoders would be very effective for nonlinear dimensionality reduction, provided that:
          • computers were fast enough,
          • data sets were big enough, and
          • the initial weights were close enough to a good solution.
  • 20. Conclusion
      • Autoencoders give mappings in both directions between the data and code spaces.
      • They can be applied to very large data sets, because both the pretraining and the fine-tuning scale linearly in time and space with the number of training cases.