MACHINE LEARNING – CONVOLUTIONAL NEURAL NETWORK
Introduction to Computer Vision
● Computer vision is concerned with the automatic extraction, analysis and understanding of useful information from a single image or a sequence of images.
- The British Machine Vision Association and Society for Pattern Recognition (BMVA)
(or)
● It is an interdisciplinary field that deals with how computers can be made to gain high-level understanding from digital images or videos.
- Wikipedia
What is a CNN (Convolutional Neural Network)?
● A CNN is a class of deep learning model.
● Convolutional neural networks (ConvNets or CNNs) are one of the main categories of models used for image recognition, image classification, object detection, face recognition, etc.
● A CNN is similar to a basic neural network: it also has learnable parameters, i.e., weights and biases.
● CNNs are heavily used in computer vision.
● There are 3 basic components that define a CNN:
○ The Convolution Layer
○ The Pooling Layer
○ The Output Layer (or) Fully Connected Layer
Basic Structure of CNN
• Input Layer: Accepts input images as pixel data.
• Convolutional Layer: Applies filters to extract features.
• ReLU Layer: Introduces non-linearity to the network.
• Pooling Layer: Reduces spatial dimensions of feature maps.
• Fully Connected Layer: Final layer for classification.
Convolutional Layer
• Filters/Kernels: Detect specific features in input images.
• Stride: Controls the movement of filters across the input.
• Padding: Adds pixels around the input to maintain dimensions.
• Output: Produces feature maps indicating detected features.
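The four ideas on this slide (filter, stride, padding, output feature map) can be sketched in a few lines of NumPy. This is a minimal illustration, not how a deep learning framework implements the layer: the function name `conv2d`, the 6x6 test image and the hand-picked vertical-edge filter are all assumptions made for the example, and a trained CNN would learn its filter values instead.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Slide `kernel` over a 2-D `image` and return the feature map.

    Like most deep learning libraries, this computes cross-correlation
    (the kernel is not flipped).
    """
    # Zero-pad the border so edge pixels can be covered by the kernel.
    padded = np.pad(image, padding, mode="constant", constant_values=0)
    kh, kw = kernel.shape
    out_h = (padded.shape[0] - kh) // stride + 1
    out_w = (padded.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = padded[top:top + kh, left:left + kw]
            # Element-wise product of window and filter, summed to one value.
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map

# Illustrative 6x6 image: bright left half, dark right half.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
# Hand-picked vertical-edge filter (an assumption for this example).
vertical_edge = np.array([[1, 0, -1]] * 3, dtype=float)

print(conv2d(image, vertical_edge))                    # 4x4 map, each row is [0, 30, 30, 0]
print(conv2d(image, vertical_edge, stride=2).shape)    # (2, 2)
print(conv2d(image, vertical_edge, padding=1).shape)   # (6, 6)
```

The large values in the middle columns of the feature map mark where the bright-to-dark edge sits, which is what "indicating detected features" means in practice.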
Architecture of CNN
[Figure not reproduced]
Convolution Layer
[Figure not reproduced]
Images source: Analytics Vidhya
Padding in CNN
• Zero Padding: Adds zeros around the input image to preserve dimensions.
• Valid Padding: No padding, reduces the size of output feature maps.
• Role: Helps preserve edge information during convolution.
The concept of stride:
• Stride is the number of pixels the filter (weight matrix) moves at each step. Moving 1 pixel at a time is called stride 1 (as in the case above).
What if we increase the stride value?
• Increasing the stride value decreases the size of the output feature map, which may cause features of the image to be lost.
• Padding the input image around its border addresses this problem; for higher stride values, more than one layer of zeros is added around the image.
• When a 6x6 input is padded with zeros so that the output keeps the same 6x6 dimensions, this is known as 'Same Padding'.
• The middle 4x4 pixels remain the same; padding retains more information from the borders and also preserves the size of the image.
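The effect of stride and padding on size follows the standard relation: output = floor((n + 2p - f) / s) + 1, for an n x n input, f x f filter, padding p and stride s. The sketch below applies it to the 6x6 example; the 3x3 filter size is an assumption, since the slide does not state it (it is the size for which one layer of zeros gives 'Same Padding' and the unpadded output is 4x4).

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Output width/height for an n x n input and an f x f filter."""
    return (n + 2 * padding - f) // stride + 1

# Assumption: a 3x3 filter (the slide does not give the filter size).
print(conv_output_size(6, 3))              # 4: no padding shrinks 6x6 to 4x4
print(conv_output_size(6, 3, stride=2))    # 2: a larger stride shrinks it further
print(conv_output_size(6, 3, padding=1))   # 6: one layer of zeros keeps 6x6
```

For stride 1 and an odd filter size f, 'Same Padding' corresponds to p = (f - 1) / 2.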
Pooling Layer
• Purpose: Reduces dimensionality and computation in the network.
• Max Pooling: Selects the maximum value from each pooling region.
• Average Pooling: Takes the average value from each pooling region.
• Impact: Retains important features while reducing overfitting.
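A small NumPy sketch of both pooling variants may help make the difference concrete. It assumes non-overlapping 2x2 regions (stride equal to the window size), which is a common choice but not something the slide specifies; the function name `pool2d` and the sample values are illustrative.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling (stride equal to the window size)."""
    h, w = feature_map.shape
    # Drop trailing rows/columns that do not fill a complete window.
    trimmed = feature_map[:h - h % size, :w - w % size]
    # Reshape so that every size x size region gets its own pair of axes.
    blocks = trimmed.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError(f"unknown pooling mode: {mode}")

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 4, 3, 8]], dtype=float)

print(pool2d(fm, mode="max"))       # maxima of the four regions: 6, 4, 7, 9
print(pool2d(fm, mode="average"))   # averages of the four regions: 3.75, 2.25, 3.5, 5.0
```

Either way the 4x4 map becomes 2x2, so later layers process a quarter of the values.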
Basic Mathematics of CNN (B&W Image)
• Convolution: Applies a filter matrix across the image to detect features.
• Example: Sliding a 3x3 filter over a grayscale image, producing a feature map.
• ReLU: Applies non-linearity after convolution.
• Pooling: Reduces the size of the resulting feature map.
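As a worked example of the first three steps, the sketch below slides a 3x3 filter over a tiny grayscale image and then applies ReLU. The 5x5 image (a bright vertical stripe) and the filter values are made up for illustration; `sliding_window_view` is used only as a compact way to enumerate every 3x3 window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def relu(x):
    """Replace negative activations with zero."""
    return np.maximum(x, 0)

# Illustrative 5x5 grayscale image with a bright vertical stripe.
image = np.array([[0, 9, 9, 0, 0]] * 5, dtype=float)
# Hand-picked 3x3 filter; a trained network would learn these weights.
kernel = np.array([[1, 0, -1]] * 3, dtype=float)

# Every 3x3 window of the image, shape (3, 3, 3, 3) = (out_h, out_w, 3, 3).
windows = sliding_window_view(image, kernel.shape)
# Multiply each window by the filter element-wise and sum: the feature map.
feature_map = np.einsum("ijkl,kl->ij", windows, kernel)
activated = relu(feature_map)

print(feature_map)   # each row is [-27, 27, 27]
print(activated)     # each row is [0, 27, 27] after ReLU
```

The negative response marks a dark-to-bright transition; ReLU discards it and keeps only the positive (bright-to-dark) response.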
Basic Mathematics of CNN (Colored Image)
• Convolution: Applies a filter that has one kernel slice per RGB channel, so the filter depth matches the image depth.
• Result: Produces a single combined feature map from all channels.
• Example: Sliding a filter across an RGB image and summing the per-channel results into one feature map.
• Pooling: Reduces the size of the resulting feature map while preserving important information.
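The channel-summing step can be checked numerically. The sketch below is an illustration under assumed shapes (a random 6x6x3 array standing in for an RGB image and a random 3x3x3 filter); it shows that convolving with the full-depth filter gives the same feature map as filtering each channel separately and adding the three results.

```python
import numpy as np

def conv2d_multichannel(image, kernel):
    """Valid convolution of an (H, W, C) image with a (kh, kw, C) filter.

    Each channel is filtered with its own kernel slice and the per-channel
    results are summed into a single 2-D feature map.
    """
    kh, kw, channels = kernel.shape
    assert image.shape[2] == channels, "filter depth must match image depth"
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw, :]
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map

rng = np.random.default_rng(0)
rgb_image = rng.random((6, 6, 3))             # stand-in for a 6x6 RGB image
rgb_filter = rng.standard_normal((3, 3, 3))   # one 3x3 slice per channel

combined = conv2d_multichannel(rgb_image, rgb_filter)
# Filter each channel on its own, then add the three maps.
per_channel = sum(
    conv2d_multichannel(rgb_image[:, :, c:c + 1], rgb_filter[:, :, c:c + 1])
    for c in range(3)
)

print(combined.shape)                       # (4, 4)
print(np.allclose(combined, per_channel))   # True
```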
Fully Connected Layer
• Purpose: Flattens the feature maps into a vector and feeds it to fully connected (dense) layers.
• Function: Combines features for final classification.
• Uses: Softmax (multi-class) or sigmoid (binary) activation functions for output.
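A rough sketch of the flatten, dense and softmax steps is shown below. The feature-map shape (4x4x8), the number of classes (3) and the random weights are all assumptions for illustration; in a real network the weights come from training.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

rng = np.random.default_rng(0)
feature_maps = rng.random((4, 4, 8))    # hypothetical output of the last pooling layer
num_classes = 3                         # illustrative number of classes

flat = feature_maps.reshape(-1)         # flatten: 4 * 4 * 8 = 128 values
weights = rng.standard_normal((num_classes, flat.size)) * 0.01
bias = np.zeros(num_classes)

logits = weights @ flat + bias          # fully connected layer: one score per class
probs = softmax(logits)                 # class probabilities

print(flat.shape)       # (128,)
print(probs.sum())      # 1.0, up to floating-point rounding
print(probs.argmax())   # index of the predicted class
```

For a binary problem, a single output unit followed by a sigmoid plays the same role as softmax does here.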
Types of CNN
● Different CNNs are used in computer vision depending on the problem being solved.
● The five major computer vision techniques that can be addressed using CNNs:
■ Image Classification
■ Object Detection
■ Object Tracking
■ Semantic Segmentation
■ Instance Segmentation
Types of CNN
Image Classification:
● For image classification we can use traditional CNN models, and there are also many architectures designed by developers to decrease the error rate, often by increasing the number of trainable parameters.
■ LeNet (1998)
■ AlexNet (2012)
■ ZFNet (2013)
■ GoogLeNet (2014)
■ VGGNet-16 (2014)
LeNet-5 Architecture
• Designed for handwritten digit recognition (MNIST dataset).
• Structure: 2 convolutional layers, 2 subsampling layers, 2 fully connected layers.
• Key Feature: Simple and efficient, early CNN model.
AlexNet Architecture
• Winner of the ImageNet competition in 2012.
• Structure: 5 convolutional layers, 3 fully connected layers.
• Features: Uses ReLU, dropout, and data augmentation.
• Impact: Revolutionized deep learning and computer vision.
VGG-16 Architecture
• Uses 16 layers (13 convolutional, 3 fully connected).
• Features: Smaller filters (3x3) with deeper networks.
• Strength: Achieves high accuracy with a simple structure.
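One way to see why small 3x3 filters in a deeper stack are attractive is to count weights. The sketch below compares two stacked 3x3 layers with a single 5x5 layer covering the same 5x5 region; the channel count of 64 is an illustrative assumption, and biases are ignored.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return num_layers * (kernel_size - 1) + 1

channels = 64  # illustrative channel count

two_3x3 = 2 * conv_weights(3, channels, channels)
one_5x5 = conv_weights(5, channels, channels)

print(stacked_receptive_field(3, 2))   # 5: two 3x3 layers see a 5x5 region
print(two_3x3, one_5x5)                # 73728 102400
```

The stacked version uses fewer weights and also inserts an extra non-linearity between the two layers.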
ResNet Architecture
• Introduces Residual Learning to combat vanishing gradients.
• Structure: Skip connections or shortcuts between layers.
• Impact: Allows very deep networks (e.g., ResNet-50, ResNet-101).
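The skip connection can be written as y = F(x) + x, where F is what the block's layers compute. The sketch below is a simplified illustration: small dense layers stand in for the convolutions of a real ResNet block, and the sizes and random weights are assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is two small linear layers.

    Dense layers stand in for the convolutions of a real ResNet block
    to keep the sketch short.
    """
    fx = relu(x @ w1) @ w2   # the residual function F(x)
    return relu(fx + x)      # the skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.random(8)            # non-negative input vector

# With all-zero weights F(x) = 0, so the block passes x through unchanged.
zeros = np.zeros((8, 8))
print(np.allclose(residual_block(x, zeros, zeros), x))   # True

# With non-zero weights the block computes a correction on top of x.
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
print(residual_block(x, w1, w2).shape)                   # (8,)
```

Because the identity path is always available, stacking many such blocks does not force the signal (or its gradient) through every weight layer, which is why very deep variants remain trainable.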
Inception (GoogLeNet) Architecture
• Introduces Inception modules: parallel convolutional filters.
• Structure: Multiple filter sizes (1x1, 3x3, 5x5) in parallel.
• Impact: Efficient and scalable for large-scale image recognition.
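The parallel-branch idea can be sketched as follows: the same input goes through 1x1, 3x3 and 5x5 convolutions with 'same' padding, and the branch outputs are concatenated along the channel axis. This is a deliberately simplified illustration; the real GoogLeNet module also has a pooling branch and 1x1 reduction layers, and the shapes and random weights here are assumptions.

```python
import numpy as np

def conv_same(x, kernels):
    """'Same'-padded, stride-1 convolution.

    x: (H, W, C_in); kernels: (k, k, C_in, C_out) with odd k.
    Returns an array of shape (H, W, C_out).
    """
    k = kernels.shape[0]
    pad = k // 2
    # Zero-pad height and width only, not the channel axis.
    padded = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, kernels.shape[3]))
    for i in range(h):
        for j in range(w):
            window = padded[i:i + k, j:j + k, :]                # (k, k, C_in)
            out[i, j] = np.tensordot(window, kernels, axes=3)   # (C_out,)
    return out

rng = np.random.default_rng(0)
x = rng.random((8, 8, 4))   # illustrative input: 8x8 with 4 channels

# Three parallel branches, each producing 2 output channels.
branches = [conv_same(x, rng.standard_normal((k, k, 4, 2))) for k in (1, 3, 5)]
module_output = np.concatenate(branches, axis=-1)

print([b.shape for b in branches])   # three arrays of shape (8, 8, 2)
print(module_output.shape)           # (8, 8, 6)
```

'Same' padding is what makes the concatenation possible, since every branch keeps the 8x8 spatial size.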
Transfer Learning
• Concept: Uses a pre-trained model on a new but related task.
• Benefits: Speeds up training, requires less data, and improves performance.
• Example: Using a pre-trained model like ResNet for a new image classification task.
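One common form of transfer learning is feature extraction: the pre-trained layers are frozen and only a new classification head is trained. The toy sketch below illustrates that workflow in plain NumPy. Everything in it is a stand-in: a fixed random matrix plays the role of the pre-trained backbone (in practice this would be a network such as ResNet), and the data and labels are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained backbone (hypothetical): fixed weights, never updated.
backbone_weights = rng.standard_normal((64, 16))

def extract_features(images):
    """Frozen feature extractor; a real one would be a pre-trained CNN."""
    return np.maximum(images @ backbone_weights, 0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(probs, labels):
    eps = 1e-12
    return -np.mean(labels * np.log(probs + eps)
                    + (1 - labels) * np.log(1 - probs + eps))

# Synthetic data for the new task: 200 flattened 8x8 images, binary labels.
images = rng.random((200, 64))
labels = (images[:, :32].sum(axis=1) > images[:, 32:].sum(axis=1)).astype(float)

# Features are computed once, because the backbone is frozen.
features = extract_features(images)
features = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)

# Only the new classification head (16 weights + 1 bias) is trained.
head_w = np.zeros(16)
head_b = 0.0
initial_loss = log_loss(sigmoid(features @ head_w + head_b), labels)

for _ in range(200):
    probs = sigmoid(features @ head_w + head_b)
    error = probs - labels
    head_w -= 0.1 * features.T @ error / len(labels)   # gradient step on the head
    head_b -= 0.1 * error.mean()

final_loss = log_loss(sigmoid(features @ head_w + head_b), labels)
print(initial_loss, final_loss)   # the loss drops as the head is trained
```

Training only 17 parameters instead of the whole network is where the speed and data savings listed above come from.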
Object Localization
• Purpose: Identifies the location of objects within an image.
• Methods: Bounding box regression, Region Proposal Networks (RPNs).
• Applications: Object detection, image segmentation.
Landmark Detection
• Definition: Detects specific key points or landmarks within an image.
• Applications: Facial recognition, medical imaging (e.g., key anatomical points).
• Methods: CNNs used to detect and regress the position of landmarks.
Applications of Computer Vision
● Computer vision, an AI technology that allows computers to understand and label images, is now used in convenience stores, driverless car testing, daily medical diagnostics, and in monitoring the health of crops and livestock.
● Use cases for computer vision include:
■ Retail and Retail Security
■ Automotive
■ Healthcare
■ Banking
■ Agriculture
Conclusion
• CNNs have revolutionized computer vision tasks.
• Architectures like LeNet, AlexNet, VGG, ResNet, and Inception paved the way for modern image processing.
• Transfer learning, object localization, and landmark detection expand the versatility of CNNs.
Thank you!

Editor's Notes

  • #2: Start the discussion with the human eye and lead into computer vision. Explain the definition of computer vision and the different fields it deals with. Then move the topic to machine learning.
  • #3: Explain why a CNN is used rather than a feed-forward NN. Example: an MNIST image is 28 x 28 x 1 (a black & white image contains only 1 channel), so the input layer needs 28 x 28 = 784 neurons, which is manageable. If the image is 1000 x 1000, the input layer needs 10⁶ neurons.
  • #6: Explain the architecture of a CNN.
  • #7: Briefly explain the image.
  • #9: Explain what stride is, using the image. Increasing the stride value causes a loss of pixels.
  • #10: Discuss the same padding concept: when a 6x6 input is padded with zeros, the output has the same 6x6 dimensions, and features are extracted without loss.
  • #11: The output of the convolution layer is passed through the activation function.
  • #26: Discuss the Amazon Go store for retail and security, Google cars for automotive, and cheque signature recognition in banks.
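
The arithmetic in note #3 can be extended from input neurons to weights, which is where the gap between a feed-forward network and a CNN becomes clear. In the sketch below the hidden-layer size (100 units) and the convolutional layer (32 filters of size 3x3) are illustrative assumptions, not values from the notes; biases are ignored.

```python
def dense_weights(num_inputs, num_hidden):
    """Weights in one fully connected layer (biases ignored)."""
    return num_inputs * num_hidden

def conv_layer_weights(kernel_size, in_channels, num_filters):
    """Weights in one convolutional layer; independent of the image size."""
    return kernel_size * kernel_size * in_channels * num_filters

hidden_units = 100   # illustrative size, not from the notes
num_filters = 32     # illustrative size, not from the notes

for side in (28, 1000):
    inputs = side * side   # one input neuron per grayscale pixel
    print(side, inputs,
          dense_weights(inputs, hidden_units),
          conv_layer_weights(3, 1, num_filters))
# 28   -> 784 inputs,     78400 dense weights,     288 conv weights
# 1000 -> 1000000 inputs, 100000000 dense weights, 288 conv weights
```

The dense layer grows with the number of pixels, while the convolutional layer reuses the same small filters at every position.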