Deep learning

Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.

Deep-learning architectures such as deep neural networks, deep belief networks, deep reinforcement learning, recurrent neural networks, convolutional neural networks and transformers have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.

Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. ANNs have various differences from biological brains. Specifically, artificial neural networks tend to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analog.

The adjective "deep" in deep learning refers to the use of multiple layers in the network. Early work showed that a linear perceptron cannot be a universal classifier, but that a network with a nonpolynomial activation function with one hidden layer of unbounded width can. Deep learning is a modern variation that is concerned with an unbounded number of layers of bounded size, which permits practical application and optimized implementation, while retaining theoretical universality under mild conditions. In deep learning the layers are also permitted to be heterogeneous and to deviate widely from biologically informed connectionist models, for the sake of efficiency, trainability and understandability.


Definition

Deep learning is a class of machine learning algorithms that uses multiple layers to progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify concepts relevant to a human, such as digits, letters or faces.
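
As a minimal sketch of "multiple layers" as function composition (Python with NumPy; the shapes and random weights are illustrative stand-ins for learned parameters), each layer is just a nonlinear transformation applied to the previous layer's output:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 64))            # a batch of 32 raw inputs

def layer(h, W, b):
    return np.maximum(0.0, h @ W + b)    # one nonlinear (ReLU) transformation

h1 = layer(x,  rng.normal(size=(64, 32)), np.zeros(32))   # lower-level features
h2 = layer(h1, rng.normal(size=(32, 16)), np.zeros(16))   # higher-level features
print(h2.shape)                          # (32, 16): a more abstract, more compact representation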

Overview

Most modern deep learning models are based on artificial neural networks, specifically convolutional neural networks (CNNs), although they can also include propositional formulas or latent variables organized layer-wise in deep generative models, such as the nodes in deep belief networks and deep Boltzmann machines.

In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation. In an image recognition application, the raw input may be a matrix of pixels; the first representational layer may abstract the pixels and encode edges; the second layer may compose and encode arrangements of edges; the third layer may encode a nose and eyes; and the fourth layer may recognize that the image contains a face. Importantly, a deep learning process can learn which features to optimally place at which level on its own. This does not eliminate the need for hand-tuning; for example, varying numbers of layers and layer sizes can provide different degrees of abstraction.
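
A hedged sketch of such a hierarchy for images, in PyTorch (the layer sizes and the assumed 28x28 single-channel input are illustrative; which features each layer actually learns is determined by training, not assigned by hand):

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),    # early layer: tends to learn edge-like filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 28x28 -> 14x14
    nn.Conv2d(8, 16, kernel_size=3, padding=1),   # middle layer: arrangements of edges
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(16 * 7 * 7, 10),                    # top layer: task-level concepts (e.g., 10 classes)
)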

The word "deep" in "deep learning" refers to the number of layers through which the data is transformed. More precisely, deep learning systems have a substantial credit assignment path (CAP) depth. The CAP is the chain of transformations from input to output. CAPs describe potentially causal connections between input and output. For a feedforward neural network, the depth of the CAPs is that of the network and is the number of hidden layers plus one (as the output layer is also parameterized). For recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited.No universally agreed-upon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves CAP depth higher than 2. CAP of depth 2 has been shown to be a universal approximator in the sense that it can emulate any function. Beyond that, more layers do not add to the function approximator ability of the network. Deep models (CAP > 2) are able to extract better features than shallow models and hence, extra layers help in learning the features effectively.

Deep learning architectures can be constructed with a greedy layer-by-layer method. Deep learning helps to disentangle such abstractions and pick out which features improve performance.
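
One common instance of the greedy layer-by-layer method is stacked-autoencoder pretraining. The sketch below, in PyTorch, trains one encoder at a time on the previous layer's (frozen) output; the layer widths, optimizer settings and random placeholder data are assumptions for illustration:

import torch
import torch.nn as nn

dims = [784, 256, 64]                 # assumed layer widths
h = torch.randn(1000, dims[0])        # placeholder unlabeled inputs

encoders = []
for d_in, d_out in zip(dims[:-1], dims[1:]):
    enc, dec = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(100):              # train this layer alone to reconstruct its input
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(torch.relu(enc(h))), h)
        loss.backward()
        opt.step()
    encoders.append(enc)
    h = torch.relu(enc(h)).detach()   # its output becomes the next layer's training data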

For supervised learning tasks, deep learning methods eliminate feature engineering by translating the data into compact intermediate representations akin to principal components, and derive layered structures that remove redundancy in representation.
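
To illustrate "compact intermediate representations akin to principal components" (NumPy; the synthetic low-rank data is an assumption), projecting onto a few principal directions keeps nearly all of the variance in far fewer dimensions:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 20))  # 20-D data with ~3 underlying factors
X -= X.mean(axis=0)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:3].T                                          # compact 3-D representation
explained = (S[:3] ** 2).sum() / (S ** 2).sum()
print(f"variance kept by 3 of 20 dims: {explained:.3f}")  # close to 1.0 here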

Deep learning algorithms can be applied to unsupervised learning tasks. This is an important benefit because unlabeled data are more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are deep belief networks.
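
A deep belief network is built by stacking restricted Boltzmann machines (RBMs), each trained without labels. Below is a hedged sketch of one such RBM trained with one-step contrastive divergence (CD-1) in NumPy; the sizes, learning rate and random binary data are illustrative:

import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1
W = 0.01 * rng.normal(size=(n_visible, n_hidden))
a = np.zeros(n_visible)                                  # visible biases
b = np.zeros(n_hidden)                                   # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

V = (rng.random((100, n_visible)) < 0.5).astype(float)   # unlabeled binary data
for _ in range(200):
    ph = sigmoid(V @ W + b)                              # positive phase: hidden given data
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sigmoid(h @ W.T + a)                            # negative phase: one Gibbs step
    ph2 = sigmoid(pv @ W + b)
    W += lr * (V.T @ ph - pv.T @ ph2) / len(V)           # CD-1: data stats minus model stats
    a += lr * (V - pv).mean(axis=0)
    b += lr * (ph - ph2).mean(axis=0)

A full DBN would then treat sigmoid(V @ W + b) as the visible data for the next RBM in the stack.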


History

Some sources point out that Frank Rosenblatt developed and explored all of the basic ingredients of the deep learning systems of today. He described them in his book "Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms", published by Cornell Aeronautical Laboratory, Inc., Cornell University in 1962.

The first general, working learning algorithm for supervised, deep, feedforward, multilayer perceptrons was published by Alexey Ivakhnenko and Lapa in 1967. A 1971 paper described a deep network with eight layers trained by the group method of data handling. Other working deep learning architectures, specifically those built for computer vision, began with the Neocognitron introduced by Kunihiko Fukushima in 1980.

The term Deep Learning was introduced to the machine learning community by Rina Dechter in 1986, and to artificial neural networks by Igor Aizenberg and colleagues in 2000, in the context of Boolean threshold neurons.

In 1989, Yann LeCun et al. applied the standard backpropagation algorithm, which had been around as the reverse mode of automatic differentiation since 1970, to a deep neural network with the purpose of recognizing handwritten ZIP codes on mail. While the algorithm worked, training required 3 days.
