Building Deep Learning Solutions in the Real World: Debugging and Interpretability
The implementation of large-scale deep learning solutions in the real world is a road full of challenges. Many of those challenges stem from the fact that the tools and techniques we use during the lifecycle of a typical software application don't apply in the deep learning space. As a result, data scientists and engineers are constantly trying to re-imagine solutions to problems that have been solved in traditional software development for decades. One of those areas that we often ignore, and that can become a nightmare for data science teams, is debugging.
What makes debugging so challenging in deep learning applications? The answer can be summarized in two main factors: unpredictability and the friction between interpretability and accuracy.
Unpredictability
The unpredictable nature of deep learning programs is a result of their dynamic behavior. While most software programs exhibit regular runtime patterns based on static code created by a programmer, the runtime behavior of deep learning applications changes all the time. Take the example of a deep neural network that has been regularly achieving a 3.5% error rate. After retraining the model with a new dataset, the neural network exhibits an improved 3% error rate. For a data scientist, it is almost impossible to determine whether the new behavior is optimal and what caused the improvement.
The Interpretability vs. Accuracy Friction
The interpretability/accuracy friction is one of those unfortunate dynamics that rules the current generation of deep learning technologies. Do you care about obtaining the best results, or do you care about understanding how those results were produced? That's a question data scientists need to answer in every deep learning scenario. Many deep learning techniques are complex in nature and, although they can be very accurate in many scenarios, they can become incredibly difficult to interpret.
In order to understand the behavior of deep learning programs, data scientists need to focus on two main tasks: improving interpretability and getting really good at debugging 😉.
Practical Tips for Improving Interpretability
Interpretability is one of those elements of deep learning applications that is both broadly defined and difficult to quantify. However, there are some very practical methods that we can apply to deep learning programs to improve their interpretability. In a recent paper, researchers from Google proposed several fundamental elements that can improve the interpretability of deep learning models:
· Understanding what Hidden Layers Do: The bulk of the knowledge in a deep learning model is formed in the hidden layers. Understanding the functionality of the different hidden layers at a macro level is essential to be able to interpret a deep learning model.
· Understanding How Nodes are Activated: The key to interpretability is not to understand the functionality of individual neurons in a network, but rather groups of interconnected neurons that fire together in the same spatial location. Segmenting a network by groups of interconnected neurons provides a simpler level of abstraction to understand its functionality (see the sketch after this list).
· Understanding How Concepts are Formed: Understanding how a deep neural network forms individual concepts that can then be assembled into the final output is another key building block of interpretability.
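To make the second point concrete, here is a minimal sketch of how you might observe which groups of units fire together in each hidden layer. The article names no framework, so PyTorch and the toy model below are assumptions for illustration only:

```python
import torch
import torch.nn as nn

# A small illustrative model; any nn.Module works the same way.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

activations = {}  # layer name -> captured output tensor

def make_hook(name):
    # Forward hooks let us observe what each hidden layer produces
    # without modifying the model itself.
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

x = torch.randn(32, 784)  # a dummy input batch
_ = model(x)

for name, act in activations.items():
    # The fraction of units firing together gives a coarse, group-level
    # view of what each hidden layer is doing for a given input.
    frac = (act > 0).float().mean().item()
    print(f"layer {name}: {frac:.2%} of units active")
```

Running this on batches of related inputs (say, all images of one class) is a cheap way to see which groups of neurons consistently activate together.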
Practical Tips for Deep Learning Debugging
The complex structure of deep neural networks and the lack of sophisticated tools make the debugging of deep learning applications nothing short of a nightmare. However, there are a few practical tips that can help you be more efficient when debugging deep learning programs:
1 — Visualize the Network and its Results
A pretty obvious point: when building a deep learning application, it is imperative to leverage tools that can help visualize the model's connected graph and its results for given inputs. This gives developers a visually intuitive way to reason through the model and understand the behavior of its algorithms.
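For example, TensorBoard can render both the computation graph and the model's outputs. A minimal sketch, assuming PyTorch and an illustrative log directory:

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter  # requires tensorboard installed

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
writer = SummaryWriter(log_dir="runs/debugging-demo")  # illustrative path

sample = torch.randn(1, 784)
writer.add_graph(model, sample)  # logs the connected graph for inspection

# Log the model's outputs for a given input so results can be inspected
# alongside the graph in the TensorBoard UI (`tensorboard --logdir runs`).
writer.add_histogram("output_logits", model(sample).detach(), global_step=0)
writer.close()
```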
2 — Analyze Training and Test Errors
The training and test errors in a deep learning model can offer helpful clues about potential problems before they occur. For instance, if a model is overfitting (the test error is high but the training error remains low), then it is likely that there are errors in the algorithm. However, if the training error is high, then the model is underfitting and we are likely to find an error in the training procedure.
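The decision logic boils down to a simple comparison. Here is a sketch in plain Python; the thresholds are assumptions you would tune for your own task:

```python
def diagnose(train_error: float, test_error: float,
             target_error: float = 0.05, gap_tolerance: float = 0.05) -> str:
    """Coarse heuristic mapping error rates to likely failure modes.

    The threshold values here are illustrative, not prescriptive.
    """
    if train_error > target_error:
        # The model can't even fit the training data: suspect the
        # training procedure (learning rate, loss, data pipeline).
        return "underfitting: check the training procedure"
    if test_error - train_error > gap_tolerance:
        # Low training error, high test error: classic overfitting.
        return "overfitting: inspect the algorithm/regularization"
    return "errors look consistent"

print(diagnose(train_error=0.02, test_error=0.15))  # -> overfitting
print(diagnose(train_error=0.30, test_error=0.32))  # -> underfitting
```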
3 — Test with Small Datasets
Building on the previous point: if a model is underfitting, we need to determine whether it is a code defect or a data defect. One way to do that is to train on a very small number of examples. If the model cannot fit even that tiny dataset, the problem is most likely in the code.
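In practice this is the "overfit a tiny batch" sanity check. A minimal sketch, again assuming PyTorch and a toy model:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A handful of examples the model should be able to memorize easily.
x = torch.randn(8, 20)
y = torch.randint(0, 2, (8,))

for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# If the loss hasn't collapsed toward zero on just 8 examples, the defect
# is almost certainly in the code (loss wiring, labels, optimizer), not
# in the data.
print(f"final loss on tiny dataset: {loss.item():.4f}")
```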
4 — Monitor Activations and Gradient Values
Keeping an eye on the activations of hidden units and the values of the gradients is essential to optimizing a deep learning model. The number of node activations is an important metric for understanding whether a neural network is saturated. Similarly, getting a histogram view of the gradient values is a super helpful technique for understanding the potential for future optimizations in the model.
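Those histograms are easy to collect after a backward pass. A minimal sketch, assuming PyTorch with TensorBoard logging (the model and log directory are illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

model = nn.Sequential(nn.Linear(784, 256), nn.Tanh(), nn.Linear(256, 10))
writer = SummaryWriter(log_dir="runs/grad-monitor")  # illustrative path
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = loss_fn(model(x), y)
loss.backward()

for name, param in model.named_parameters():
    if param.grad is not None:
        # Histograms of gradients reveal vanishing or exploding values;
        # histograms of weights hint at whether units are saturating.
        writer.add_histogram(f"grad/{name}", param.grad, global_step=0)
        writer.add_histogram(f"weight/{name}", param.detach(), global_step=0)
writer.close()
```

In a real training loop you would log these every N steps (using the step counter as `global_step`) and watch how the distributions evolve over time.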
Debugging and understanding deep learning programs feels unnatural to many mainstream software engineers. Improving the interpretability of deep learning architectures and setting up the right debugging processes are two of the factors that data science teams should consider implementing very early in the development lifecycle. As deep learning research evolves, the architecture of deep neural networks should become more interpretable and, consequently, easier to debug.