Einstein Summation in Geometric Deep Learning

The einsum function in NumPy and PyTorch, which implements Einstein summation notation, provides a powerful and efficient way to perform complex tensor and matrix operations. This function plays a crucial role in two major libraries for geometric deep learning: Geomstats and PyTorch Geometric.

This article explores the various applications and use cases of this essential function.

Introduction
       Overview
       Numpy einsum
       PyTorch einsum
Applications
       Matrix Operations
       Neural Network Linear Layer
References        

What you will learn: How to leverage NumPy’s einsum method to efficiently perform complex linear algebra operations, as demonstrated in PyTorch Geometric and the Geomstats Python library.



The complete article, featuring design principles, detailed implementation, in-depth analysis, and exercises, is available in Dive into Einstein Notation in Geometric Deep Learning.


Notes:

  • Libraries: Python 3.12, Numpy 2.2.0, PyTorch 2.5.1, PyTorch Geometric 2.6.1, Geomstats 2.8.0
  • Source code is available at GitHub/geometriclearning/util.ein_sum
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.


Introduction

Overview

The Einstein summation convention is widely used in differential geometry, particularly in fields such as physics, quantum mechanics, general relativity, and geometric deep learning. Two prominent libraries in geometric learning, PyTorch Geometric and Geomstats, heavily rely on Einstein summation for efficient tensor operations.

But what is Einstein summation?

The Einstein convention implies summation over a set of indexed terms in a formula [ref 1]: whenever an index appears twice in a single term, it is implicitly summed over. Here are some examples from linear algebra:

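For instance, a few common linear-algebra operations written with this convention:

y_i = A_{ij} x_j          (matrix–vector product)
C_{ik} = A_{ij} B_{jk}    (matrix multiplication)
u · v = u_i v_i           (dot product)
tr(A) = A_{ii}            (trace)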

📌 The Einstein summation convention is also extensively used in differential calculus. The following formulas represent the divergence, gradient, and Laplacian of a function or vector field [ref 2].

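Writing ∂_i for the partial derivative with respect to the i-th coordinate, these operators take the compact form:

(grad f)_i = ∂_i f        (gradient)
div v = ∂_i v^i           (divergence)
Δf = ∂_i ∂^i f            (Laplacian)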

Numpy einsum

The numpy.einsum function provides a flexible and efficient way to perform tensor operations using Einstein summation notation [ref 3]. It simplifies operations such as dot products, matrix multiplications, transpositions, outer products, and tensor contractions.

The einsum method implements the Einstein summation index convention with three operational modes, illustrated in the sketch that follows Table 1.

  • Implicit Subscripts: This is the default and simplest form. The output indices, and therefore the sequence of operations, are inferred from the input subscripts. 
  • Explicit Subscripts: Some operations require more specific instructions for processing indices. In this case, the output format is specified explicitly after the -> separator. 
  • Broadcasting: This convenient mode is supported through the ellipsis (...) notation. 

Table 1. einsum notation and corresponding NumPy functions for key matrix operations
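A minimal sketch of the three modes and of the correspondences summarized in Table 1, using illustrative arrays:

import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.5, 1.0], [1.5, 2.0]])
u = np.array([1.0, 2.0, 3.0])
v = np.array([0.5, 1.0, 1.5])

# Implicit subscripts: the output indices are inferred from the inputs
np.einsum('ij,jk', a, b)                       # matrix product, same as a @ b

# Explicit subscripts: the output layout is spelled out after '->'
np.einsum('ij,jk->ik', a, b)                   # np.matmul(a, b)
np.einsum('i,i->', u, v)                       # np.dot(u, v)
np.einsum('ii->', a)                           # np.trace(a)
np.einsum('ij->ji', a)                         # a.T
np.einsum('i,j->ij', u, v)                     # np.outer(u, v)

# Broadcasting: the ellipsis stands for any leading (batch) dimensions
batch = np.random.rand(8, 2, 2)
np.einsum('...ij,...jk->...ik', batch, batch)  # batched matrix product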


PyTorch einsum

The torch.einsum method in PyTorch provides a flexible and efficient way to perform tensor operations based on Einstein summation notation [ref 4]. It allows for concise, readable, and optimized implementations of dot products, matrix multiplications, outer products, transpositions, and more complex tensor contractions.

As expected, the PyTorch definition follows the same protocol and conventions as its NumPy counterpart.

Table 2. einsum notation and corresponding PyTorch functions for tensor operations
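A matching sketch with torch.einsum, again using illustrative tensors:

import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[0.5, 1.0], [1.5, 2.0]])
u = torch.tensor([1.0, 2.0, 3.0])
v = torch.tensor([0.5, 1.0, 1.5])

torch.einsum('ij,jk->ik', a, b)     # torch.matmul(a, b)
torch.einsum('i,i->', u, v)         # torch.dot(u, v)
torch.einsum('ii->', a)             # torch.trace(a)
torch.einsum('ij->ji', a)           # torch.transpose(a, 0, 1)
torch.einsum('i,j->ij', u, v)       # torch.outer(u, v)

# Batched matrix product, equivalent to torch.bmm
x = torch.rand(8, 2, 2)
torch.einsum('bij,bjk->bik', x, x)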



Applications

Matrix Operations

The NumPy einsum method offers an elegant alternative to the commonly used @ operator or dedicated NumPy methods.

Matrix multiplication


einsum can be used as an alternative to the NumPy @ operator and the matmul method.

The sequence of operations is:

  1. Multiply a and b element-wise, as dictated by the input subscripts, to create a new array of products.
  2. Sum this array along the axes whose indices do not appear in the output subscripts.
  3. Transpose the axes of the resulting array, if the output indices are listed in a different order.

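A minimal NumPy sketch of this sequence, with input matrices chosen as illustrative values consistent with the output shown:

import numpy as np

a = np.array([[0.5, 1.0], [1.5, 2.0]])    # illustrative input matrices
b = np.array([[1.2, 0.5], [0.1, 2.0]])

# 'ij,jk->ik': multiply along the shared index j, then sum it out
product = np.einsum('ij,jk->ik', a, b)
assert np.allclose(product, a @ b)         # same result as the @ operator
print(product)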

Output

[[0.7  2.25]
 [2.   4.75]]        

Let’s look at the PyTorch implementation of einsum for matrix multiplication:

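A comparable sketch with torch.einsum on illustrative 3x3 tensors (the printed values depend on the chosen inputs):

import torch

x = torch.tensor([[ 1.0,  0.5, -1.0],
                  [ 0.5, -0.3,  0.4],
                  [ 0.4,  0.2, -0.3]])     # illustrative values
y = torch.tensor([[ 0.2,  1.0,  0.0],
                  [-0.5,  0.3,  0.7],
                  [ 0.1, -0.2,  0.4]])

product = torch.einsum('ij,jk->ik', x, y)  # equivalent to torch.matmul(x, y)
assert torch.allclose(product, x @ y)
print(product)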
tensor([[ 1.1400, -0.4900, -0.2500],
        [ 0.1300, -0.3000,  0.3700],
        [-0.3600,  0.5700, -0.5100]])        

Dot product


Let’s look at the computation of the dot product of two vectors using the einsum method in NumPy and PyTorch.

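A short sketch in both libraries, with illustrative vectors chosen so that the dot product equals the value shown below:

import numpy as np
import torch

u = np.array([1.0, 0.5, 2.0])              # illustrative values
v = np.array([0.5, 2.0, 1.0])
print(np.einsum('i,i->', u, v))            # same as np.dot(u, v)

u_t = torch.tensor([1.0, 0.5, 2.0])
v_t = torch.tensor([0.5, 2.0, 1.0])
print(torch.einsum('i,i->', u_t, v_t))     # same as torch.dot(u_t, v_t)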
3.5        

Matrix outer product


The NumPy outer method is specifically designed for vectors. When applied to matrices, it flattens its inputs into a single two-dimensional result, which must be reshaped to compare properly with the output of the einsum function.

This demonstrates another advantage of using einsum, as it avoids the need for reshaping and provides a more intuitive representation of tensor operations.

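A NumPy sketch covering both the vector and the matrix case, with illustrative arrays consistent with the output shown:

import numpy as np

u = np.array([1.0, 0.5, 2.0, 0.7, 0.9])        # illustrative values
v = np.array([0.1, 2.0, 1.2])
outer_vec = np.einsum('i,j->ij', u, v)          # same as np.outer(u, v)
print(f'Outer Product Vectors\n{outer_vec}')

m1 = np.array([[1.0, 0.5, 0.4], [0.5, -0.3, 0.2], [-1.0, 0.4, -0.3]])
m2 = np.array([[0.1, 2.0], [1.2, 0.5]])
outer_mat = np.einsum('ij,kl->ijkl', m1, m2)    # rank-4 result, no reshaping needed
# np.outer flattens its inputs, so a reshape is required to match einsum
assert np.allclose(outer_mat, np.outer(m1, m2).reshape(m1.shape + m2.shape))
print(f'\nOuter Product Matrices\n{outer_mat}')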
Outer Product Vectors
[[0.1  2.   1.2 ]
 [0.05 1.   0.6 ]
 [0.2  4.   2.4 ]
 [0.07 1.4  0.84]
 [0.09 1.8  1.08]]

Outer Product Matrices
[[[[ 0.1   2.  ]
   [ 1.2   0.5 ]]
  [[ 0.05  1.  ]
   [ 0.6   0.25]]
  [[ 0.04  0.8 ]
   [ 0.48  0.2 ]]]
 [[[ 0.05  1.  ]
   [ 0.6   0.25]]
  [[-0.03 -0.6 ]
   [-0.36 -0.15]]
  [[ 0.02  0.4 ]
   [ 0.24  0.1 ]]]
 [[[-0.1  -2.  ]
   [-1.2  -0.5 ]]
  [[ 0.04  0.8 ]
   [ 0.48  0.2 ]]
  [[-0.03 -0.6 ]
   [-0.36 -0.15]]]]        

Matrix Transpose


The simplest way to transpose a matrix in both NumPy and PyTorch is by using the .T attribute or calling the numpy transpose and torch.transpose methods.

This operation can also be effortlessly performed using einsum, simply by reversing the order of indices.

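A minimal sketch, with an illustrative matrix consistent with the transposed result shown below:

import numpy as np

a = np.array([[ 1.0,  0.5,  0.4],
              [ 0.5, -0.3,  0.2],
              [-1.0,  0.4, -0.3]])       # illustrative values
transposed = np.einsum('ij->ji', a)       # reversing the indices transposes the matrix
assert np.allclose(transposed, a.T)
print(transposed)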
[[ 1.   0.5 -1. ]
 [ 0.5 -0.3  0.4]
 [ 0.4  0.2 -0.3]]        

Matrix Trace


The trace of a square matrix is the sum of its main diagonal elements and can also be interpreted as the sum of its eigenvalues. It is a linear mapping on square matrices and remains invariant under transposition, meaning the trace of a matrix equals the trace of its transpose. In NumPy, the trace is easily computed using the trace() method.

The einsum implementation is self-explanatory.

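A minimal sketch with illustrative values: the repeated index ii selects the diagonal, and the empty output sums it.

import numpy as np

a = np.array([[1.5, 0.3, -0.2],
              [0.4, 1.2,  0.7],
              [0.1, 0.6,  1.3]])          # illustrative values
trace = np.einsum('ii->', a)               # sum of diagonal elements, same as np.trace(a)
assert np.isclose(trace, np.trace(a))
print(trace)                               # 4.0 for these values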
3.99999        


Neural Network Linear Layer

I thought it would be interesting to look at a few applications that leverage the einsum function.

The linear transformation of a layer of a multi-layer perceptron [ref 5] is defined as y = W·x + b

where

  • W is the weight matrix,
  • x is the input vector,
  • b is the bias vector,
  • y is the output vector.

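A NumPy sketch of this transformation, with illustrative weights, input, and bias (the printed values depend on the chosen inputs):

import numpy as np

weights = np.array([[0.5, 1.0, 0.2],
                    [0.3, 0.8, 0.6],
                    [0.1, 0.4, 0.9]])     # W, illustrative values
x = np.array([1.0, 0.5, 2.0])              # input vector
bias = np.array([0.1, 0.2, 0.3])           # b

# y_i = W_ij x_j + b_i
y = np.einsum('ij,j->i', weights, x) + bias
assert np.allclose(y, weights @ x + bias)
print(y)

For a batch of inputs, the same pattern extends naturally, for example np.einsum('ij,bj->bi', weights, batch) + bias.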
[2.3  2.46 1.29]        



Thank you for reading this article! I hope you found this overview insightful. For a detailed exploration of the topic, check out Dive into Einstein Notation in Geometric Deep Learning



References

  1. Einstein Summation Notation Course
  2. Einstein Summation Notation, A. Sengupta
  3. NumPy einsum, numpy.org
  4. Torch einsum, PyTorch.org
  5. Building Multilayer Perceptron Models in PyTorch, A. Tam, Machine Learning Mastery



Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design, and end-to-end deployment and support, with extensive knowledge in machine learning. He has been director of data engineering at Aideo Technologies since 2017 and is the author of "Scala for Machine Learning" (Packt Publishing, ISBN 978-1-78712-238-3) and of the Hands-on Geometric Deep Learning newsletter.


#EinsteinSummation #einsum #Numpy #PyTorch #PyG

