BxD Primer Series: Liquid State Machine (LSM) Neural Networks

Hey there 👋

Welcome to the BxD Primer Series, where we cover topics such as machine learning models, neural nets, GPT, ensemble models, and hyper-automation in a ‘one-post-one-topic’ format. Today’s post is on Liquid State Machine (LSM) Neural Networks. Let’s get started:

The What:

A Liquid State Machine (LSM) is a type of recurrent neural network used for processing sequential data.

In an LSM, the input is fed into a large pool of neurons called the ‘liquid reservoir’. The liquid is randomly connected and has rich internal dynamics. The neurons are constantly firing and interacting with each other in a nonlinear way. This results in a high-dimensional representation of the input signal, which is then fed into a readout layer for classification or regression tasks.

The readout layer receives the output of the liquid reservoir and performs the classification or regression task. It is trained to map the reservoir’s output to the desired output of the network.

LSMs are able to learn quickly and adapt to changing environments. They perform computation in a distributed and parallel manner, which makes them more efficient than traditional sequential processing models.

Additionally, LSMs require relatively little training data compared to other types of neural networks. They have been successfully applied to speech recognition, gesture recognition, time-series prediction, music analysis, and language modeling tasks.

Applications of LSMs:

  1. Speech Recognition, where the input is an audio signal and the output is the recognized text.
  2. Time-Series Prediction, where the input is a time-varying signal and the output is a prediction of future values of the signal.
  3. Robot Control, where the input is the state of the robot and the output is the action to be taken by the robot.
  4. Gesture Recognition, where the input is a sequence of images or sensor readings and the output is the recognized gesture.
  5. Image Recognition, where the input is an image and the output is the recognized object or scene.
  6. Anomaly Detection, where the input is a time-varying signal and the output is an indication of whether the signal contains anomalous behavior.
  7. Language Modeling, where the input is a sequence of words and the output is a prediction of the next word in the sequence.

An Analogy:

Say you are randomly throwing stones into water. Depending on what kind of stones you have thrown into the water, a wave pattern forms that changes with each timestamp.

From this wave pattern you can draw conclusions about the features of the different stones, and tell what kind of stones you threw in.

Similarly, an LSM’s liquid reservoir has a pool of neurons that fire differently for different input sequences. From this pattern of firing neurons, the readout layer is able to make a prediction or classification regarding the input.

Note: Training of the readout layer is separate from the training of liquid reservoir.

  • The liquid reservoir is typically pre-trained using an unsupervised learning algorithm, which involves randomly initializing the reservoir and allowing it to settle into a stable state in response to a set of input signals.
  • The readout layer is trained to predict the corresponding output for each input signal, using the internal state of the liquid reservoir as its input. This is done with a supervised technique: minimizing the mean squared error between the predicted and desired outputs.

Anatomy of a Liquid State Machine:

[Figure: anatomy of a Liquid State Machine — an input signal drives a randomly connected liquid reservoir of spiking neurons, and a readout layer maps the reservoir state to the output]

Echo State Property of LSM:

The liquid reservoir of an LSM can be thought of as a ‘memory’ that retains information about past input signals. When a new input signal is presented, the reservoir’s response is influenced by its previous states as well as the current input, producing an output that reflects the recent input history. It is as if the LSM is echoing the past inputs.

The echo state property is important for time-series prediction tasks because it enables the LSM to capture complex temporal dependencies in the input. By "remembering" past states of the input, the LSM is able to predict future values with high accuracy.

Time Constant in LSM:

The time constant is typically set by the user and controls the rate at which the internal state of the reservoir forgets past inputs and adapts to new ones.

  • A longer time constant results in slower adaptation of the internal state
  • A shorter time constant results in faster adaptation

The choice of time constant depends on the characteristics of the input signal and the task at hand.

  • In tasks where the input signal changes slowly over time, a longer time constant is more appropriate, allowing the reservoir to capture the temporal dynamics of the signal.
  • In tasks where the input signal changes rapidly, a shorter time constant is more appropriate, allowing the reservoir to respond quickly to changes in the signal.

The time constant also affects the performance of the LSM.

  • A longer time constant improves the stability of the network and prevents it from overfitting to noisy or irrelevant features of the input signal; vice versa for a shorter time constant.
  • A longer time constant leads to slower convergence during training and slower response times during inference; vice versa for a shorter time constant.
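One common way to realize such a time constant is as a leak rate in a discrete-time reservoir update. The leaky-integrator formulation below is an illustrative assumption (one of several possible choices), with W_in and W standing for assumed input and recurrent weight matrices:

import numpy as np

def leaky_reservoir_step(r_prev, x, W_in, W, tau, dt=1.0):
    """One discrete-time update of a leaky reservoir state.

    alpha = dt / tau plays the role of the time constant:
    large tau  -> small alpha -> state changes slowly (long memory)
    small tau  -> large alpha -> state adapts quickly (short memory)
    """
    alpha = dt / tau
    pre_activation = W_in @ x + W @ r_prev
    return (1 - alpha) * r_prev + alpha * np.tanh(pre_activation)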

Use of Spiking Neurons:

Spiking neurons are commonly used in the liquid reservoir of an LSM. They mimic the behavior of biological neurons, which communicate through the generation and propagation of impulses called spikes.

Consider the basic spiking neuron model, the Leaky Integrate-and-Fire (LIF) model, which is defined by the following equations:

✪ Membrane potential dynamics:

τ_m · dV/dt = −(V − V_rest) + R · I(t)

Where,

  • V is the membrane potential
  • τ_m is the membrane time constant
  • V_rest is the resting potential
  • R is the membrane resistance
  • I(t) is the input current at time t.

✪ Spike generation: If the membrane potential V reaches a threshold value V_th, a spike is generated, and the membrane potential is reset to a reset value V_reset.

✪ Refractory period: After a spike is generated, the neuron enters a refractory period of duration t_ref during which it is unable to generate additional spikes.
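For intuition, here is a minimal NumPy sketch of a single LIF neuron driven by an input current, using the equations above (the parameter values are illustrative assumptions, not values prescribed by the model):

import numpy as np

def simulate_lif(I, dt=1.0, tau_m=20.0, V_rest=0.0, R=1.0,
                 V_th=1.0, V_reset=0.0, t_ref=2.0):
    """Simulate a leaky integrate-and-fire neuron for an input current sequence I."""
    V = V_rest
    refractory_left = 0.0
    spikes = []
    for I_t in I:
        if refractory_left > 0:            # inside refractory period: no integration
            refractory_left -= dt
            spikes.append(0)
            continue
        # Euler step of: tau_m * dV/dt = -(V - V_rest) + R * I(t)
        V += dt / tau_m * (-(V - V_rest) + R * I_t)
        if V >= V_th:                      # threshold crossed: emit a spike and reset
            spikes.append(1)
            V = V_reset
            refractory_left = t_ref
        else:
            spikes.append(0)
    return np.array(spikes)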

The liquid reservoir is formed by a large number of randomly interconnected spiking neurons.

The How:

Understanding and developing a working LSM typically involves the steps below:

Architecture: An LSM consists of three main components: input, liquid reservoir, and readout.

Input: Denote the input signal as x(t), where t represents the time index.

Liquid Reservoir is a randomly connected network of neurons. Denote the reservoir state at time t as r(t). The dynamics of the reservoir neurons can be described by the following equation (a code sketch follows the symbol definitions below):

r(t) = f(W_in · x(t) + W · r(t−1))

Where,

  • W_in is the input weight matrix of size N_r × N_in
  • N_r is the number of neurons in the reservoir
  • N_in is the dimensionality of the input signal
  • W is the recurrent weight matrix of size N_r × N_r. Elements of W are randomly initialized and remain fixed during training.
  • f(·) is the activation function applied element-wise to the reservoir's inputs
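A minimal sketch of these reservoir dynamics in NumPy (the tanh activation and the exact shapes are assumptions made for illustration):

import numpy as np

def run_reservoir(X, W_in, W, f=np.tanh):
    """Run an input sequence X of shape (T, N_in) through the reservoir.

    Returns the reservoir states R of shape (T, N_r),
    computed as r(t) = f(W_in · x(t) + W · r(t-1)).
    """
    T = X.shape[0]
    N_r = W.shape[0]
    R = np.zeros((T, N_r))
    r = np.zeros(N_r)
    for t in range(T):
        r = f(W_in @ X[t] + W @ r)
        R[t] = r
    return R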

Readout is responsible for mapping the reservoir states to the desired output. Denote the readout weights as W_out, a matrix of size N_out × N_r, where N_out is the dimensionality of the output. The readout computes its output y(t) based on the current reservoir state r(t):

y(t) = W_out · r(t)

Training: The readout is trained using supervised learning. Given a training dataset of input-output pairs x(t), y(t), where y(t) is the desired output at time t, the readout weights are learned to minimize a regularized mean squared error:

E(W_out) = Σ_t ‖ y(t) − W_out · r(t) ‖² + β ‖ W_out ‖²

β is a regularization parameter that prevents overfitting. The optimization problem is expressed as:

W_out* = argmin_{W_out} Σ_t ‖ y(t) − W_out · r(t) ‖² + β ‖ W_out ‖²
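Because this objective is a ridge regression, the readout weights have a closed-form solution. A minimal sketch, assuming reservoir states R of shape (T, N_r) and desired outputs Y of shape (T, N_out) collected over the training sequence:

import numpy as np

def train_readout(R, Y, beta=1e-2):
    """Solve min over W of  sum_t ||y(t) - W r(t)||^2 + beta ||W||^2  in closed form."""
    N_r = R.shape[1]
    A = R.T @ R + beta * np.eye(N_r)    # (N_r, N_r), symmetric
    B = Y.T @ R                         # (N_out, N_r)
    return np.linalg.solve(A, B.T).T    # W_out = B @ inv(A), shape (N_out, N_r)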

Prediction: Once the LSM is trained, it can be used for prediction. Given a new input signal x_test(t), the reservoir state r_test(t) is computed using the reservoir dynamics equation:

r_test(t) = f(W_in · x_test(t) + W · r_test(t−1))

The readout then generates predictions y_test(t) based on the current reservoir state:

y_test(t) = W_out · r_test(t)
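Putting the sketches above together, an end-to-end run might look like the following. All data, shapes, and weight scales here are hypothetical placeholders; run_reservoir and train_readout are the helper functions sketched earlier:

import numpy as np

rng = np.random.default_rng(0)
N_in, N_out, N_r = 3, 1, 200
T_train, T_test = 500, 100

# Placeholder data standing in for a real sequential dataset
X_train = rng.normal(size=(T_train, N_in))
Y_train = rng.normal(size=(T_train, N_out))
X_test = rng.normal(size=(T_test, N_in))

# Fixed random reservoir weights
W_in = rng.normal(scale=0.5, size=(N_r, N_in))
W = rng.normal(scale=1.0 / np.sqrt(N_r), size=(N_r, N_r))

R_train = run_reservoir(X_train, W_in, W)        # collect reservoir states
W_out = train_readout(R_train, Y_train, beta=1e-2)

R_test = run_reservoir(X_test, W_in, W)
Y_pred = R_test @ W_out.T                        # y_test(t) = W_out · r_test(t)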

Adaptation: The readout can be fine-tuned periodically to maintain prediction accuracy. This is useful in online learning settings.

Effect of Parameters on Performance:

✪ Spectral Radius refers to the maximum absolute eigenvalue of the recurrent weight matrix of the reservoir.

  • If the spectral radius is too small, the internal dynamics of the reservoir will be weak, and the network will not be able to generate a rich and diverse feature space.
  • If the spectral radius is too large, the internal dynamics of the reservoir may become chaotic, leading to unstable behavior and poor performance.
  • Empirically, it has been found that a spectral radius of around 1 provides good performance on a wide range of tasks.
  • The spectral radius can be adjusted by multiplying the recurrent weight matrix by a scalar value (as sketched below), or by adjusting the gain of the activation function used in the reservoir.
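A minimal NumPy sketch of rescaling a random recurrent weight matrix to a target spectral radius (the target value of 0.95 is an illustrative assumption):

import numpy as np

def scale_to_spectral_radius(W, target_radius=0.95):
    """Rescale recurrent weights so their largest absolute eigenvalue equals target_radius."""
    current_radius = np.max(np.abs(np.linalg.eigvals(W)))
    return W * (target_radius / current_radius)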

✪ Sparsity of Liquid Reservoir refers to the percentage of connections that are present between neurons in the reservoir. It is set by randomly selecting a fixed percentage of connections and setting the weights of the remaining connections to zero.

  • A sparse reservoir has fewer connections between neurons; a dense reservoir has more connections between neurons.
  • A dense reservoir leads to a more diverse feature space, improving the network's ability to process complex inputs.
  • A dense reservoir can also lead to overfitting, as the network may memorize the input data instead of learning to generalize to new inputs.
  • A sparse reservoir leads to a decrease in the network's memory capacity, as there are fewer connections to store information. Vice versa for a dense reservoir.
  • A sparsity level of around 10-20% is used in practice (see the sketch below).
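A minimal sketch of initializing a sparse random reservoir in NumPy (the 10% connectivity and Gaussian weights are illustrative assumptions; in practice this would typically be combined with the spectral-radius rescaling above):

import numpy as np

def init_sparse_reservoir(N_r, connectivity=0.10, seed=0):
    """Create an N_r x N_r recurrent weight matrix with roughly the given fraction of nonzero connections."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(N_r, N_r))
    mask = rng.random((N_r, N_r)) < connectivity   # keep ~connectivity fraction of connections
    return W * mask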

✪ Size of Input Window refers to the number of consecutive time steps of the input that are presented to the network as a single input vector.

  • When the input window size is too small, the network is not able to capture long-term dependencies in the input signal.
  • When the input window size is too large, the network becomes overwhelmed with information and struggles to learn meaningful representations of the input signal (a small windowing sketch follows).
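A minimal sketch of turning a time series into windowed input vectors (the window length of 5 is an illustrative assumption):

import numpy as np

def make_windows(x, window=5):
    """Stack `window` consecutive time steps of x (shape (T, N_in)) into single input vectors."""
    return np.stack([x[t - window + 1 : t + 1].ravel()
                     for t in range(window - 1, len(x))])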

The Why:

Reasons for using Liquid State Machine:

  1. Robustness: LSMs are robust to noise and disturbances in the input signal. This is because the internal state of the liquid reservoir acts as a filter that can smooth out noise in the input signal, making it easier for the readout layer to extract meaningful information.
  2. Nonlinearity: Recurrent connections in the liquid reservoir create complex, nonlinear dynamics that can be used to encode information.
  3. Dynamic System: An LSM network's internal state evolves over time to perform computations on the input signal. This is useful for time-dependent inputs such as speech or video frames.
  4. Reservoir Computing Paradigm: Reservoir computing has been shown to be effective for time-series prediction, speech recognition, and robotic control tasks.
  5. Low Power Consumption: LSMs can be implemented on low-power hardware due to their low computation requirements, making them suitable for IoT devices or implantable medical devices.

The Why Not:

Reasons for not using Liquid State Machine:

  1. More complex to design and train than other types of neural networks, due to the need to optimize both the structure and the dynamics of the liquid reservoir.
  2. Highly dependent on the initial conditions of the liquid reservoir, such as connectivity and weight distribution.
  3. Training time for an LSM is relatively long.
  4. The high degree of internal complexity makes it difficult to control the network's behavior.
  5. LSMs have limited state memory, meaning that they may not be effective for tasks that require long-term memory or context, such as machine translation.

Time for you to support:

  1. Reply to this article with your question
  2. Forward/Share to a friend who can benefit from this
  3. Chat on Substack with BxD (here)
  4. Engage with BxD on LinkedIn (here)

In the next edition, we will cover Extreme Learning Machine Neural Networks.

Let us know your feedback!

Until then,

Have a great time! 😊

#businessxdata #bxd #Liquid #State #Machine #neuralnetworks #primer
