Least Mean Square Algorithm Part II
Adaptive Filter: Signal Processing | Neural Networks
Abstract
The Least Mean Square (LMS) algorithm adaptively adjusts filter weights to minimize the mean squared error between the desired output and the actual output. The linear least squares filter, by contrast, estimates the filter weights that minimize the sum of squared errors over all samples. The LMS algorithm updates the weights using a stochastic gradient descent approach, in which the gradient is estimated from only the current (instantaneous) error sample. This article dives into the mathematical and theoretical concepts behind the derivation and understanding of the LMS algorithm.
1. Introduction
The least-mean-square (LMS) algorithm is configured to minimize the instantaneous value of the cost function [1]. Section 2 discusses the Wiener solution, Section 3 outlines the algorithm, and Section 4 presents the detailed mathematical derivation of LMS used in both signal processing and learning models.
2. Wiener Filter
The Wiener solution, also known as the Wiener filter, represents the optimal linear filter that minimizes the mean square error (MSE).
**Mathematical Formulation (Discrete-Time Case):**
Let:
d(n) be the desired signal.
x(n) be the input signal.
w be the filter weight vector (for a finite impulse response (FIR) filter).
y(n) = wᵀ x(n) be the output of the filter.
e(n) = d(n) - y(n) be the error signal.
The goal is to find the weight vector w that minimizes E[e²(n)],
where E[⋅] denotes the expectation operator.
The Wiener solution for the optimal weight vector w_opt is
given by the Wiener-Hopf equation:
R_x w_opt = r_dx
where:
R_x = E[x(n)xᵀ(n)] is the autocorrelation matrix of the input signal x(n).
r_dx = E[d(n)x(n)] is the cross-correlation vector between the desired
signal d(n) and the input signal x(n).
If R_x is invertible, the Wiener solution for the optimal weight vector is:
w_opt = R_x⁻¹ r_dx
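To make the Wiener solution concrete, here is a minimal sketch in Python/NumPy that estimates R_x and r_dx from samples and solves the Wiener-Hopf equation. The synthetic noisy-sinusoid setup, the 4-tap filter length, and the variable names are illustrative assumptions, not taken from this article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumption): the desired signal is a sinusoid and the
# input is that sinusoid corrupted by white noise.
n = np.arange(2000)
d = np.sin(0.05 * np.pi * n)               # desired signal d(n)
x = d + 0.5 * rng.standard_normal(n.size)  # input signal x(n)

M = 4  # number of FIR taps (illustrative choice)

# Build tap-input vectors x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T
X = np.stack([np.roll(x, k) for k in range(M)], axis=1)[M - 1:]
D = d[M - 1:]

# Sample estimates of the autocorrelation matrix and cross-correlation vector
R_x = (X.T @ X) / X.shape[0]    # R_x  ≈ E[x(n) x^T(n)]
r_dx = (X.T @ D) / X.shape[0]   # r_dx ≈ E[d(n) x(n)]

# Wiener solution: w_opt = R_x^{-1} r_dx
w_opt = np.linalg.solve(R_x, r_dx)
print("Wiener weights:", w_opt)
```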
2.1 Intuition behind the Wiener Filter
Okay, imagine you’re trying to listen to your friend talking in a noisy room (like a busy cafe here in Itahari). Your friend’s voice is the desired signal, and all the other chatter, music, and clanking cups are the noise in the input signal.
Your brain is trying to act like a filter to isolate your friend’s voice and suppress the noise. The Wiener filter is like the ideal version of this brain filter, but for signals.
3. Algorithm
Input signal vector: x(n)
Desired output: d(n)
Learning rate: η

Initialization:
Set weights: w(0) = 0

For each time step n = 1, 2, …:
Compute output: y(n) = wᵀ(n) x(n)
Compute error: e(n) = d(n) − y(n)
Update weights: w(n+1) = w(n) + η ⋅ x(n) ⋅ e(n)
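A minimal sketch of this loop in Python/NumPy is shown below. The function name lms, the synthetic system-identification setup, and the choice η = 0.01 are illustrative assumptions rather than part of the article:

```python
import numpy as np

def lms(x, d, M, eta):
    """Run the LMS update on input x(n) and desired d(n) with M taps and learning rate eta."""
    N = len(x)
    w = np.zeros(M)              # initialization: w(0) = 0
    y = np.zeros(N)
    e = np.zeros(N)
    for n in range(M - 1, N):
        x_n = x[n - M + 1:n + 1][::-1]   # tap-input vector [x(n), ..., x(n-M+1)]
        y[n] = w @ x_n                   # compute output:  y(n) = w^T(n) x(n)
        e[n] = d[n] - y[n]               # compute error:   e(n) = d(n) - y(n)
        w = w + eta * x_n * e[n]         # update weights:  w(n+1) = w(n) + eta * x(n) * e(n)
    return w, y, e

# Example (assumed setup): identify an unknown 4-tap FIR system from noisy observations.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
h_true = np.array([0.8, -0.4, 0.2, 0.1])
d = np.convolve(x, h_true, mode="full")[:len(x)] + 0.01 * rng.standard_normal(len(x))
w_hat, _, _ = lms(x, d, M=4, eta=0.01)
print("Estimated weights:", w_hat)
```

With a well-conditioned input and a small η, the estimated weights should approach the true FIR coefficients, mirroring the Wiener solution of Section 2.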
4. Handwritten Derivations
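The derivation underlying the update rule above follows the standard instantaneous-gradient argument. As a compact reference alongside the handwritten notes, a sketch of those steps is given below (standard textbook derivation; notation may differ slightly from the handwritten pages):

```latex
% Instantaneous cost: LMS replaces the MSE E[e^2(n)] with the sample value e^2(n)
J(n) = \tfrac{1}{2} e^{2}(n), \qquad e(n) = d(n) - \mathbf{w}^{T}(n)\,\mathbf{x}(n)

% Gradient of the instantaneous cost with respect to the weight vector
\nabla_{\mathbf{w}} J(n) = -\, e(n)\,\mathbf{x}(n)

% Steepest-descent step with learning rate \eta gives the LMS update
\mathbf{w}(n+1) = \mathbf{w}(n) - \eta\, \nabla_{\mathbf{w}} J(n)
                = \mathbf{w}(n) + \eta\, e(n)\,\mathbf{x}(n)
```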
5. Conclusion
With a small learning rate and under valid convergence assumptions, the LMS algorithm is simple to implement and effective. However, its convergence is slow; a larger η can increase the convergence speed but may lead to instability or even divergence of the algorithm. Adaptive control systems, biomedical signal processing, and echo cancellation are some applications of LMS. The Adaptive Linear Element (ADALINE) and Multiple ADALINE (MADALINE) were among the earliest neural networks to use LMS for weight updates.
References:
Appendix:
Appendix A: Code Implementation. The full code for the Least Mean Square filter implementation used in this study is available in the linked Google Colab notebook.