SIMD and MIMD
SIMD (Single Instruction, Multiple Data)
SIMD is a type of parallel processing in which a single instruction operates on multiple data points simultaneously. Imagine an assembly line where one station performs the same task on several items at once.
How it works:
Example: Consider adding two arrays of numbers:
array1 = [1, 2, 3, 4]
array2 = [5, 6, 7, 8]
result = [0, 0, 0, 0]
A traditional processor would perform the addition element by element:
result[0] = array1[0] + array2[0]
result[1] = array1[1] + array2[1]
result[2] = array1[2] + array2[2]
result[3] = array1[3] + array2[3]
A SIMD processor could load multiple elements from array1 and array2 into its vector registers and perform the addition in a single instruction, potentially processing all four additions at once.
What is AVX?
AVX (Advanced Vector Extensions) is an x86 SIMD (Single Instruction, Multiple Data) instruction set, introduced by Intel and later adopted by AMD, that accelerates data-parallel computations. It allows a single CPU instruction to process multiple data elements simultaneously by using wide vector registers (256-bit in AVX/AVX2, 512-bit in AVX-512). AVX is commonly used for data-parallel workloads such as image and signal processing, scientific simulation, and linear algebra.
Example of AVX code (shown for understanding only; I have not executed it anywhere):
The code below adds two float arrays eight elements at a time, which can approach an 8x speedup over scalar code when AVX is supported and memory access is not the bottleneck.
#include <immintrin.h> // AVX intrinsics header

void add_arrays(float* a, float* b, float* c, int N) {
    int i = 0;
    for (; i + 8 <= N; i += 8) {                  // Process 8 floats at once (256-bit AVX)
        __m256 vecA = _mm256_loadu_ps(&a[i]);     // Load 8 floats from a
        __m256 vecB = _mm256_loadu_ps(&b[i]);     // Load 8 floats from b
        __m256 vecC = _mm256_add_ps(vecA, vecB);  // Add them in parallel
        _mm256_storeu_ps(&c[i], vecC);            // Store the result
    }
    for (; i < N; ++i) {                          // Scalar tail for leftovers when N is not a multiple of 8
        c[i] = a[i] + b[i];
    }
}
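As a quick usage sketch (my own illustration, not from the article), a caller might look like the following; compile with AVX enabled, for example g++ -mavx:

#include <cstdio>
#include <vector>

int main() {
    const int N = 10;  // deliberately not a multiple of 8, to exercise the scalar tail
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);
    add_arrays(a.data(), b.data(), c.data(), N);
    std::printf("c[0] = %f, c[9] = %f\n", c[0], c[9]); // both should print 3.000000
    return 0;
}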
The par_unseq policy allows the compiler/runtime to use both multi-threading and SIMD (e.g., AVX). However, vectorization is only permitted, not guaranteed: the element access functions must be free of data races and must not use blocking synchronization such as locks.
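As a sketch of what this can look like (my own example, assuming a C++17 compiler and standard library with parallel algorithm support), the same array addition can be written without intrinsics:

#include <algorithm>
#include <execution>
#include <vector>

void add_arrays_std(const std::vector<float>& a,
                    const std::vector<float>& b,
                    std::vector<float>& c) {
    // par_unseq lets the implementation split the work across threads (MIMD)
    // and vectorize each chunk with SIMD instructions such as AVX.
    std::transform(std::execution::par_unseq,
                   a.begin(), a.end(), b.begin(), c.begin(),
                   [](float x, float y) { return x + y; });
}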
MIMD (Multiple Instruction, Multiple Data)
MIMD is a more flexible form of parallel processing where multiple processors can execute different instructions on different data simultaneously. Think of it as having multiple independent assembly lines, each working on different tasks.
How it works:
Example: Consider rendering a complex 3D scene: one core might transform geometry, another might compute lighting, and another might apply textures. Each processor executes different instructions on different parts of the scene data concurrently.
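A minimal sketch of MIMD-style execution on a multi-core CPU, using std::thread (my own illustration with hypothetical task functions, not code from the article):

#include <thread>
#include <vector>

// Hypothetical, independent stages of a renderer (placeholders for illustration).
void transform_geometry(std::vector<float>& vertices) { /* ... */ }
void compute_lighting(std::vector<float>& lights)     { /* ... */ }
void apply_textures(std::vector<float>& texels)       { /* ... */ }

int main() {
    std::vector<float> vertices(1024), lights(256), texels(4096);

    // Each thread runs a different instruction stream on different data: MIMD.
    std::thread t1(transform_geometry, std::ref(vertices));
    std::thread t2(compute_lighting,   std::ref(lights));
    std::thread t3(apply_textures,     std::ref(texels));

    t1.join();
    t2.join();
    t3.join();
    return 0;
}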
Other Parallel Architectures
While SIMD and MIMD are the two primary classifications in Flynn's taxonomy, other architectures exist: SISD (Single Instruction, Single Data), the traditional sequential model, and MISD (Multiple Instruction, Single Data), which is rarely used in practice.
How C++17 Utilizes SIMD and MIMD
The C++17 parallel algorithms, particularly when used with the std::execution::par_unseq policy, can leverage both SIMD and MIMD parallelism: parallel execution lets the work be spread across threads (MIMD-style), while unsequenced execution additionally permits each thread's chunk to be vectorized (SIMD-style).
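As a sketch (my own example, not from the article), the same reduction can be run under different policies; which hardware features are actually used is up to the implementation:

#include <cstdio>
#include <execution>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> data(1'000'000, 0.5);

    // Sequential baseline.
    double s1 = std::reduce(std::execution::seq, data.begin(), data.end());

    // par: the library may split the work across threads (MIMD-style).
    double s2 = std::reduce(std::execution::par, data.begin(), data.end());

    // par_unseq: threads plus vectorization within each thread (MIMD + SIMD, e.g. AVX).
    double s3 = std::reduce(std::execution::par_unseq, data.begin(), data.end());

    std::printf("%f %f %f\n", s1, s2, s3);
    return 0;
}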
Finally, C++17's parallel algorithms make it possible to exploit both SIMD and MIMD hardware from portable standard C++, without hand-writing intrinsics or managing threads manually.