Implementation of Wu Manber Algorithm?
Last Updated :
10 Jan, 2023
What is Wu- Manber Algorithm?
The Wu-Manber algorithm is a string-matching algorithm that is used to efficiently search for patterns in a body of text. It is a hybrid algorithm that combines the strengths of the Boyer-Moore and Knuth-Morris-Pratt algorithms to provide fast and accurate pattern matching.
Illustration:
Example: s = "the quick brown fox jumps over the lazy dog" pattern = "brown":
Step 1: Divide the pattern into two subpatterns let's say "br" and "own".
Step 2: Next step includes calculating hash values for each subpattern formed in step 1.
Step 3: Start iterating in s from the first character.
Step 4: If one subpattern matches the substring in s like "br" matches "brown" substring in s.
Step 5: Then will check, whether the whole pattern is matching that substring or not.
Step 6: The whole pattern is matching the substring found in string s. It will return the index of the substring indicating pattern matched.
Step 7: If let's say it doesn't match, it will search for another substring in s. If not found return "no match was found".
Steps involved in Wu-Manber Algorithm:
- Create a hash table that maps each possible substring of the pattern to the positions in the pattern where that substring appears.
- This hash table is used to quickly identify the potential starting positions of the pattern in the text.
- Iterate through the text and compare each character to the corresponding character in the pattern.
- If the characters match, you can move to the next character and continue the comparison.
- If the characters do not match, you can use the hash table to determine the maximum number of characters that can be skipped before the next potential starting position of the pattern.
- This allows the algorithm to quickly skip over large sections of the text without missing any potential matches.
Below is the code to implement the above approach:
C++14
// C++ code to implement the above approach
#include <bits/stdc++.h>
using namespace std;
// Generate a hash for each subpattern
int HashPattern(string& pattern, int i, int j)
{
int h = 0;
for (int k = i; k < j; k++) {
h = h * 256 + ((int)pattern[k] - 'a');
}
return h;
}
// Wu Manber algorithm
void WuManber(string& text, string& pattern)
{
// Define the length of the pattern and text
int m = pattern.length();
int n = text.length();
// Define the number of subpatterns to use
int s = 2;
// Define the length of each subpattern
int t = m / s;
// Initialize the hash values for each subpattern
int h[s];
for (int i = 0; i < s; i++) {
h[i] = HashPattern(pattern, i * t, (i + 1) * t);
}
// Initialize the shift value for each subpattern
int shift[s];
for (int i = 0; i < s; i++) {
shift[i] = t * (s - i - 1);
}
// Initialize the match value
bool match = false;
// Iterate through the text
for (int i = 0; i < n - m + 1; i++) {
// Check if the subpatterns match
bool subpatternsMatch = true;
int j;
for (j = 0; j < s; j++) {
if (HashPattern(text, i + j * t,
i + (j + 1) * t)
!= h[j]) {
subpatternsMatch = false;
break;
}
}
if (subpatternsMatch) {
// If the subpatterns match, check if the
// full pattern matches
if (text.substr(i, m) == pattern) {
cout << "Match found at index " << i
<< endl;
match = true;
}
}
// Shift the pattern by the appropriate amount
bool shouldShift = true;
for (j = 0; j < s; j++) {
if (i + shift[j] < n - m + 1) {
shouldShift = false;
break;
}
}
if (shouldShift) {
i += shift[j];
}
}
// If no match was found, print a message
if (!match) {
cout << "No match found \n";
}
}
int main()
{
// Code
string text = "the cat sat on the mat";
string pattern = "the";
WuManber(text, pattern);
return 0;
}
Java
// Java code to implement the above approach
import java.io.*;
import java.util.*;
class GFG {
// Generate a hash for each subpattern
static int hashPattern(String pattern, int i, int j)
{
int h = 0;
for (int k = i; k < j; k++) {
h = h * 256 + (int)pattern.charAt(k);
}
return h;
}
// Wu Manber algorithm
static void wuManber(String text, String pattern)
{
// Define the length of the pattern and text
int m = pattern.length();
int n = text.length();
// Define the number of subpatterns to use
int s = 2;
// Define the length of each subpattern
int t = m / s;
// Initialize the hash values for each subpattern
int[] h = new int[s];
for (int i = 0; i < s; i++) {
h[i] = hashPattern(pattern, i * t, (i + 1) * t);
}
// Initialize the shift value for each subpattern
int[] shift = new int[s];
for (int i = 0; i < s; i++) {
shift[i] = t * (s - i - 1);
}
// Initialize the match value
boolean match = false;
// Iterate through the text
// Iterate through the text
for (int i = 0; i < n - m + 1; i++) {
// Check if the subpatterns match
boolean subpatternsMatch = true;
int j;
for (j = 0; j < s; j++) {
if (hashPattern(text, i + j * t,
i + (j + 1) * t)
!= h[j]) {
subpatternsMatch = false;
break;
}
}
if (subpatternsMatch) {
// If the subpatterns match, check if the
// full pattern matches
if (text.substring(i, i + m).equals(
pattern)) {
System.out.println(
"Match found at index " + i);
match = true;
}
}
// Shift the pattern by the appropriate amount
boolean shouldShift = true;
for (j = 0; j < s; j++) {
if (i + shift[j] < n - m + 1) {
shouldShift = false;
break;
}
}
if (shouldShift) {
i += shift[j];
}
}
// If no match was found, print a message
if (!match) {
System.out.println("No match found");
}
}
public static void main(String[] args)
{
String text = "the cat sat on the mat";
String pattern = "the";
wuManber(text, pattern);
}
}
// This code is contributed by lokesh.
Python3
# Define the hash_pattern() function to generate
# a hash for each subpattern
def hashPattern(pattern, i, j):
h = 0
for k in range(i, j):
h = h * 256 + ord(pattern[k])
return h
# Define the Wu Manber algorithm
def wuManber(text, pattern):
# Define the length of the pattern and
# text
m = len(pattern)
n = len(text)
# Define the number of subpatterns to use
s = 2
# Define the length of each subpattern
t = m // s
# Initialize the hash values for each
# subpattern
h = [0] * s
for i in range(s):
h[i] = hashPattern(pattern, i * t, (i + 1) * t)
# Initialize the shift value for each
# subpattern
shift = [0] * s
for i in range(s):
shift[i] = t * (s - i - 1)
# Initialize the match value
match = False
# Iterate through the text
for i in range(n - m + 1):
# Check if the subpatterns match
for j in range(s):
if hashPattern(text, i + j * t, i + (j + 1) * t) != h[j]:
break
else:
# If the subpatterns match, check if
# the full pattern matches
if text[i:i + m] == pattern:
print("Match found at index", i)
match = True
# Shift the pattern by the appropriate
# amount
for j in range(s):
if i + shift[j] < n - m + 1:
break
else:
i += shift[j]
# If no match was found, print a message
if not match:
print("No match found")
# Driver Code
text = "the cat sat on the mat"
pattern = "the"
# Function call
wuManber(text, pattern)
C#
// C# code to implement the above approach
using System;
using System.Collections.Generic;
public class GFG {
// Generate a hash for each subpattern
static int HashPattern(string pattern, int i, int j)
{
int h = 0;
for (int k = i; k < j; k++) {
h = h * 256 + (int)pattern[k];
}
return h;
}
// Wu Manber algorithm
static void WuManber(string text, string pattern)
{
// Define the length of the pattern and text
int m = pattern.Length;
int n = text.Length;
// Define the number of subpatterns to use
int s = 2;
// Define the length of each subpattern
int t = m / s;
// Initialize the hash values for each subpattern
int[] h = new int[s];
for (int i = 0; i < s; i++) {
h[i] = HashPattern(pattern, i * t, (i + 1) * t);
}
// Initialize the shift value for each subpattern
int[] shift = new int[s];
for (int i = 0; i < s; i++) {
shift[i] = t * (s - i - 1);
}
// Initialize the match value
bool match = false;
// Iterate through the text
for (int i = 0; i < n - m + 1; i++) {
// Check if the subpatterns match
bool subpatternsMatch = true;
int j;
for (j = 0; j < s; j++) {
if (HashPattern(text, i + j * t,
i + (j + 1) * t)
!= h[j]) {
subpatternsMatch = false;
break;
}
}
if (subpatternsMatch) {
// If the subpatterns match, check if the
// full pattern matches
if (text.Substring(i, m).Equals(pattern)) {
Console.WriteLine(
"Match found at index " + i);
match = true;
}
}
// Shift the pattern by the appropriate amount
bool shouldShift = true;
for (j = 0; j < s; j++) {
if (i + shift[j] < n - m + 1) {
shouldShift = false;
break;
}
}
if (shouldShift) {
i += shift[j];
}
}
// If no match was found, print a message
if (!match) {
Console.WriteLine("No match found");
}
}
static public void Main()
{
// Code
string text = "the cat sat on the mat";
string pattern = "the";
WuManber(text, pattern);
}
}
// This code is contributed by lokeshmvs21.
JavaScript
// JS code to implement the approach
// Define the hash_pattern() function to generate
// a hash for each subpattern
function hashPattern(pattern, i, j) {
let h = 0
for (let k = i; k < j; k++)
h = h * 256 + (pattern[k]).charCodeAt(0)
return h
}
// Define the Wu Manber algorithm
function wuManber(text, pattern) {
// Define the length of the pattern and
// text
let m = pattern.length
let n = text.length
// Define the number of subpatterns to use
let s = 2
// Define the length of each subpattern
let t = Math.floor(m / s)
// Initialize the hash values for each
// subpattern
let h = new Array(s).fill(0)
for (let i = 0; i < s; i++)
h[i] = hashPattern(pattern, i * t, (i + 1) * t)
// Initialize the shift value for each
// subpattern
let shift = new Array(s).fill(0)
for (let i = 0; i < s; i++)
shift[i] = t * (s - i - 1)
// Initialize the match value
let match = false
// Iterate through the text
for (let i = 0; i < (n - m + 1); i++) {
// Check if the subpatterns match
for (let j = 0; j < s; j++) {
if (hashPattern(text, i + j * t, i + (j + 1) * t) != h[j])
break
}
// If the subpatterns match, check if
// the full pattern matches
if (text.slice(i, i + m) == pattern) {
console.log("Match found at index" + i + "<br>")
match = true
}
// Shift the pattern by the appropriate
// amount
for (let j = 0; j < s; j++) {
if (i + shift[j] < n - m + 1)
break
else
i += shift[j]
}
}
// If no match was found, document.write a message
if (!match)
console.log("No match found")
}
// Driver Code
let text = "the cat sat on the mat"
let pattern = "the"
// Function call
wuManber(text, pattern)
// This code is contributed by Potta Lokesh
OutputMatch found at index 0
Match found at index 15
Time complexity: O(n + m)
Auxiliary Space: O (n+m)
Difference between KMP and Wu-Manber Algorithms?
KMP algorithm and Wu Manber algorithm are both string-matching algorithms, which means that they are used to find a substring within a larger string. Both algorithms have the same time complexity, which means that they have the same performance characteristics in terms of how long it takes for the algorithm to run.
However, there are some differences between them:
- KMP algorithm uses a preprocessing step to generate a partial match table, which is used to speed up the string-matching process. This makes the KMP algorithm more efficient than the Wu Manber algorithm when the pattern that is being searched for is relatively long.
- Wu Manber algorithm uses a different approach to string matching, which involves dividing the pattern into several subpatterns and using these subpatterns to search for matches in the text. This makes the Wu Manber algorithm more efficient than the KMP algorithm when the pattern that is being searched for is relatively short.
Similar Reads
Implementation of Rabin Karp Algorithm in C++
The Rabin-Karp Algorithm is a string-searching algorithm that efficiently finds a pattern within a text using hashing. It is particularly useful for finding multiple patterns in the same text or for searching in streams of data. In this article, we will learn how to implement the Rabin-Karp Algorith
5 min read
Prim's Algorithm with a Java Implementation
Prim's Algorithm is a popular Algorithm in the field of computer science and Graph Theory It is used to find the Minimum Spanning Tree of a weighted, undirected graph. The Minimum Spanning Tree is the subset of the Graph that connects all vertices with minimum total edge weight that is also without
5 min read
Push Relabel Algorithm | Set 2 (Implementation)
We strongly recommend to refer below article before moving on to this article. Push Relabel Algorithm | Set 1 (Introduction and Illustration) Problem Statement: Given a graph that represents a flow network where every edge has a capacity. Also given two vertices source âsâ and sink âtâ in the graph
15+ min read
Bellman Ford Algorithm (Simple Implementation)
We have introduced Bellman Ford and discussed on implementation here.Input: Graph and a source vertex src Output: Shortest distance to all vertices from src. If there is a negative weight cycle, then shortest distances are not calculated, negative weight cycle is reported.1) This step initializes di
13 min read
Implementation of Restoring Division Algorithm for unsigned integer
In the previous article, we have already discussed the Restoring Division Algorithm. In this article, we will discuss the implementation of this algorithm. Restoring Division Algorithm is used to divide two unsigned integers. This algorithm is used in Computer Organization and Architecture. This alg
14 min read
Most important type of Algorithms
What is an Algorithm?An algorithm is a step-by-step procedure to solve a problem. A good algorithm should be optimized in terms of time and space. Different types of problems require different types of algorithmic techniques to be solved in the most optimized manner. There are many types of algorith
4 min read
LRU Approximation (Second Chance Algorithm)
If you are not familiar with Least Recently Used Algorithm, check Least Recently Used Algorithm(Page Replacement)This algorithm is a combination of using a queue, similar to FIFO (FIFO (Page Replacement)) alongside using an array to keep track of the bits used to give the queued page a "second chanc
15+ min read
Introduction of Hu-Tucker algorithm
Introduction of the Hu-Tucker algorithm :The Hu-Tucker algorithm helps to compress some order of blocks, assuming that you have a certain natural order for the following, then if the strings and the notes are taken into account, then taking them into account, they are sorted in numbers.The question
5 min read
Implementation of Non-Restoring Division Algorithm for Unsigned Integer
In the previous article, we have already discussed the Non-Restoring Division Algorithm. In this article, we will discuss the implementation of this algorithm. Non-restoring division algorithm is used to divide two unsigned integers. The other form of this algorithm is Restoring Division. This algor
15 min read
Introduction and implementation of Karger's algorithm for Minimum Cut
Given an undirected and unweighted graph, find the smallest cut (smallest number of edges that disconnects the graph into two components). The input graph may have parallel edges. For example consider the following example, the smallest cut has 2 edges. A Simple Solution use Max-Flow based s-t cut a
15+ min read