SlideShare a Scribd company logo
Rabin-Karp Substring search
algorithm
1
Prepared By:
Sabiya Fatima
sabiya1990fatima@gmail.com
Objectives
2
 What is Substring search problem
 Definition of the Rabin-Karp algorithm
 How Rabin-Karp works
 An example to illustrate Rabin-Karp
 Complexity Analysis
 Real Life applications
What is Substring search Problem
3
We assume that the text is an array T [1..N] of length n and that the pattern is an array P [1..M]
of length m, where m << n.
We also assume that the elements of P and T are characters in the finite alphabet S.
(e.g., S = {a,b} We want to find P = ‘aab’ in T = ‘abbaabaaaab’)
 A string search algorithm which compares a string's hash values, rather than the strings
themselves.
 For efficiency, the hash value of the next position in the text is easily computed from the
hash value of the current position.
Definition of the Rabin-Karp Algorithm
4
How Rabin-Karp Works
5
 Let characters in both arrays T and P be digits in radix-S notation. S = (0,1,...,9)
 Let p be the value of the characters in P
 Choose a prime number q such that fits within a computer word to speed
computations.
 Compute (p mod q)
 The value of p mod q is what we will be using to find all matches of the pattern P in T.
How Rabin-Karp Works(Contd.)
6
 Compute (T[s+1, .., s+m] mod q) for s = 0 .. n-m
 Test against P only those sequences in T having the same (mod q) value
 (T[s+1, .., s+m] mod q) can be incrementally computed by subtracting the high-order digit,
shifting, adding the low-order bit, all in modulo q arithmetic.
Algorithm
7
RABIN-KARP-MATCHER(T,P,d,q)
1. n = T.length
2. m= P.length
3. h = d^(m-1) mod q
4. p = 0
5. t0 = 0
6. for i = 1 to m // preprocessing
7. p = (dp + p[i]) mod q
8. t0 = (dt0 + p[i]) mod q
9. for s = 0 to n-m // matching
10. if p == ts
11. if P[1 . . . . M] == T[ s+1 . . . . s+m]
12. print “Pattern occurs with shift” s
13. if s<(n + m)
14. ts+1 = (d(ts – T[s+1]h)+T[s+m+1]) mod q
An Example to illustrate Rabin-Karp
8
• Given T = 31415926535 and P = 26
• We choose q = 11
• P mod q = 26 mod 11 = 4
13 14 95 62 35 5
13 14 95 62 35 5
14 mod 11 = 3 not equal to 4
31 mod 11 = 9 not equal to 4
13 14 95 62 35 5
41 mod 11 = 8 not equal to 4
An Example to illustrate Rabin-Karp(contd.)
9
13 14 95 62 35 5
15 mod 11 = 4 equal to 4 -> spurious hit
13 14 95 62 35 5
59 mod 11 = 4 equal to 4 -> spurious hit
13 14 95 62 35 5
92 mod 11 = 4 equal to 4 -> spurious hit
13 14 95 62 35 5
26 mod 11 = 4 equal to 4 -> an exact match!!
13 14 95 62 35 5
65 mod 11 = 10 not equal to 4
An Example to illustrate Rabin-Karp(contd.)
10
13 14 95 62 35 5
53 mod 11 = 9 not equal to 4
13 14 95 62 35 5
35 mod 11 = 2 not equal to 4
As we can see, when a match is found, further testing is done to insure that a match has
indeed been found.
Complexity Analysis 11
RABIN-KARP-MATCHER(T,P,d,q)
1. n = T.length
2. m= P.length
3. h = d^(m-1) mod q O(1)
4. p = 0
5. t0 = 0
6. for i = 1 to m O(m)
7. p = (dp + p[i]) mod q
8. t0 = (dt0 + p[i]) mod q
9. for s = 0 to n-m O((n-m+1)m)
10. if p == ts
11. if P[1 . . . . M] == T[ s+1 . . . . s+m]
12. print “Pattern occurs with shift” s
13. if s<n + m
14. ts+1 = (d(ts – T[s+1]h)+T[s+m+1]) mod q
Complexity Analysis Result
12
 The running time of the Rabin-Karp algorithm in the worst-case scenario is
O((n-m+1))m but it has a good average-case running time.
 If the expected number of valid shifts is small O(1) and the prime q is chosen to be
quite large, then the Rabin-Karp algorithm can be expected to run in time O(n+m) plus
the time to required to process spurious hits.
Real Time Applications
13
 Bioinformatics
• Used in looking for similarities of two or more proteins; i.e. high sequence
similarity usually implies significant structural or functional similarity.
Example:
Hb A_human
GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKL
G+ +VK+HGKKV A++++++AH+ D++ ++ +++LS+LH KL
Hb B_human
GNPKVKAHGKKVLGAFSDGLAH LDNLKGTF ATLSELH CDKL
+ similar amino acids
14
 Good for plagiarism, because it can deal with multiple pattern matching!
 With a good hashing function it can be quite effective and it’s easy to implement!
Real Time Applications
References
15
.
 Cormen, Thomas S., et al. Introduction to Algorithms. 3rd ed. Boston: MIT Press, 2
 Go2Net Website for String Matching Algorithms
 [www.go2net.com/internet/deep/1997/05/14/body.html]
 Yummy Yummy Animations Site for an animation of the Rabin-Karp algorithm at work
[www.mills.edu/ACAD_INFO/MCS/CS/S00MCS125/String.Matching.Algorithms/animations.html]
 National Institute of Standards and Technology Dictionary of Algorithms, Data Structures, and Problems
 [hissa.nist.gov/dads/HTML/rabinKarpAlgo.html]
 Multi-Pattern String Matching with Very Large Pattern Sets
 [https://www.dcc.uchile.cl/~gnavarro/workshop07/lsalmela.pdf]
Thank You
16
Ad

More Related Content

What's hot (20)

String matching algorithms-pattern matching.
String matching algorithms-pattern matching.String matching algorithms-pattern matching.
String matching algorithms-pattern matching.
Swapan Shakhari
 
String matching algorithm
String matching algorithmString matching algorithm
String matching algorithm
Alokeparna Choudhury
 
String matching, naive,
String matching, naive,String matching, naive,
String matching, naive,
Amit Kumar Rathi
 
KMP Pattern Matching algorithm
KMP Pattern Matching algorithmKMP Pattern Matching algorithm
KMP Pattern Matching algorithm
Kamal Nayan
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
Ashikapokiya12345
 
pushdown automata
pushdown automatapushdown automata
pushdown automata
Sujata Pardeshi
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
Mahdi Esmailoghli
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languages
parmeet834
 
String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)
Aditya pratap Singh
 
Formal Languages and Automata Theory Unit 1
Formal Languages and Automata Theory Unit 1Formal Languages and Automata Theory Unit 1
Formal Languages and Automata Theory Unit 1
Srimatre K
 
P, NP, NP-Complete, and NP-Hard
P, NP, NP-Complete, and NP-HardP, NP, NP-Complete, and NP-Hard
P, NP, NP-Complete, and NP-Hard
Animesh Chaturvedi
 
unit-4-dynamic programming
unit-4-dynamic programmingunit-4-dynamic programming
unit-4-dynamic programming
hodcsencet
 
Knuth morris pratt string matching algo
Knuth morris pratt string matching algoKnuth morris pratt string matching algo
Knuth morris pratt string matching algo
sabiya sabiya
 
Time and Space Complexity
Time and Space ComplexityTime and Space Complexity
Time and Space Complexity
Ashutosh Satapathy
 
module6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdfmodule6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdf
Shiwani Gupta
 
String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)
Neel Shah
 
Bruteforce algorithm
Bruteforce algorithmBruteforce algorithm
Bruteforce algorithm
Rezwan Siam
 
Algorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAlgorithm Complexity and Main Concepts
Algorithm Complexity and Main Concepts
Adelina Ahadova
 
Lecture 4 asymptotic notations
Lecture 4   asymptotic notationsLecture 4   asymptotic notations
Lecture 4 asymptotic notations
jayavignesh86
 
KMP String Matching Algorithm
KMP String Matching AlgorithmKMP String Matching Algorithm
KMP String Matching Algorithm
kalpanasatishkumar
 
String matching algorithms-pattern matching.
String matching algorithms-pattern matching.String matching algorithms-pattern matching.
String matching algorithms-pattern matching.
Swapan Shakhari
 
KMP Pattern Matching algorithm
KMP Pattern Matching algorithmKMP Pattern Matching algorithm
KMP Pattern Matching algorithm
Kamal Nayan
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languages
parmeet834
 
String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)String Matching (Naive,Rabin-Karp,KMP)
String Matching (Naive,Rabin-Karp,KMP)
Aditya pratap Singh
 
Formal Languages and Automata Theory Unit 1
Formal Languages and Automata Theory Unit 1Formal Languages and Automata Theory Unit 1
Formal Languages and Automata Theory Unit 1
Srimatre K
 
P, NP, NP-Complete, and NP-Hard
P, NP, NP-Complete, and NP-HardP, NP, NP-Complete, and NP-Hard
P, NP, NP-Complete, and NP-Hard
Animesh Chaturvedi
 
unit-4-dynamic programming
unit-4-dynamic programmingunit-4-dynamic programming
unit-4-dynamic programming
hodcsencet
 
Knuth morris pratt string matching algo
Knuth morris pratt string matching algoKnuth morris pratt string matching algo
Knuth morris pratt string matching algo
sabiya sabiya
 
module6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdfmodule6_stringmatchingalgorithm_2022.pdf
module6_stringmatchingalgorithm_2022.pdf
Shiwani Gupta
 
String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)String matching algorithms(knuth morris-pratt)
String matching algorithms(knuth morris-pratt)
Neel Shah
 
Bruteforce algorithm
Bruteforce algorithmBruteforce algorithm
Bruteforce algorithm
Rezwan Siam
 
Algorithm Complexity and Main Concepts
Algorithm Complexity and Main ConceptsAlgorithm Complexity and Main Concepts
Algorithm Complexity and Main Concepts
Adelina Ahadova
 
Lecture 4 asymptotic notations
Lecture 4   asymptotic notationsLecture 4   asymptotic notations
Lecture 4 asymptotic notations
jayavignesh86
 

Similar to Rabin Carp String Matching algorithm (20)

String searching
String searching String searching
String searching
thinkphp
 
StringMatching-Rabikarp algorithmddd.pdf
StringMatching-Rabikarp algorithmddd.pdfStringMatching-Rabikarp algorithmddd.pdf
StringMatching-Rabikarp algorithmddd.pdf
bhagabatijenadukura
 
String-Matching Algorithms Advance algorithm
String-Matching  Algorithms Advance algorithmString-Matching  Algorithms Advance algorithm
String-Matching Algorithms Advance algorithm
ssuseraf60311
 
Modified Rabin Karp
Modified Rabin KarpModified Rabin Karp
Modified Rabin Karp
Garima Singh
 
lec17.ppt
lec17.pptlec17.ppt
lec17.ppt
shivkr15
 
Lec17
Lec17Lec17
Lec17
Nikhil Chilwant
 
Ch08
Ch08Ch08
Ch08
nathanurag
 
Ch08
Ch08Ch08
Ch08
Joe Christensen
 
6.sequences and series Further Mathematics Zimbabwe Zimsec Cambridge
6.sequences and series   Further Mathematics Zimbabwe Zimsec Cambridge6.sequences and series   Further Mathematics Zimbabwe Zimsec Cambridge
6.sequences and series Further Mathematics Zimbabwe Zimsec Cambridge
alproelearning
 
25 String Matching
25 String Matching25 String Matching
25 String Matching
Andres Mendez-Vazquez
 
Basics of Mathematical Cryptography
Basics of Mathematical CryptographyBasics of Mathematical Cryptography
Basics of Mathematical Cryptography
Neha Gupta
 
Pattern matching programs
Pattern matching programsPattern matching programs
Pattern matching programs
akruthi k
 
Primality
PrimalityPrimality
Primality
Mohanasundaram Nattudurai
 
1. linear model, inference, prediction
1. linear model, inference, prediction1. linear model, inference, prediction
1. linear model, inference, prediction
Malik Hassan Qayyum 🕵🏻‍♂️
 
Daa chapter9
Daa chapter9Daa chapter9
Daa chapter9
B.Kirron Reddi
 
Introduction to the AKS Primality Test
Introduction to the AKS Primality TestIntroduction to the AKS Primality Test
Introduction to the AKS Primality Test
Pranshu Bhatnagar
 
2010 3-24 cryptography stamatiou
2010 3-24 cryptography stamatiou2010 3-24 cryptography stamatiou
2010 3-24 cryptography stamatiou
vafopoulos
 
Germany2003 gamg
Germany2003 gamgGermany2003 gamg
Germany2003 gamg
M Reza Rahmati
 
Gp 27[string matching].pptx
Gp 27[string matching].pptxGp 27[string matching].pptx
Gp 27[string matching].pptx
SumitYadav641839
 
IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION ALGORITHM
IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION  ALGORITHM  IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION  ALGORITHM
IMPLEMENTATION OF DIFFERENT PATTERN RECOGNITION ALGORITHM
NETAJI SUBHASH ENGINEERING COLLEGE , KOLKATA
 
Ad

Recently uploaded (20)

SICPA: Fabien Keller - background introduction
SICPA: Fabien Keller - background introductionSICPA: Fabien Keller - background introduction
SICPA: Fabien Keller - background introduction
fabienklr
 
Slide share PPT of SOx control technologies.pptx
Slide share PPT of SOx control technologies.pptxSlide share PPT of SOx control technologies.pptx
Slide share PPT of SOx control technologies.pptx
vvsasane
 
Prediction of Flexural Strength of Concrete Produced by Using Pozzolanic Mate...
Prediction of Flexural Strength of Concrete Produced by Using Pozzolanic Mate...Prediction of Flexural Strength of Concrete Produced by Using Pozzolanic Mate...
Prediction of Flexural Strength of Concrete Produced by Using Pozzolanic Mate...
Journal of Soft Computing in Civil Engineering
 
sss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptx
sss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptx
sss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptx
ajayrm685
 
Modeling the Influence of Environmental Factors on Concrete Evaporation Rate
Modeling the Influence of Environmental Factors on Concrete Evaporation RateModeling the Influence of Environmental Factors on Concrete Evaporation Rate
Modeling the Influence of Environmental Factors on Concrete Evaporation Rate
Journal of Soft Computing in Civil Engineering
 
introduction technology technology tec.pptx
introduction technology technology tec.pptxintroduction technology technology tec.pptx
introduction technology technology tec.pptx
Iftikhar70
 
Personal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.pptPersonal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.ppt
ganjangbegu579
 
Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...
Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...
Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...
AI Publications
 
Slide share PPT of NOx control technologies.pptx
Slide share PPT of  NOx control technologies.pptxSlide share PPT of  NOx control technologies.pptx
Slide share PPT of NOx control technologies.pptx
vvsasane
 
Frontend Architecture Diagram/Guide For Frontend Engineers
Frontend Architecture Diagram/Guide For Frontend EngineersFrontend Architecture Diagram/Guide For Frontend Engineers
Frontend Architecture Diagram/Guide For Frontend Engineers
Michael Hertzberg
 
Uses of drones in civil construction.pdf
Uses of drones in civil construction.pdfUses of drones in civil construction.pdf
Uses of drones in civil construction.pdf
surajsen1729
 
Smart City is the Future EN - 2024 Thailand Modify V1.0.pdf
Smart City is the Future EN - 2024 Thailand Modify V1.0.pdfSmart City is the Future EN - 2024 Thailand Modify V1.0.pdf
Smart City is the Future EN - 2024 Thailand Modify V1.0.pdf
PawachMetharattanara
 
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdfLittle Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
gori42199
 
Autodesk Fusion 2025 Tutorial: User Interface
Autodesk Fusion 2025 Tutorial: User InterfaceAutodesk Fusion 2025 Tutorial: User Interface
Autodesk Fusion 2025 Tutorial: User Interface
Atif Razi
 
ATAL 6 Days Online FDP Scheme Document 2025-26.pdf
ATAL 6 Days Online FDP Scheme Document 2025-26.pdfATAL 6 Days Online FDP Scheme Document 2025-26.pdf
ATAL 6 Days Online FDP Scheme Document 2025-26.pdf
ssuserda39791
 
Jacob Murphy Australia - Excels In Optimizing Software Applications
Jacob Murphy Australia - Excels In Optimizing Software ApplicationsJacob Murphy Australia - Excels In Optimizing Software Applications
Jacob Murphy Australia - Excels In Optimizing Software Applications
Jacob Murphy Australia
 
01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf
01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf
01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf
PawachMetharattanara
 
2.3 Genetically Modified Organisms (1).ppt
2.3 Genetically Modified Organisms (1).ppt2.3 Genetically Modified Organisms (1).ppt
2.3 Genetically Modified Organisms (1).ppt
rakshaiya16
 
Modelling of Concrete Compressive Strength Admixed with GGBFS Using Gene Expr...
Modelling of Concrete Compressive Strength Admixed with GGBFS Using Gene Expr...Modelling of Concrete Compressive Strength Admixed with GGBFS Using Gene Expr...
Modelling of Concrete Compressive Strength Admixed with GGBFS Using Gene Expr...
Journal of Soft Computing in Civil Engineering
 
Working with USDOT UTCs: From Conception to Implementation
Working with USDOT UTCs: From Conception to ImplementationWorking with USDOT UTCs: From Conception to Implementation
Working with USDOT UTCs: From Conception to Implementation
Alabama Transportation Assistance Program
 
SICPA: Fabien Keller - background introduction
SICPA: Fabien Keller - background introductionSICPA: Fabien Keller - background introduction
SICPA: Fabien Keller - background introduction
fabienklr
 
Slide share PPT of SOx control technologies.pptx
Slide share PPT of SOx control technologies.pptxSlide share PPT of SOx control technologies.pptx
Slide share PPT of SOx control technologies.pptx
vvsasane
 
sss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptx
sss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptx
sss1.pptxsss1.pptxsss1.pptxsss1.pptxsss1.pptx
ajayrm685
 
introduction technology technology tec.pptx
introduction technology technology tec.pptxintroduction technology technology tec.pptx
introduction technology technology tec.pptx
Iftikhar70
 
Personal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.pptPersonal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.ppt
ganjangbegu579
 
Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...
Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...
Empowering Electric Vehicle Charging Infrastructure with Renewable Energy Int...
AI Publications
 
Slide share PPT of NOx control technologies.pptx
Slide share PPT of  NOx control technologies.pptxSlide share PPT of  NOx control technologies.pptx
Slide share PPT of NOx control technologies.pptx
vvsasane
 
Frontend Architecture Diagram/Guide For Frontend Engineers
Frontend Architecture Diagram/Guide For Frontend EngineersFrontend Architecture Diagram/Guide For Frontend Engineers
Frontend Architecture Diagram/Guide For Frontend Engineers
Michael Hertzberg
 
Uses of drones in civil construction.pdf
Uses of drones in civil construction.pdfUses of drones in civil construction.pdf
Uses of drones in civil construction.pdf
surajsen1729
 
Smart City is the Future EN - 2024 Thailand Modify V1.0.pdf
Smart City is the Future EN - 2024 Thailand Modify V1.0.pdfSmart City is the Future EN - 2024 Thailand Modify V1.0.pdf
Smart City is the Future EN - 2024 Thailand Modify V1.0.pdf
PawachMetharattanara
 
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdfLittle Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
Little Known Ways To 3 Best sites to Buy Linkedin Accounts.pdf
gori42199
 
Autodesk Fusion 2025 Tutorial: User Interface
Autodesk Fusion 2025 Tutorial: User InterfaceAutodesk Fusion 2025 Tutorial: User Interface
Autodesk Fusion 2025 Tutorial: User Interface
Atif Razi
 
ATAL 6 Days Online FDP Scheme Document 2025-26.pdf
ATAL 6 Days Online FDP Scheme Document 2025-26.pdfATAL 6 Days Online FDP Scheme Document 2025-26.pdf
ATAL 6 Days Online FDP Scheme Document 2025-26.pdf
ssuserda39791
 
Jacob Murphy Australia - Excels In Optimizing Software Applications
Jacob Murphy Australia - Excels In Optimizing Software ApplicationsJacob Murphy Australia - Excels In Optimizing Software Applications
Jacob Murphy Australia - Excels In Optimizing Software Applications
Jacob Murphy Australia
 
01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf
01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf
01.คุณลักษณะเฉพาะของอุปกรณ์_pagenumber.pdf
PawachMetharattanara
 
2.3 Genetically Modified Organisms (1).ppt
2.3 Genetically Modified Organisms (1).ppt2.3 Genetically Modified Organisms (1).ppt
2.3 Genetically Modified Organisms (1).ppt
rakshaiya16
 
Ad

Rabin Carp String Matching algorithm

  • 1. Rabin-Karp Substring search algorithm 1 Prepared By: Sabiya Fatima sabiya1990fatima@gmail.com
  • 2. Objectives 2  What is Substring search problem  Definition of the Rabin-Karp algorithm  How Rabin-Karp works  An example to illustrate Rabin-Karp  Complexity Analysis  Real Life applications
  • 3. What is Substring search Problem 3 We assume that the text is an array T [1..N] of length n and that the pattern is an array P [1..M] of length m, where m << n. We also assume that the elements of P and T are characters in the finite alphabet S. (e.g., S = {a,b} We want to find P = ‘aab’ in T = ‘abbaabaaaab’)
  • 4.  A string search algorithm which compares a string's hash values, rather than the strings themselves.  For efficiency, the hash value of the next position in the text is easily computed from the hash value of the current position. Definition of the Rabin-Karp Algorithm 4
  • 5. How Rabin-Karp Works 5  Let characters in both arrays T and P be digits in radix-S notation. S = (0,1,...,9)  Let p be the value of the characters in P  Choose a prime number q such that fits within a computer word to speed computations.  Compute (p mod q)  The value of p mod q is what we will be using to find all matches of the pattern P in T.
  • 6. How Rabin-Karp Works(Contd.) 6  Compute (T[s+1, .., s+m] mod q) for s = 0 .. n-m  Test against P only those sequences in T having the same (mod q) value  (T[s+1, .., s+m] mod q) can be incrementally computed by subtracting the high-order digit, shifting, adding the low-order bit, all in modulo q arithmetic.
  • 7. Algorithm 7 RABIN-KARP-MATCHER(T,P,d,q) 1. n = T.length 2. m= P.length 3. h = d^(m-1) mod q 4. p = 0 5. t0 = 0 6. for i = 1 to m // preprocessing 7. p = (dp + p[i]) mod q 8. t0 = (dt0 + p[i]) mod q 9. for s = 0 to n-m // matching 10. if p == ts 11. if P[1 . . . . M] == T[ s+1 . . . . s+m] 12. print “Pattern occurs with shift” s 13. if s<(n + m) 14. ts+1 = (d(ts – T[s+1]h)+T[s+m+1]) mod q
  • 8. An Example to illustrate Rabin-Karp 8 • Given T = 31415926535 and P = 26 • We choose q = 11 • P mod q = 26 mod 11 = 4 13 14 95 62 35 5 13 14 95 62 35 5 14 mod 11 = 3 not equal to 4 31 mod 11 = 9 not equal to 4 13 14 95 62 35 5 41 mod 11 = 8 not equal to 4
  • 9. An Example to illustrate Rabin-Karp(contd.) 9 13 14 95 62 35 5 15 mod 11 = 4 equal to 4 -> spurious hit 13 14 95 62 35 5 59 mod 11 = 4 equal to 4 -> spurious hit 13 14 95 62 35 5 92 mod 11 = 4 equal to 4 -> spurious hit 13 14 95 62 35 5 26 mod 11 = 4 equal to 4 -> an exact match!! 13 14 95 62 35 5 65 mod 11 = 10 not equal to 4
  • 10. An Example to illustrate Rabin-Karp(contd.) 10 13 14 95 62 35 5 53 mod 11 = 9 not equal to 4 13 14 95 62 35 5 35 mod 11 = 2 not equal to 4 As we can see, when a match is found, further testing is done to insure that a match has indeed been found.
  • 11. Complexity Analysis 11 RABIN-KARP-MATCHER(T,P,d,q) 1. n = T.length 2. m= P.length 3. h = d^(m-1) mod q O(1) 4. p = 0 5. t0 = 0 6. for i = 1 to m O(m) 7. p = (dp + p[i]) mod q 8. t0 = (dt0 + p[i]) mod q 9. for s = 0 to n-m O((n-m+1)m) 10. if p == ts 11. if P[1 . . . . M] == T[ s+1 . . . . s+m] 12. print “Pattern occurs with shift” s 13. if s<n + m 14. ts+1 = (d(ts – T[s+1]h)+T[s+m+1]) mod q
  • 12. Complexity Analysis Result 12  The running time of the Rabin-Karp algorithm in the worst-case scenario is O((n-m+1))m but it has a good average-case running time.  If the expected number of valid shifts is small O(1) and the prime q is chosen to be quite large, then the Rabin-Karp algorithm can be expected to run in time O(n+m) plus the time to required to process spurious hits.
  • 13. Real Time Applications 13  Bioinformatics • Used in looking for similarities of two or more proteins; i.e. high sequence similarity usually implies significant structural or functional similarity. Example: Hb A_human GSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKL G+ +VK+HGKKV A++++++AH+ D++ ++ +++LS+LH KL Hb B_human GNPKVKAHGKKVLGAFSDGLAH LDNLKGTF ATLSELH CDKL + similar amino acids
  • 14. 14  Good for plagiarism, because it can deal with multiple pattern matching!  With a good hashing function it can be quite effective and it’s easy to implement! Real Time Applications
  • 15. References 15 .  Cormen, Thomas S., et al. Introduction to Algorithms. 3rd ed. Boston: MIT Press, 2  Go2Net Website for String Matching Algorithms  [www.go2net.com/internet/deep/1997/05/14/body.html]  Yummy Yummy Animations Site for an animation of the Rabin-Karp algorithm at work [www.mills.edu/ACAD_INFO/MCS/CS/S00MCS125/String.Matching.Algorithms/animations.html]  National Institute of Standards and Technology Dictionary of Algorithms, Data Structures, and Problems  [hissa.nist.gov/dads/HTML/rabinKarpAlgo.html]  Multi-Pattern String Matching with Very Large Pattern Sets  [https://www.dcc.uchile.cl/~gnavarro/workshop07/lsalmela.pdf]
  翻译: