Rabin Karp Algorithm: C++ Implementation

Introduction

  • While it's easy to spot patterns in a string with the naked eye, what if you wanted to use a computer program to detect those same patterns? Well, a Rabin-Karp algorithm can help with that.
  • Using a hash function, the Rabin-Karp algorithm is used to identify patterns in a string. In contrast to the other available options, this approach limits the number of alphabets it searches throughout rather than checking every alphabet.
  • The use of a hash value in this technique is very important since, with this value alone, the search space is drastically reduced and efficiency is greatly improved. It is far more effective than the other approaches thanks to this process.
  • The "Rabin Karp" algorithm will be discussed in this blog. It is a string search algorithm that bears the names of its creators, Michael O. Rabin and Richard M. Carp.
  • In O(Ns + Np) time, this approach can be used to find every instance of a given pattern (P) in a given string (S), where Ns and Np are the lengths of S and P, respectively.
  • For the sake of clarity, let's use an example. Assume that the pattern P is given as "xyz," the given string S is "cxyzghxyzvjkxyz," and the task is to locate every instance of "P" in "S."

Algorithms

In the naive method, we slide the supplied pattern one by one by each character starting from the 0th index of the given string in order to find the given pattern in the string.

Each character of the pattern and the appropriate substring of the supplied string are compared at each place. We print that we have located the pattern's occurrence in the given string at that place if all the characters in the pattern match.

We slide the pattern one by one in the Rabin-Karp method as well. However, in this case, we compare the hash values of the pattern and the appropriate substring of the string rather than comparing each character one at a time. If the hash value is a match, the occurrence is printed.

Since the Rabin Karp algorithm depends on hash values, we must carefully compute the hash value in order to reduce collisions. We also store the hash values in a vector to avoid doing the same computation repeatedly and to improve time complexity.

Code in C++

// Board Infinity
//C++ code for implementation of Rabin Karp algorithm
#include <bits/stdc++.h>
using namespace std;


// Function for searching a pattern in a string using Rabin Karp algorithm
void rabinKarpSearch(string S, string P)
  {


  // Calculating the length of S and P
  int Ns = S.length();
  int Np = P.length();
   
  // Initialize the value of prime number and mod for calculating hash values
  int prime = 31;
  int mod = 1e9 + 9;
   
  // Calculating the power raise to the taken prime
  vector<long long> p_pow(Ns);
  p_pow[0] = 1;
  for (int i = 1; i < Ns; i++)
  {
        p_pow[i] = (p_pow[i-1] * prime) % mod;
    }
 
    vector<long long> h(Ns + 1, 0);
    for (int i = 0; i < Ns; i++)
    {
        h[i+1] = (h[i] + (S[i] - 'a' + 1) * p_pow[i]) % mod;
    }
   
    // Calculating the hash value of P
    long long hash_P = 0;
    for (int i = 0; i < Np; i++)
      {
        hash_P = (hash_P + (P[i] - 'a' + 1) * p_pow[i]) % mod;
      }
 
    /*
      Now slide the pattern by one character and check for the corresponding
      hash value to match with the hash value of the given pattern
    */
    for (int i = 0; i + Np - 1 < Ns; i++)
      {
        long long curr_hash = (h[i+Np] + mod - h[i]) % mod;
        if (curr_hash == hash_P * p_pow[i] % mod)
            cout<<"The given pattern occurs in the given string at index "<<i<<endl;
      }
  }
int main()
  {
  string S = "cxyzghxyzvjkxyz";
  string P = "xyz";
   
    // Call the function for rabin karp algorithm
  rabinKarpSearch(S,P);
   
  return 0;
  }

Output

The given pattern occurs in the given string at index 1
The given pattern occurs in the given string at index 6
The given pattern occurs in the given string at index 12

write your code here: Coding Playground

Time Complexity: O(Ns + Np)

Space Complexity: O(1)