- While it's easy to spot patterns in a string with the naked eye, what if you wanted to use a computer program to detect those same patterns? Well, a Rabin-Karp algorithm can help with that.
- Using a hash function, the Rabin-Karp algorithm is used to identify patterns in a string. In contrast to the other available options, this approach limits the number of alphabets it searches throughout rather than checking every alphabet.
- The use of a hash value in this technique is very important since, with this value alone, the search space is drastically reduced and efficiency is greatly improved. It is far more effective than the other approaches thanks to this process.
- The "Rabin Karp" algorithm will be discussed in this blog. It is a string search algorithm that bears the names of its creators, Michael O. Rabin and Richard M. Carp.
- In O(Ns + Np) time, this approach can be used to find every instance of a given pattern (P) in a given string (S), where Ns and Np are the lengths of S and P, respectively.
- For the sake of clarity, let's use an example. Assume that the pattern P is given as "xyz," the given string S is "cxyzghxyzvjkxyz," and the task is to locate every instance of "P" in "S."
In the naive method, we slide the supplied pattern one by one by each character starting from the 0th index of the given string in order to find the given pattern in the string.
Each character of the pattern and the appropriate substring of the supplied string are compared at each place. We print that we have located the pattern's occurrence in the given string at that place if all the characters in the pattern match.
We slide the pattern one by one in the Rabin-Karp method as well. However, in this case, we compare the hash values of the pattern and the appropriate substring of the string rather than comparing each character one at a time. If the hash value is a match, the occurrence is printed.
Since the Rabin Karp algorithm depends on hash values, we must carefully compute the hash value in order to reduce collisions. We also store the hash values in a vector to avoid doing the same computation repeatedly and to improve time complexity.
Code in C++
Time Complexity: O(Ns + Np)
Space Complexity: O(1)