Python for Data Science

String Comparison in Python

Introduction

String comparison is the process of checking two strings against each other to determine whether they are equal or how they are ordered. The two strings act as the operands of the comparison, which is usually carried out character by character using Unicode code point values (these coincide with ASCII for basic English characters). Python offers several ways to compare two strings, and this article covers five of them in detail.

Methods for Comparison

Method 1: Using a relational operator

Relational operators are most often used to compare two values such as numbers. Because they are binary operators that work on any pair of comparable operands, Python can also use them to compare strings. A relational operator is applied to both operands and returns True or False depending on whether the condition holds. In Python, these operators are also known as comparison operators, and there are six of them.

== (Equal operator)

Checks whether both operands are equal

!= (Not equal operator)

Checks whether the two operands are different

> (Greater than)

Checks whether the left-hand side operand is greater than the right-hand side operand

< (Less than)

Checks whether the left-hand side operand is less than the right-hand side operand

>= (Greater than or equals)

Checks whether the left-hand side operand is greater than or equal to the right-hand side operand

<= (Less than or equals)

Checks whether the left-hand side operand is less than or equal to the right-hand side operand

When a relational operator is applied to two strings, Python compares them character by character, starting at index 0, using each character's Unicode code point. The first position at which the characters differ decides the result. If one string runs out before any difference is found, the shorter string is considered smaller, and two strings are equal only when they have the same length and every character matches.

Example:

print("Karlos" == "Karlos")
print("Karlos" < "karlos")
print("Karlos" > "karlos")
print("Karlos" != "Karlos")

Output:

True
True
False
False

Explanation:

This short program compares its left and right operands using relational operators. "Karlos" == "Karlos" is True because every character matches. "Karlos" < "karlos" is True because the uppercase "K" (code point 75) comes before the lowercase "k" (code point 107), which also makes the > comparison False. The != comparison is False because the two strings are identical. The print() function displays each Boolean result.
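The character-by-character rule can be made visible with a small sketch. The helper first_difference() below is a hypothetical name introduced only for illustration; it is not part of Python. It finds the first index at which two strings differ, and ord() then shows the code points that decide the comparison.

```python
def first_difference(left, right):
    """Return the first index where the strings differ, or None if there is none."""
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index
    return None  # one string is a prefix of the other, or they are equal


left, right = "Karlos", "karlos"
index = first_difference(left, right)
print(index)                                # 0
print(ord(left[index]), ord(right[index]))  # 75 107
print(left < right)                         # True, because 75 < 107
print("Karl" < "Karlos")                    # True: a prefix sorts before the longer string
```

Only the first differing position matters; the remaining characters are never consulted once a difference has been found.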

Method 2: Using the is and is not (identity) operators

The == operator compares the values of its two operands and determines whether they are equal. The identity operator is, by contrast, checks whether both operands refer to the same object in memory. The same distinction applies to the != and is not operators.

Example:

val1 = "Karlos"
val2 = "Karlos"
val3 = val1
valn = "karlos"

print("The ID of val1 is:", hex(id(val1)))
print("The ID of val2 is:", hex(id(val2)))
print("The ID of val3 is:", hex(id(val3)))
print("The ID of valn is:", hex(id(valn)))
print(val1 is val1)
print(val1 is val2)
print(val1 is val3)
print(valn is val1)

Output:

The ID of val1 is: 0x21d012c4f70
The ID of val2 is: 0x21d012c4f70
The ID of val3 is: 0x21d012c4f70
The ID of valn is: 0x21d012c7cb0
True
True
True
False

Explanation:

Here, we compare strings using the identity operator. Four variables holding string values are declared: val1 and val2 are each assigned the literal "Karlos", val3 is assigned val1 itself, and valn holds the different string "karlos". To display the identity of the object behind each variable, we combine id() with hex().

The first three IDs are identical. val3 is simply another name for the object that val1 refers to, and in this run the interpreter reused a single object for the two equal literals to save memory. As a result, every is comparison among them prints True. valn holds a different value, so it is a separate object with a different ID, and valn is val1 prints False. Note that reusing one object for equal strings is an interpreter optimization rather than a guarantee, so is is best reserved for checking whether two names refer to the same object, while == should be used to check whether two strings have the same value.
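The difference between value equality and identity becomes clearer when an equal string is assembled at runtime instead of being written as a literal. The following is a minimal sketch; the variable names are illustrative, and the result of the third comparison depends on the interpreter, so it is shown without a guaranteed value.

```python
import sys

literal = "Karlos"
alias = literal                  # a second name for the same object
built = "".join(["Kar", "los"])  # equal text, assembled at runtime

print(literal == built)  # True: the values match
print(literal is alias)  # True: both names refer to one object
print(literal is built)  # implementation-dependent; typically False in CPython

# sys.intern() returns one canonical object per distinct string value
print(sys.intern(built) is sys.intern(literal))  # True
```

Because the third result cannot be relied on, value comparisons are safer with ==, keeping is for cases where object identity is really the question.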

Method 3: Case-insensitive comparison

The previous methods require the strings to match exactly, including letter case. To compare strings case-insensitively, we can normalize both operands with the lower() or upper() method before comparing them. Both are methods of Python's string objects: lower() returns a copy of the string with all cased characters converted to lowercase, and upper() returns a copy converted to uppercase.

Example:

listOfCities = ["Mumbai", "Bengaluru", "Noida"]
currCity = "noiDa"
for loc in listOfCities:
    print("Case-Insensitive Comparison: %s with %s: %s" % (loc, currCity, loc.lower() == currCity.lower()))

Output:

Case-Insensitive Comparison: Mumbai with noiDa: False
Case-Insensitive Comparison: Bengaluru with noiDa: False
Case-Insensitive Comparison: Noida with noiDa: True

Explanation:

This program defines a list of three city names and a separate variable, currCity, that stores the string "noiDa". It then iterates through the list (the listOfCities variable) and checks whether each entry matches currCity. Both strings are converted to lowercase with lower() before the == operator compares them, so differences in letter case no longer affect the result.
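For text beyond basic English letters, Python's str.casefold() method is a more aggressive alternative to lower() that is intended for caseless matching. The sketch below wraps it in a helper named equals_ignore_case(), a hypothetical name chosen for this example.

```python
def equals_ignore_case(left, right):
    """Compare two strings without regard to case, using str.casefold()."""
    return left.casefold() == right.casefold()


print(equals_ignore_case("Noida", "noiDa"))     # True
print("Straße".lower() == "STRASSE".lower())    # False: lower() keeps the 'ß'
print(equals_ignore_case("Straße", "STRASSE"))  # True: casefold() maps 'ß' to 'ss'
```

For plain ASCII city names such as those in the example above, lower() and casefold() give the same result.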

Method 4: Utilizing a user-defined function

In addition to the methods above, we can write our own function with the def keyword and compare strings by whatever rule we need, examining each character with relational operators. The function below accepts the two strings to compare as parameters.

Example:

def strcmpr(strg, strgg):
    # Count the digit characters ("0" to "9") in each string
    cnt1 = 0
    cnt2 = 0
    for ch in strg:
        if "0" <= ch <= "9":
            cnt1 += 1
    for ch in strgg:
        if "0" <= ch <= "9":
            cnt2 += 1
    # The strings "match" when they contain the same number of digits
    return cnt1 == cnt2

print('Compare String 246 and 2468:', strcmpr("246", "2468"))
print('Compare String KARLOS and karlos:', strcmpr("KARLOS", "karlos"))

Output:

Compare String 246 and 2468: False
Compare String KARLOS and karlos: True

Explanation:

Writing a custom function is useful when strings should be compared by a rule other than exact equality. Here, strcmpr() walks through each string, counts the characters that are digits ("0" to "9"), and returns True when both strings contain the same number of digits. "246" has three digits and "2468" has four, so the first call returns False. "KARLOS" and "karlos" both contain zero digits, so the second call returns True. Letter case plays no part in the result.
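A user-defined function can also apply the relational operators to each pair of characters directly, instead of counting digits. The sketch below is one possible way to do that; compare_chars() is a hypothetical name introduced here, and its -1/0/1 return convention is an assumption borrowed from comparison functions in other languages.

```python
def compare_chars(left, right):
    """Return -1, 0, or 1 when left sorts before, equal to, or after right."""
    for a, b in zip(left, right):
        if a != b:
            return -1 if a < b else 1
    # All shared positions match, so the shorter string sorts first.
    return (len(left) > len(right)) - (len(left) < len(right))


print(compare_chars("246", "2468"))       # -1
print(compare_chars("KARLOS", "karlos"))  # -1
print(compare_chars("Karlos", "Karlos"))  # 0
```

This version is case-sensitive, so "KARLOS" and "karlos" are reported as different, unlike the digit-counting function above.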

Method 5: Using Regular Expressions

A regular expression, or regex, describes a pattern of characters to search for in text. Regular expressions can therefore be used to check whether a string contains or matches a given pattern, rather than comparing it against one exact value. Python provides regular expressions through the re module. In this example, we use the module's compile() function to build a pattern object and then test each string against it.

Example:

import re

stateList = ["Madhya Pradesh", "Tamil Nadu", "Uttar Pradesh", "Punjab"]
# Matches "Pradesh" or "pradesh" anywhere in the string
pattern = re.compile(r"[Pp]radesh")
for loc in stateList:
    if pattern.search(loc):
        print("%s is matching with the search pattern" % loc)

Output:

Madhya Pradesh is matching with the search pattern
Uttar Pradesh is matching with the search pattern

Explanation:

Here, the re (regular expression) module is imported first. The names of four states are then defined in a list. Next, re.compile() builds a pattern that matches the text "Pradesh" with its first letter in either uppercase or lowercase. The for loop then traverses the stateList list, and for every entry in which pattern.search() finds the pattern, it prints the message "<state> is matching with the search pattern", where <state> is the matching entry.
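The re module can also ignore case for the whole pattern and can require that the entire string match, not just part of it. The following sketch uses the re.IGNORECASE flag and re.fullmatch(), both part of the standard library; the sample data is adapted from the example above and is only illustrative.

```python
import re

state_list = ["Madhya Pradesh", "Tamil Nadu", "uttar pradesh", "Punjab"]

# re.IGNORECASE makes every letter in the pattern match either case
pattern = re.compile(r"pradesh", re.IGNORECASE)
matches = [state for state in state_list if pattern.search(state)]
print(matches)  # ['Madhya Pradesh', 'uttar pradesh']

# re.fullmatch() succeeds only when the whole string matches the pattern
print(bool(re.fullmatch(r"karlos", "KARLOS", flags=re.IGNORECASE)))     # True
print(bool(re.fullmatch(r"karlos", "Karlos Jr", flags=re.IGNORECASE)))  # False
```

search() answers "does the string contain this pattern?", while fullmatch() is closer to an equality check.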

Conclusion

Of all these approaches, the relational operators are the most direct and efficient way to compare string values. The identity operators are appropriate only when you need to know whether two names refer to the same object. Some coding tests or tasks may require case-insensitive checks or pattern matching, in which case Methods 3 and 5 are the ones to use. Keep in mind, however, that Methods 3 and 5 do extra work (case conversion or pattern matching) and are generally less efficient than a plain operator comparison. Method 4 (the user-defined approach) suits cases where you want to compare strings by a custom rule, such as the digit count in the example.