Overview

The term “glob” refers to matching file paths against patterns that follow the wildcard-expansion rules of UNIX shells.

In Python, the glob module does the same job: it finds all the path names on a system that match a given pattern. The pattern can target anything the file names have in common, such as a file extension, a file-name prefix, or some other shared part of the name.

Introduction to glob Module in Python

The glob module in Python does not require separate installations and comes with every default Python installation.

Although glob ships with the default Python installation, it still has to be imported before use:

import glob

After the import, we have to refer to every function and method in this module with the prefix glob.

Let's see how to use glob in Python.

This basic example uses a function inside the glob module that is itself called glob().

import glob
files = glob.glob("./glob/home/sample_files/*") # Returns a list of matching paths
print(files)

This snippet prints the paths of everything inside the sample_files folder, which is located in the home folder of the glob directory.

For example, if the sample_files folder contains two text files named random_text_1.txt and random_text_2.txt, the paths of both files are printed. Here, files is an ordinary variable that holds the list of path strings returned by glob.glob(); no object of a glob class is created.
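The following is a minimal, self-contained sketch of the same idea. Because the folder in the snippet above may not exist on your machine, it builds a temporary folder first; the temporary directory and the variable names sample_dir and file_names are assumptions made for illustration only.

```python
import glob
import os
import tempfile

# Build a throwaway folder so the example does not depend on your file system.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_dir = os.path.join(tmp_dir, "sample_files")  # hypothetical folder
    os.makedirs(sample_dir)
    for name in ("random_text_1.txt", "random_text_2.txt"):
        with open(os.path.join(sample_dir, name), "w") as handle:
            handle.write("sample text\n")

    # glob.glob() returns a plain list of matching paths (as strings).
    files = glob.glob(os.path.join(sample_dir, "*"))
    # Keep only the file names and sort them, since glob order is arbitrary.
    file_names = sorted(os.path.basename(path) for path in files)

print(type(files))
print(file_names)
```

The names are sorted before printing because the order of the list returned by glob() is not guaranteed.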

Glob Module Functions

Now, we will discuss the other functions of the glob module and see how they work inside a Python program. We will also learn how these functions help with pattern matching. The glob module provides the following functions for filename pattern matching:

  1. iglob()
  2. glob()
  3. escape()

We will briefly discuss each of these functions and then use it in an example program that prints the file names matching a pattern we pass to the function.

iglob() Function

The iglob() function of the glob module returns an iterator instead of a list. When called, it creates a Python generator that yields the matching file names one at a time, so we can list the files under a given directory without storing all of the file names in memory simultaneously.

Syntax: Following is the syntax for using the iglob() function of glob module inside a Python program:

iglob(pathname, *, recursive=False)  

As the syntax shows, iglob() takes two parameters, and the '*' between them is not a parameter at all:

(i) pathname: This is the required parameter. It is a string containing the pattern that file names are matched against, optionally preceded by the directory to search (for example, "*.py" or "docs/*.txt"). A relative pattern is resolved against the current working directory. The pattern may use the shell-style wildcards '*', '?', and '[...]', but it does not have to start with '*'.

(ii) recursive: This is an optional parameter that takes a bool value (True or False) and defaults to False. When it is True, the pattern "**" matches any files and zero or more directories and subdirectories.

(iii) '*': The bare '*' in the signature is Python syntax rather than a parameter. It marks every parameter after it as keyword-only, so recursive has to be passed by name, as in iglob("**/*.py", recursive=True).
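To illustrate the recursive parameter, here is a small sketch that compares a flat search with a recursive one. The folder layout (a top.py file and a pkg subfolder containing nested.py) is hypothetical and is created in a temporary directory just for this example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical layout: one script at the top level, one in a subfolder.
    os.makedirs(os.path.join(tmp_dir, "pkg"))
    for rel_path in ("top.py", os.path.join("pkg", "nested.py")):
        with open(os.path.join(tmp_dir, rel_path), "w") as handle:
            handle.write("# placeholder\n")

    # Default (recursive=False): "*.py" only matches directly inside tmp_dir.
    flat = sorted(
        os.path.relpath(path, tmp_dir)
        for path in glob.iglob(os.path.join(tmp_dir, "*.py"))
    )

    # recursive=True: "**" matches zero or more directory levels.
    deep = sorted(
        os.path.relpath(path, tmp_dir)
        for path in glob.iglob(os.path.join(tmp_dir, "**", "*.py"), recursive=True)
    )

print(flat)
print(deep)
```

Note that recursive is passed by name, which the keyword-only marker in the signature requires.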

Now, let's use this iglob() function in an example program so that we can understand its implementation and function in a better way.

Example 1:

Look at the following Python program with the implementation of iglob() function:

# Import glob module in the program
import glob as gb
# Create a generator that yields the matching file names
inVar = gb.iglob("*.py") # Set pattern in iglob() function
# Print the class type of the variable
print(type(inVar))
# Print the names of all files that matched the pattern
print("List of the all the files in the directory having extension .py: ")
for py in inVar:
    print(py)

Output:

<class 'generator'>
List of the all the files in the directory having extension .py:
adding.py
changing.py
code#1.py
code#2.py
code-3.py
code-4.py
code.py
code37.py
code_5.py
code_6.py
configuring.py


Explanation:

We first imported the glob module so that we can use its iglob() function in the program. After that, we initialized a variable with the result of iglob(), passing in the pattern to match file names against. The pattern here is "*.py", i.e., all files with a .py extension in the current directory. Next, we printed the class type of the variable. Finally, we used a for loop over the variable to print every file name that iglob() matched against the pattern.

As we can see in the output, the program first printed the class type of the variable (a generator) and then the names of the files with the ".py" extension. The order in which the names appear is not guaranteed and may differ on your system.
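Because iglob() returns a generator, its results can be consumed only once. The sketch below demonstrates this under assumed conditions: the file names and the temporary directory are hypothetical and exist only for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("adding.py", "changing.py", "notes.txt"):  # hypothetical files
        with open(os.path.join(tmp_dir, name), "w") as handle:
            handle.write("")

    matches = glob.iglob(os.path.join(tmp_dir, "*.py"))

    # First pass: the iterator yields each matching path once.
    first_pass = sorted(os.path.basename(path) for path in matches)

    # Second pass: the same iterator is now exhausted and yields nothing.
    second_pass = list(matches)

print(first_pass)
print(second_pass)
```

If the matches are needed more than once, either call iglob() again or collect them into a list.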

glob() Function

With the help of the glob() function, we can also get the files that match a specific pattern (which we pass to the function). The glob() function returns a list of strings, and each string is a path that follows the path specification we gave in the pattern. It finds the same matches as iglob(), but it collects all of them into a list up front instead of yielding them one at a time.

Syntax:

Following is the syntax for using the glob() function of the glob module inside a Python program:

glob(pathname, *, recursive=False)

As the syntax shows, glob() takes the same parameters as iglob(), with the same meaning: a required pathname and an optional keyword-only recursive flag that defaults to False. Now, let's use the glob() function in an example program so that we can understand how it works.

Example 2: Look at the following Python program with the implementation of glob() function:

# Import glob module in the program
import glob as gb
# Store the list of matching file names in a variable
genVar = gb.glob("*.py") # Set pattern in glob() function
# Print the names of all files that matched the pattern
print("List of the all the files in the directory having extension .py: ")
for py in genVar:
    print(py)

Output:

List of the all the files in the directory having extension .py:
adding.py
changing.py
code#1.py
code#2.py
code-3.py
code-4.py
code.py
code37.py
code_5.py
code_6.py
configuring.py

As we can see in the above example program, we followed the same logic as in Example 1 with the iglob() function. The program printed all the file names that match the pattern we set inside the glob() function.
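The relationship between glob() and iglob() can be checked with a short sketch. The file names and the temporary directory below are hypothetical; the point is only that both functions find the same matches, while glob() hands them back as a list.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("code.py", "code37.py", "readme.txt"):  # hypothetical files
        with open(os.path.join(tmp_dir, name), "w") as handle:
            handle.write("")

    pattern = os.path.join(tmp_dir, "*.py")
    as_list = glob.glob(pattern)       # all matches collected in a list
    as_iterator = glob.iglob(pattern)  # matches produced one at a time

    is_list = isinstance(as_list, list)
    match_count = len(as_list)  # len() works on the list, not on the iterator
    # Sort both sides because the order of the results is arbitrary.
    same_matches = sorted(as_list) == sorted(as_iterator)

print(is_list, match_count, same_matches)
```

A list is convenient when the matches need to be counted, indexed, or traversed several times; the iterator is the lighter choice for a single pass over many files.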

escape() Function

The escape() function escapes all the special characters of a path name, namely '?', '*', and '['. It is handy for locating files whose names may contain these characters, because the escaped string is then matched as an arbitrary literal string instead of being interpreted as a pattern.

Syntax:

Following is the syntax for using the escape() function of glob module inside a Python program:

escape(pathname)  

The escape() function only returns a string, so it should be combined with either the glob() or the iglob() function to get the matching file names. Now, let's use the escape() function in an example program so that we can understand how it works.

Example 3: Look at the following Python program with the implementation of escape() function:

# Import glob module in the program
import glob as gb
# Characters to look for in the file names
charSeq = "-_#"
print("Following is the list of filenames that match the special character sequence of escape function: ")
# Outer loop: go through the characters one by one
for splChar in charSeq:
    # Build the pattern for glob(); escape() makes the character literal
    escSet = "*" + gb.escape(splChar) + "*" + ".py"
    # Inner loop: print the filenames matched by glob()
    for py in gb.glob(escSet):
        print(py)

Output:

Following is the list of filenames that match the special character sequence of escape function:
code-3.py
code-4.py
code_5.py
code_6.py
code#1.py
code#2.py


Explanation:

We first defined a sequence of characters to search for in the file names. We then used a nested for loop: the outer loop passes each character through the escape() function and builds a pattern from the result, and the inner loop passes that pattern to the glob() function and prints every matching file name. Note that '-', '_', and '#' have no special meaning as standalone characters in a glob pattern, so escape() returns them unchanged here; the function only makes a difference for '?', '*', and '['.

As we can see in the output, we get all the file names that contain one of the characters we defined in the program.
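The effect of escape() is easier to see with a character that glob does treat as special. The sketch below uses square brackets; the file names report[1].txt and report1.txt are hypothetical and are created in a temporary directory for illustration.

```python
import glob
import os
import tempfile

# escape() wraps each special character in brackets so it is matched literally.
escaped_name = glob.escape("report[1].txt")
print(escaped_name)  # report[[]1].txt

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("report[1].txt", "report1.txt"):  # hypothetical files
        with open(os.path.join(tmp_dir, name), "w") as handle:
            handle.write("")

    raw_path = os.path.join(tmp_dir, "report[1].txt")

    # Unescaped: "[1]" is a character class, so the pattern matches report1.txt.
    unescaped_names = [os.path.basename(p) for p in glob.glob(raw_path)]

    # Escaped: the brackets are literal, so the pattern matches report[1].txt.
    escaped_names = [os.path.basename(p) for p in glob.glob(glob.escape(raw_path))]

print(unescaped_names)
print(escaped_names)
```

This is the typical use case: a file name that comes from user input or from a directory listing is escaped before it is placed inside a larger pattern.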

Conclusion

Having used the functions of the glob module, i.e., glob(), iglob(), and escape(), we can now understand what the module does and how its functions work. We have also seen how helpful the glob module is for filename pattern matching and how we can get all the files that follow a specific pattern.