CSV Files in Python
Introduction
Comma-separated values (CSV) is a simple file format for arranging tabular data. It stores tabular data, such as a spreadsheet or database table, in plain text and serves as a common format for data interchange. Each line of the file is one row, and a delimiter splits that row into columns, which is why a csv file opens as a grid of rows and columns in a spreadsheet program such as Excel.
Reading CSV files
Python offers several ways to read csv files. This section outlines a few of them.
Using the csv.reader() function
Python's csv.reader() function reads a csv file row by row. It returns each row of the file as a list of strings, one string per column.
The following information is in the text file python.txt, which by default uses the comma (,) as a delimiter:
name,department,birthday month |
Example
import csv |
Output:
Column names are name, department, birthday month |
Using the open() function, the file "python.csv" was opened in the code above. To read the file, we used the csv.reader() function, which produces an iterable reader object. The data were contained in the reader object, and we used a for loop to iteratively print the contents of each row.
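The steps described above can be sketched as follows. This is a minimal illustration rather than the article's original listing: only the header row comes from the input file shown earlier, the two data rows are sample values, and the file is created in a temporary directory so the snippet runs on its own.

```python
import csv
import os
import tempfile

# Only the header row is taken from the article; the data rows are sample values.
sample = (
    "name,department,birthday month\n"
    "Parker,Accounting,November\n"
    "Smith,IT,October\n"
)

rows = []
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "python.txt")
    with open(path, "w", newline="") as out_file:
        out_file.write(sample)

    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="") as csv_file:
        reader = csv.reader(csv_file, delimiter=",")
        header = next(reader)  # the first row holds the column names
        print("Column names are " + ", ".join(header))
        for row in reader:  # every remaining row is a list of strings
            rows.append(row)
            print(row)
```

Opening the file with newline="" is the approach the csv module documentation recommends, since it prevents line endings inside quoted fields from being altered.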
Read a CSV into a Dictionary
Rather than dealing with a list of separate strings, we can use the DictReader class to read each row of the CSV file directly into a dictionary keyed by column name.
A reminder of the contents of our input file, python.txt
name,department,birthday month |
Example
import csv |
Output:
The Column names are as follows name, department, birthday month |
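A short sketch of the DictReader approach is given here. It assumes the same header as the input file above; the data rows are sample values, and io.StringIO stands in for the file on disk so the snippet is self-contained.

```python
import csv
import io

# Only the header row is taken from the article; the data rows are sample values.
sample = (
    "name,department,birthday month\n"
    "Parker,Accounting,November\n"
    "Smith,IT,October\n"
)

reader = csv.DictReader(io.StringIO(sample))
records = list(reader)  # one dictionary per data row

print("The column names are as follows " + ", ".join(reader.fieldnames))
for record in records:
    # Values are looked up by column name instead of by position.
    print(record["name"], record["department"], record["birthday month"])
```

Because DictReader takes the keys from the first row, the code keeps working if the column order in the file changes.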
Pandas reading csv files
Pandas is an open-source library built on top of NumPy. It offers fast data preparation, data cleaning, and analysis.
Reading a csv file into a pandas DataFrame is quick and simple. Opening, parsing, and loading the file into a DataFrame takes only a line or two of code.
Here, we're going to read a slightly more challenging file named hrdata.csv, which contains information about the company's employees.
Name,Hire Date,Salary,Leaves Remaining |
Example
import pandas |
Output:
Name Hire Date Salary Leaves Remaining |
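The following sketch shows one way this could look with pandas.read_csv(). The column names match hrdata.csv as shown above, but the employee rows are invented placeholders, and the choice to index by Name and parse Hire Date as a date is an assumption rather than something the article specifies.

```python
import io

import pandas

# Column names follow hrdata.csv; the rows below are hypothetical placeholders.
sample = (
    "Name,Hire Date,Salary,Leaves Remaining\n"
    "Alice Example,2020-01-15,50000,10\n"
    "Bob Example,2019-07-01,65000,8\n"
)

# index_col and parse_dates are optional; they are used here as an assumption
# about how the data might be consumed.
df = pandas.read_csv(
    io.StringIO(sample),
    index_col="Name",
    parse_dates=["Hire Date"],
)
print(df)
```

With a real file, the io.StringIO object would be replaced by the path to hrdata.csv.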
Writing CSV Files
The csv module can also write CSV files, whether new or existing. As with reading, there are two options: the csv.writer() function and the csv.DictWriter class.
A writer object provides the writerow() and writerows() methods. writerow() writes a single row, whereas writerows() writes several rows at once.
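The difference between the two methods can be seen in a small sketch. The rows are sample values, and the output goes to an in-memory io.StringIO buffer instead of a file so the result is easy to inspect.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer)

# writerow() writes exactly one row.
writer.writerow(["name", "department", "birthday month"])

# writerows() takes an iterable of rows and writes them all.
writer.writerows([
    ["Parker", "Accounting", "November"],
    ["Smith", "IT", "October"],
])

written = buffer.getvalue()
print(written)
```

When writing to a real file, open it with newline="" so the writer's own line terminator is not translated a second time.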
Dialects
A dialect is a class that groups a set of formatting parameters so they can be defined once, stored, and reused across readers and writers. It supports several attributes; the most commonly used are:
- Dialect.delimiter: The character that separates one field from the next. The default is the comma (,).
- Dialect.quotechar: The character used to quote fields that contain special characters, such as the delimiter or the quote character itself. The default is the double quote (").
- Dialect.lineterminator: The string the writer uses to end each row. The default is "\r\n".
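As an illustration of these attributes, the sketch below registers a custom dialect and uses it for both writing and reading. The dialect name "pipes", the pipe delimiter, and the sample rows are all made up for this example.

```python
import csv
import io

# "pipes" is a hypothetical dialect name chosen for this example.
csv.register_dialect(
    "pipes",
    delimiter="|",
    quotechar='"',
    lineterminator="\n",
    quoting=csv.QUOTE_MINIMAL,
)

buffer = io.StringIO()
writer = csv.writer(buffer, dialect="pipes")
writer.writerow(["name", "department", "birthday month"])
# The second field contains the delimiter, so the writer quotes it.
writer.writerow(["Parker", "Accounting|Finance", "November"])
text = buffer.getvalue()
print(text)

# Reading with the same dialect recovers the original fields.
parsed = list(csv.reader(io.StringIO(text), dialect="pipes"))
print(parsed)

csv.unregister_dialect("pipes")  # clean up the global registry
```

Registering the dialect once keeps the formatting settings in one place, so every reader and writer that names it stays consistent.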
Let's create a CSV file with the data below.
data = [{'Rank': 'B', 'first_name': 'Parker', 'last_name': 'Brian'}, |
Example:
import csv |
Output:
Writing complete |
It creates a file called "Python.csv" that contains the following:
first_name,last_name,Rank |
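A listing along the following lines would produce that kind of file. It is a sketch, not the article's original code: the first record comes from the data shown above, the second record is a hypothetical addition, and the file is written to a temporary directory so that nothing is left behind.

```python
import csv
import os
import tempfile

data = [
    {"Rank": "B", "first_name": "Parker", "last_name": "Brian"},
    # Hypothetical second record, added only for illustration.
    {"Rank": "A", "first_name": "Smith", "last_name": "Rodriguez"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Python.csv")
    with open(path, "w", newline="") as csv_file:
        # fieldnames fixes the column order, independent of dictionary key order.
        writer = csv.DictWriter(
            csv_file, fieldnames=["first_name", "last_name", "Rank"]
        )
        writer.writeheader()
        writer.writerows(data)
    print("Writing complete")

    # Read the raw text back to inspect what was written.
    with open(path, newline="") as csv_file:
        content = csv_file.read()

print(content)
```

The fieldnames list explains why the header reads first_name,last_name,Rank even though Rank is the first key in each dictionary.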
Write a dictionary to a CSV file
We can also use the DictWriter class to write dictionaries directly to a CSV file, with each dictionary becoming one row.
The following information is located in a file called "python.csv":
Parker, Accounting, November
Smith, IT, October
Example:
import csv |
Output:
import csv |
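One way to check a DictWriter result is to write the rows and then read them back. In the sketch below the field names are hypothetical, since the data shown for python.csv has no header row, and an in-memory buffer replaces the file.

```python
import csv
import io

# Hypothetical field names; the data shown for python.csv has no header row.
fieldnames = ["name", "department", "birthday month"]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerow(
    {"name": "Parker", "department": "Accounting", "birthday month": "November"}
)
writer.writerow(
    {"name": "Smith", "department": "IT", "birthday month": "October"}
)

# Rewind and read the text back to confirm the rows survive the round trip.
buffer.seek(0)
round_trip = list(csv.DictReader(buffer))
for record in round_trip:
    print(record)
```

Each dictionary passed to writerow() must use keys from fieldnames; by default an unexpected key raises a ValueError.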
Create CSV Files Utilizing Pandas
Pandas is an open-source library built on top of NumPy. It offers fast data preparation, data cleaning, and analysis.
Writing a CSV file with pandas is as simple as reading one. The data must first be held in a DataFrame, a two-dimensional, heterogeneous tabular data structure with three primary parts (data, rows, and columns). Here, we'll again use the slightly more challenging file hrdata.csv, which holds information about the company's employees.
Name,Hire Date,Salary,Leaves Remaining |
Example:
import pandas |
Output:
Employee, Hired, Salary, Sick Days |
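A possible version of this workflow is sketched below. The input column names match hrdata.csv and the renamed columns match the output header shown above, but the employee rows are invented placeholders, and the use of rename() is an assumption about how the new header was produced.

```python
import io

import pandas

# Column names follow hrdata.csv; the rows below are hypothetical placeholders.
sample = (
    "Name,Hire Date,Salary,Leaves Remaining\n"
    "Alice Example,2020-01-15,50000,10\n"
    "Bob Example,2019-07-01,65000,8\n"
)

df = pandas.read_csv(io.StringIO(sample))

# Assumed step: relabel the columns to match the output header.
df = df.rename(
    columns={
        "Name": "Employee",
        "Hire Date": "Hired",
        "Leaves Remaining": "Sick Days",
    }
)

out = io.StringIO()
df.to_csv(out, index=False)  # index=False omits the row-number column
csv_text = out.getvalue()
print(csv_text)
```

To write to disk instead, pass a file path to to_csv() in place of the buffer.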