Surprising Things You Can Do with Python’s csv Module
Think it's just for reading simple tables? See what else you can do with this Python standard library module.

Image by Author | Ideogram.ai
CSV, or comma-separated values, is a file format used to store tabular data. Each line represents a data entry, and commas separate the individual fields within the data. It's one of the most common file extensions for data and one of the simplest formats for data exchange within professional environments.
If you work with data in Python, you have probably read or loaded a file with the csv module. Usually, that is all we do with it: load the data and move on to other tasks.
For example, I read the following CSV file of Social Sentiment Data from Kaggle with the csv module and showed all the columns.
import csv

with open('sentimentdataset.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)
    print("Columns:", header)
The output looks like the following:
Columns: ['', 'Unnamed: 0', 'Text', 'Sentiment', 'Timestamp', 'User', 'Platform', 'Hashtags', 'Retweets', 'Likes', 'Country', 'Year', 'Month', 'Day', 'Hour']
However, there is much more you can do with the csv module that you might not know about. In this article, we will explore several surprising things it can do.
1. Auto-Detect Format
The csv module is intended to work with files in comma-separated format; however, using the Sniffer class, you can detect how the fields in a file are actually separated. In other words, you can detect the data structure (dialect) before you read the file in full.
For example, here is how we try to detect the dialect with the csv module.
import csv

with open('sentimentdataset.csv', newline='', encoding='utf-8') as f:
    sample = f.read(2048)  # a small sample is enough for detection

# Candidate delimiters are passed as a string of characters
dialect = csv.Sniffer().sniff(sample, delimiters=',;\t')
print(f"Detected delimiter: {dialect.delimiter!r}")
The result will be like the following output.
Detected delimiter: ','
In the code above, we pass a sample of the first 2,048 characters of the file along with the delimiters we want to consider. The result is the delimiter that the module detected in the file.
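To see the detection work on something other than commas, here is a minimal sketch that runs the sniffer on a small in-memory string. The sample data and its column names are made up for illustration and are not part of the Kaggle dataset.

```python
import csv
import io

# Hypothetical semicolon-separated sample (illustrative data only)
sample = "name;city;score\nAna;Lisbon;9.5\nBen;Oslo;7.0\n"

# Restrict the search to the delimiters we consider plausible
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t")
print(repr(dialect.delimiter))

# The detected dialect can be passed straight to a reader
rows = list(csv.reader(io.StringIO(sample), dialect))
print(rows[1])
```

If the sniffer cannot settle on a delimiter, sniff raises csv.Error, so in a pipeline it may be worth wrapping the call in a try/except and falling back to a default dialect.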
2. Header Detection
The csv module can detect not only the file format but also whether the file contains a header row.
We can do the detection with the following code.
# Reuses the sample read in the previous section
has_header = csv.Sniffer().has_header(sample)
print("Header detected?", has_header)
The result is shown in the output below.
Header detected? True
It seems simple, but CSV files often arrive without a header row, which makes the columns hard to interpret. Keep in mind that has_header relies on heuristics, so treat its result as a hint rather than a guarantee. It is still a useful check in a data pipeline for catching problems early when reading a file.
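One way to use this check is to decide whether the first row should be consumed as a header. The sketch below assumes small in-memory samples; read_rows is a hypothetical helper name, and the data is invented for illustration.

```python
import csv
import io

def read_rows(text):
    """Return (header, rows); header is None when no header row is detected."""
    sniffer = csv.Sniffer()
    dialect = sniffer.sniff(text)
    reader = csv.reader(io.StringIO(text), dialect)
    # has_header is a heuristic, so its answer is only a hint
    header = next(reader) if sniffer.has_header(text) else None
    return header, list(reader)

# Hypothetical samples: one with a header row, one without
with_header = "name,age\nAlice,30\nBob,25\n"
without_header = "Alice,30\nBob,25\nCarol,41\n"

header_a, rows_a = read_rows(with_header)
header_b, rows_b = read_rows(without_header)
print(header_a, len(rows_a))
print(header_b, len(rows_b))
```

The heuristic compares the first row against the types and lengths of the values below it, which is why a numeric column helps it tell a header apart from data.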
3. Reading Data as a List
When we read the file with the csv module, we can choose how each row is structured. The simplest option is csv.reader, which returns every row as a list of strings, as in the following code.
with open('sentimentdataset.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f, dialect)
    header = next(reader)
    for i, row in enumerate(reader):
        if i >= 1:
            break
        print(row)
The result is shown in the output below.
['0', '0', ' Enjoying a beautiful day at the park! ', ' Positive ', '2023-01-15 12:30:00', ' User123 ', ' Twitter ', ' #Nature #Park ', '15.0', '30.0', ' USA ', '2023', '1', '15', '12']
You can see that each data row is now presented as a list and can be processed for any further data work.
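Note that every field in those lists is a string, including values that look numeric. A minimal sketch of converting a column by hand, using a made-up two-column sample:

```python
import csv
import io

# Hypothetical sample; the column names are illustrative only
data = "item,price\npen,1.50\nbook,12.00\n"

reader = csv.reader(io.StringIO(data))
header = next(reader)

# Each row is a list of str, so numeric fields need explicit conversion
prices = [float(row[1]) for row in reader]
print(header)
print(prices)
print(sum(prices))
```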
4. Map Column Names to Values
Using csv.DictReader, we can read each row as a dictionary instead of a list. Each column name is mapped to its corresponding value, so we can access a field with the column name as the key.
For example, here is how we could read the Text and Sentiment columns by name:
with open('sentimentdataset.csv', newline='', encoding='utf-8') as f:
    dict_reader = csv.DictReader(f, dialect=dialect)
    for i, row in enumerate(dict_reader):
        if i >= 2:
            break
        print(row['Text'], row['Sentiment'])
The result is shown in the output below.
Enjoying a beautiful day at the park! Positive
Traffic was terrible this morning. Negative
The code above shows that we access each value in the data in a key-value relationship. This method allows us to process the data more flexibly.
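DictReader also works when the file has no header row, as long as you supply the column names yourself. The sketch below assumes a header-less sample; the names passed to fieldnames and the placeholder given to restval are choices made for this example.

```python
import csv
import io

# Hypothetical header-less data; the second row is missing its last field
data = "Alice,30,Paris\nBob,25\n"

reader = csv.DictReader(
    io.StringIO(data),
    fieldnames=["name", "age", "city"],  # assumed column names
    restval="unknown",                   # filler for missing trailing fields
)
records = list(reader)
print(records[0]["city"])
print(records[1]["city"])
```

Because fieldnames is given, the first line is treated as data rather than as a header.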
5. Transform CSV file into Another Format
The csv module is not only for reading files; its writer can also send the rows to a different destination or output format.
For example, you can write the rows into a gzip-compressed file.
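import csv, gzip

# Open both files in one with statement so they are closed properly
with open('sentimentdataset.csv', newline='', encoding='utf-8') as src, \
        gzip.open('sentiment.gz', 'wt', newline='', encoding='utf-8') as gz:
    writer = csv.writer(gz)
    # Reuses the dialect detected earlier to parse the source file
    for row in csv.reader(src, dialect=dialect):
        writer.writerow(row)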
You can even write the rows to standard output, as shown below.
import csv, sys

# Reuses the sample read earlier to detect the dialect
dialect = csv.Sniffer().sniff(sample, delimiters=',;\t')
writer = csv.writer(sys.stdout)
with open('sentimentdataset.csv', newline='', encoding='utf-8') as src:
    for row in csv.reader(src, dialect=dialect):
        writer.writerow(row)
Because csv.writer accepts any object with a write() method, you can point it at whichever destination and format your work requires.
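The same reader-to-writer pattern can also change the delimiter itself. Here is a minimal sketch that converts comma-separated rows to tab-separated ones in memory; convert_delimiter is a hypothetical helper, and the sample data is made up.

```python
import csv
import io

def convert_delimiter(src, dst, new_delimiter="\t"):
    """Copy CSV rows from src to dst, re-delimiting them on the way."""
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter=new_delimiter, lineterminator="\n")
    writer.writerows(reader)

# Hypothetical input with a comma inside a quoted field
source = io.StringIO('id,comment\n1,"hello, world"\n')
target = io.StringIO()
convert_delimiter(source, target)
print(repr(target.getvalue()))
```

The comma inside the quoted field survives as ordinary text, and the writer no longer needs to quote it because the comma is not the output delimiter.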
6. Quote Non-Numeric Values
In CSV files, fields can contain commas, quotes, or mixed data (text and numbers). Wrapping a value in double quotes marks it as a single quoted field, so anything inside it (even commas or line breaks) is treated as part of the value, not as a separator.
We can do the above using the following code.
import csv

INPUT = 'sentimentdataset.csv'
OUTPUT = 'quoted_nonnum.csv'

with open(INPUT, newline='', encoding='utf-8') as fin, \
        open(OUTPUT, 'w', newline='', encoding='utf-8') as fout:
    reader = csv.DictReader(fin)
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(['Text', 'Likes'])
    for row in reader:
        # DictReader yields strings; convert Likes so it is written unquoted
        # (assumes every Likes value parses as a number)
        writer.writerow([row['Text'], float(row['Likes'])])
In the code above, we select the Text and Likes columns. QUOTE_NONNUMERIC decides what to quote based on each value's Python type, and DictReader returns every field as a string, so we convert Likes to float to have it written unquoted while Text is quoted. This way, text values are quoted consistently and any commas inside them are not mistaken for separators.
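A small in-memory round trip makes the quoting behavior visible. This is a sketch with invented rows: the writer quotes a numeric-looking string but leaves a real float unquoted, and a reader using the same quoting mode turns unquoted fields back into floats.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
writer.writerow(["Text", "Likes"])
writer.writerow(["Great day, everyone!", 30.0])  # float is written unquoted
writer.writerow(["Still a string", "30.0"])      # str is written quoted
print(buffer.getvalue())

# Reading with the same mode converts unquoted fields to float
buffer.seek(0)
reader = csv.reader(buffer, quoting=csv.QUOTE_NONNUMERIC)
next(reader)  # skip the header row
rows = list(reader)
print(rows)
```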
Conclusion
As data professionals, we often use the Python csv module only to load CSV files. However, it can do more than that, including dialect and header detection, reading rows as lists or dictionaries, writing to other output formats, and controlling how values are quoted.
I hope this has helped!
Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.