Surprising Things You Can Do with Python’s csv Module

Think it's just for reading simple tables? See what else you can do with this Python standard library module.

By Cornellius Yudha Wijaya, KDnuggets Technical Content Specialist on May 21, 2025 in Python

Image by Author | Ideogram.ai

CSV, or comma-separated values, is a file format used to store tabular data. Each line represents a data entry, and commas separate the individual fields within the data. It's one of the most common file extensions for data and one of the simplest formats for data exchange within professional environments.

If you are a data professional who knows Python, you have almost certainly loaded data with the csv module. Usually, that's all we do with it: load the data and move on to other tasks.

For example, the code below reads a CSV file of Social Sentiment Data from Kaggle with the csv module and prints its column names.

import csv

# newline='' lets the csv module handle line endings itself
with open('sentimentdataset.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)  # the first row holds the column names
    print("Columns:", header)

The output looks like this:

Columns: ['', 'Unnamed: 0', 'Text', 'Sentiment', 'Timestamp', 'User', 'Platform', 'Hashtags', 'Retweets', 'Likes', 'Country', 'Year', 'Month', 'Day', 'Hour']

However, the csv module can do much more than that. In this article, we will explore several surprising things you can do with it.

1. Auto-Detect Format

The csv module is built around comma-separated files, but its Sniffer class can inspect a sample of the data and infer how the fields are separated. This lets you detect the file's structure (its dialect) before you read the whole file.

For example, here is how we try to detect the dialect with the csv module.

import csv

with open('sentimentdataset.csv', newline='', encoding='utf-8') as f:
    sample = f.read(2048)  # a small sample is enough for detection
    # delimiters is a string of candidate delimiter characters
    dialect = csv.Sniffer().sniff(sample, delimiters=',;\t')
    print(f"Detected delimiter: {dialect.delimiter!r}")

The result will be like the following output.

Detected delimiter: ','

In the code above, we pass the first 2,048 characters of the file as a sample, along with the candidate delimiters we want to consider. The result is the delimiter the module detected in the file.
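
To see the detection work on something other than commas, here is a minimal self-contained sketch that runs the Sniffer on small in-memory samples. The helper name detect_delimiter and the sample strings are made up for illustration; only csv.Sniffer().sniff() and csv.Error come from the standard library.

```python
import csv

def detect_delimiter(sample, candidates=",;\t"):
    """Return the delimiter the Sniffer infers, or None if it cannot decide."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # sniff() raises csv.Error when it cannot determine a delimiter
        return None

# Hypothetical samples, not taken from the sentiment dataset
semicolon_sample = "name;age;city\nAlice;30;Paris\nBob;25;Berlin\n"
tab_sample = "id\tname\n1\tAlice\n2\tBob\n"

print(repr(detect_delimiter(semicolon_sample)))
print(repr(detect_delimiter(tab_sample)))
```

Returning None instead of letting the exception propagate is only one possible design; in a pipeline you might prefer to fail loudly when the format cannot be determined.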

2. Header Detection

The csv module can detect not only the file format but also whether the file contains a header.

We can do the detection with the following code.

# sample is the text read from the file in the previous section
has_header = csv.Sniffer().has_header(sample)
print("Header detected?", has_header)

The result is shown in the output below.

Header detected? True

It seems simple, but CSV files often arrive without the header row you expect, which makes the data hard to interpret. Keep in mind that has_header is a heuristic and can be wrong, so treat its result as a warning signal rather than a guarantee. It is still a useful check in a data pipeline for catching mistakes when reading files.
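
As a rough illustration of how the check behaves, the sketch below runs has_header on two made-up samples, one with a header row and one without. The sample strings are assumptions for demonstration; real files with all-text columns may not be classified as cleanly.

```python
import csv

# Hypothetical samples: the second one is the first without its header row
with_header = "name,age\nAlice,30\nBob,25\nCarol,41\n"
without_header = "Alice,30\nBob,25\nCarol,41\n"

sniffer = csv.Sniffer()

# has_header compares the first row against the rows below it, so a text
# label sitting on top of a numeric column suggests a header
print("with header:", sniffer.has_header(with_header))
print("without header:", sniffer.has_header(without_header))
```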

3. Reading Data as a List

When we read a file with csv.reader, each row comes back as a list of strings, one item per field. The following code skips the header and prints the first data row.

with open('sentimentdataset.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f, dialect)
    header = next(reader)
    for i, row in enumerate(reader):
        if i >= 1: break
        print(row)

The result is shown in the output below.

['0', '0', ' Enjoying a beautiful day at the park!              ', ' Positive  ', '2023-01-15 12:30:00', ' User123      ', ' Twitter  ', ' #Nature #Park                            ', '15.0', '30.0', ' USA      ', '2023', '1', '15', '12']

You can see that each row is a list whose values are all strings, including the numeric fields and the padding whitespace from the file, so you may need to clean or convert them before further processing.
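
A minimal sketch of that cleanup step might look like the following. The clean_row helper and the sample text are hypothetical; the sample only mimics the padded fields shown in the output above.

```python
import csv
import io

# Made-up sample that mimics the padded fields in the dataset
raw = "Text,Sentiment,Likes\n Enjoying the park!   , Positive  ,30.0\n"

def clean_row(row):
    """Strip surrounding whitespace from every field in a row."""
    return [field.strip() for field in row]

reader = csv.reader(io.StringIO(raw))
header = next(reader)
rows = [clean_row(row) for row in reader]

# Fields are always strings, so numeric columns need an explicit conversion
likes = float(rows[0][header.index("Likes")])
print(rows[0], likes)
```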

4. Map Column Names to Values

With csv.DictReader, each row is returned as a dictionary instead of a list. Each column name is mapped to its corresponding value, so we can access a field using the column name as the key.

For example, here is how we could access the Text and Sentiment columns by name:

with open('sentimentdataset.csv', newline='', encoding='utf-8') as f:
    dict_reader = csv.DictReader(f, dialect=dialect)
    for i, row in enumerate(dict_reader):
        if i >= 2: break
        print(row['Text'], row['Sentiment'])

The result is shown in the output below.

Enjoying a beautiful day at the park!                Positive  
Traffic was terrible this morning.                   Negative

The code above shows that we access each value through its column name rather than its position. This keeps the code readable and means it does not break if the column order changes.
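
DictReader is also handy when a file has no header row at all, because you can supply the column names yourself through the fieldnames parameter. The sketch below uses made-up, headerless sample data; the column names are chosen to match the example above.

```python
import csv
import io

# Made-up, headerless sample data
raw = "Enjoying the park!,Positive\nTraffic was terrible.,Negative\n"

# Supplying fieldnames tells DictReader to treat the first line as data
reader = csv.DictReader(io.StringIO(raw), fieldnames=["Text", "Sentiment"])
records = list(reader)

for record in records:
    print(record["Text"], "->", record["Sentiment"])
```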

5. Transform CSV file into Another Format

The csv module is not only for reading files; its writer can also send the rows to a different destination or output format.

For example, you can write the rows to a gzip-compressed file.

import csv, gzip

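# Open both files in one with statement so the source file is closed too;
# dialect is the one detected in section 1
with open('sentimentdataset.csv', newline='', encoding='utf-8') as src, \
     gzip.open('sentiment.gz', 'wt', newline='', encoding='utf-8') as gz:
    writer = csv.writer(gz)
    for row in csv.reader(src, dialect=dialect):
        writer.writerow(row)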

You can even write the rows to standard output, as shown below.

import csv, sys

# sample is the text read from the file in section 1
dialect = csv.Sniffer().sniff(sample, delimiters=',;\t')
writer = csv.writer(sys.stdout)
with open('sentimentdataset.csv', newline='', encoding='utf-8') as src:
    for row in csv.reader(src, dialect=dialect):
        writer.writerow(row)

Because csv.writer accepts any object with a write() method, you can direct the output to whatever destination you need.
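
To check that a compressed file really holds the same rows, you can read it back with gzip.open in text mode. The following is a small round-trip sketch under stated assumptions: the rows and the file name sample.csv.gz are made up, and a temporary directory is used so nothing is left on disk.

```python
import csv
import gzip
import os
import tempfile

# Made-up rows; the comma inside the text field tests quoting on the way out
rows = [["Text", "Likes"], ["Hello, world", "30"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv.gz")  # hypothetical file name

    # 'wt' opens the compressed file in text mode for writing
    with gzip.open(path, "wt", newline="", encoding="utf-8") as gz:
        csv.writer(gz).writerows(rows)

    # 'rt' opens the same file in text mode for reading
    with gzip.open(path, "rt", newline="", encoding="utf-8") as gz:
        restored = list(csv.reader(gz))

print(restored == rows)
```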

6. Quote Non-Numeric Values

In CSV files, fields can contain commas, quotes, or mixed data (text and numbers). Wrapping a value in double quotes marks it as a single quoted string in the file, ensuring that anything inside (even commas or line breaks) is treated as part of the value, not as a separator.

We can do the above using the following code.

import csv

INPUT = 'sentimentdataset.csv'
OUTPUT = 'quoted_nonnum.csv'

with open(INPUT, newline='', encoding='utf-8') as fin, \
     open(OUTPUT, 'w', newline='', encoding='utf-8') as fout:

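    reader = csv.DictReader(fin)
    writer = csv.writer(fout, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow(['Text', 'Likes'])

    for row in reader:
        # DictReader yields strings; convert Likes so the writer sees a number
        # (assumes every Likes value parses as a float)
        writer.writerow([row['Text'], float(row['Likes'])])

In the code above, we select the Text and Likes columns and write them with QUOTE_NONNUMERIC. The writer decides what to quote from each value's Python type, and DictReader returns every field as a string, so Likes is converted to float first; otherwise it would be quoted as well. This way, text values are quoted consistently and any commas inside them are not mistaken for separators, while numeric values are written as they are.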
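
The type-based behavior is easiest to see on a tiny in-memory example. The sketch below uses made-up values and an io.StringIO buffer in place of a real file; it also shows that a reader created with the same option turns unquoted fields back into floats.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(["Hello, world", 30.0])   # str is quoted, float is not
writer.writerow(["Same number", "30.0"])  # a numeric-looking str is still quoted
written = buffer.getvalue()
print(written)

# With the same option, the reader converts unquoted fields to float
parsed = list(csv.reader(io.StringIO(written), quoting=csv.QUOTE_NONNUMERIC))
print(parsed)
```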

Conclusion

As data professionals, we mostly use Python's csv module to load CSV files. However, it can do much more than that, including format detection, header detection, and writing data to other output formats.

I hope this has helped!

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
