Renaming columns in a Pandas DataFrame is a common operation when we want to clean, standardize, or transform data. In this article, we'll explore a few different methods for renaming columns, each with specific use cases. Whether we're renaming a few columns or applying custom transformations, these methods offer flexible solutions for our needs.
The dataset we will use is Dataset.csv.
Method 1: Renaming Columns Using a Dictionary
The rename() function is one of the most flexible methods for renaming columns. By passing a dictionary to its columns parameter, where the keys are the current column names and the values are the new names, we can rename specific columns and leave the rest unchanged.
- Use rename() with a dictionary to rename the columns.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# Map each current column name to its new name; unlisted columns are kept
df = df.rename(columns={'Age': 'Years', 'Gender': 'Sex'})
print(df)
This method is clear and direct: several columns can be renamed in a single call, and any column not listed in the dictionary keeps its name, which makes it convenient for DataFrames with many columns.
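One detail worth knowing about dictionary-based renaming is that a key which matches no column is ignored by default. The sketch below illustrates this with a small made-up DataFrame (the sample values and the misspelled key 'Agee' are assumptions for illustration, not part of the article's dataset) and shows how the errors='raise' option of rename() can surface such typos.

```python
import pandas as pd

# Hypothetical sample data standing in for the article's dataset
df = pd.DataFrame({
    'Name': ['Asha', 'Ben'],
    'Age': [31, 45],
    'Gender': ['F', 'M'],
    'City': ['Pune', 'Leeds'],
})

# Only the listed columns are renamed; 'Name' and 'City' are untouched
renamed = df.rename(columns={'Age': 'Years', 'Gender': 'Sex'})
print(list(renamed.columns))

# A key that matches no column is silently ignored by default
unchanged = df.rename(columns={'Agee': 'Years'})
print(list(unchanged.columns))

# errors='raise' turns the silent no-op into a KeyError
try:
    df.rename(columns={'Agee': 'Years'}, errors='raise')
    typo_caught = False
except KeyError:
    typo_caught = True
print(typo_caught)
```

Using errors='raise' is a reasonable default in data-cleaning scripts, where a renamed column that silently keeps its old name tends to cause confusing failures later on.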
The other methods for renaming columns in a DataFrame are described below.
Method 2: Renaming Columns by Assigning to the columns Attribute
With this method, we directly assign a new list of column names to the columns attribute. This approach is suitable when we want to replace all the column names at once.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# The list must contain exactly one name per existing column, in order
df.columns = ['Full Name', 'Age in Years', 'Gender Identity', 'City of Residence']
print(df)
By assigning a new list to df.columns, we replace all the column names in one operation. This method is simple and quick, but be cautious: the length of the list must match the number of columns in the DataFrame, and the names are applied by position.
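To make the length requirement concrete, the following sketch assigns a list that is one name short to a small made-up DataFrame (the sample values are assumptions for illustration). pandas is expected to reject the assignment with a ValueError and keep the existing names.

```python
import pandas as pd

# Hypothetical four-column DataFrame
df = pd.DataFrame({'Name': ['Asha'], 'Age': [31], 'Gender': ['F'], 'City': ['Pune']})

# Three names for four columns: one name short
new_names = ['Full Name', 'Age in Years', 'Gender Identity']

try:
    df.columns = new_names
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(mismatch_error)

# The failed assignment leaves the original column names in place
print(list(df.columns))
```

Because names are matched purely by position, a list of the right length but in the wrong order will not raise anything, so it is worth checking df.columns before assigning.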
Method 3: Renaming Columns Using set_axis() with the axis Parameter
The set_axis() method allows us to rename the columns by passing a new list of column names along with the axis=1 parameter. This method can be useful when we need to create a new DataFrame with the renamed columns.
- Use set_axis() to rename the columns.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# axis=1 applies the new labels to the columns rather than the row index
df = df.set_axis(['Name', 'Age', 'Gender', 'Location'], axis=1)
print(df)
set_axis() returns a new DataFrame with the renamed columns and leaves the original untouched, so it is a good fit when we don't want to modify the original DataFrame or when we are chaining several operations. Setting axis=1 targets the column names.
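A short sketch of this behaviour, using a made-up two-column DataFrame (the column names 'a' and 'b' and the values are assumptions for illustration): the result of set_axis() carries the new names, while the DataFrame it was called on keeps its own.

```python
import pandas as pd

# Hypothetical DataFrame with placeholder column names
original = pd.DataFrame({'a': ['Asha', 'Ben'], 'b': [31, 45]})

# set_axis returns a new object; 'original' is not modified
renamed = original.set_axis(['Name', 'Age'], axis=1)

print(list(original.columns))
print(list(renamed.columns))

# Because a new DataFrame is returned, the call can sit inside a method chain
adults = original.set_axis(['Name', 'Age'], axis=1).query('Age > 40')
print(adults)
```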
Method 4: Renaming Columns by Adding a Prefix or Suffix
If we want to add a prefix or suffix to all column names, the add_prefix() and add_suffix() methods are very handy. These methods are ideal when we want to modify all column names uniformly.
- Use add_prefix() or add_suffix() to modify column names.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# Prepend 'col_' to every column name; add_suffix() appends text instead
df = df.add_prefix('col_')
print(df)
This is useful when we need to distinguish columns in a merged DataFrame or add identifiers to the column names.
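As an illustration of the merged-DataFrame case, the sketch below combines two made-up tables that share the same column names (the names sales_2023 and sales_2024 and their values are hypothetical). Adding a suffix to each before combining keeps the resulting columns distinguishable.

```python
import pandas as pd

# Two hypothetical tables with identical column names
sales_2023 = pd.DataFrame({'units': [10, 20], 'revenue': [100.0, 250.0]})
sales_2024 = pd.DataFrame({'units': [12, 18], 'revenue': [130.0, 240.0]})

# Tag every column with its year before placing the tables side by side
combined = pd.concat(
    [sales_2023.add_suffix('_2023'), sales_2024.add_suffix('_2024')],
    axis=1,
)
print(list(combined.columns))

# add_prefix works the same way but prepends the text
prefixed = sales_2023.add_prefix('col_')
print(list(prefixed.columns))
```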
Method 5: Renaming Columns Using List Comprehension
A list comprehension is a flexible way to modify column names based on specific conditions. This is useful when we want to apply transformations such as converting all column names to uppercase, applying string operations, or removing unwanted characters.
- Use list comprehension to modify the columns list.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# Build a new list of names by transforming each existing name
df.columns = [col.upper() for col in df.columns]
print(df)
In this case, all column names are converted to uppercase. This method is highly customizable and allows us to apply conditions like removing spaces, changing the case, or applying regular expressions.
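A slightly richer sketch of the same idea combines a condition with a regular expression. The helper name clean and the messy column labels are assumptions made up for illustration; the comprehension cleans string labels and leaves any non-string label (here the integer 2024) as it is.

```python
import re
import pandas as pd

# Hypothetical messy column labels, including one non-string label
df = pd.DataFrame(columns=[' Full Name ', 'Age (Years)', 2024])

def clean(name):
    """Lowercase a label and collapse runs of non-alphanumerics into '_'."""
    name = name.strip().lower()
    name = re.sub(r'[^0-9a-z]+', '_', name)
    return name.strip('_')

# The condition applies clean() only to string labels
df.columns = [clean(col) if isinstance(col, str) else col for col in df.columns]
print(list(df.columns))
```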
Method 6: Renaming Columns by Replacing Specific Characters
If we need to replace specific characters or patterns in column names, we can use str.replace() on df.columns. This method is well suited to cleaning up column names, such as removing spaces or replacing special characters.
- Use str.replace() to rename columns.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# Replace every space in every column name with an underscore
df.columns = df.columns.str.replace(' ', '_')
print(df)
In this example, spaces are replaced with underscores. It's particularly useful for cleaning up messy or inconsistent column names.
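When the replacement involves patterns rather than a single literal character, it can help to pass the regex argument explicitly, since the default for this argument has differed between pandas versions. The sketch below uses made-up column labels (assumptions for illustration) to contrast a literal replacement with a pattern-based one.

```python
import pandas as pd

# Hypothetical messy column labels
df = pd.DataFrame(columns=['Full Name', 'Price ($)', 'Unit.Cost'])

# Literal replacement: only the '.' character itself is replaced
literal = df.columns.str.replace('.', '_', regex=False)
print(list(literal))

# Pattern replacement: each run of non-word characters becomes one '_',
# then any leading or trailing '_' is stripped
pattern = df.columns.str.replace(r'\W+', '_', regex=True).str.strip('_')
print(list(pattern))
```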
Method 7: Renaming Columns by Mapping Functions
We can map a function to the column names to rename them according to a custom rule. This method is highly flexible and can be used to apply transformations such as converting names to lowercase, capitalizing the first letter, or applying any custom function to column names.
- Use a mapping function like str.lower() or any custom function.
Python
import pandas as pd
df = pd.read_csv('data.csv')
# Apply the function to every column name and assign the result back
df.columns = df.columns.map(lambda x: x.lower())
print(df)
Using map() with a lambda function allows us to apply a custom transformation to column names. In this example, all column names are converted to lowercase.
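The same pattern works with a named function, which can be easier to test and reuse than a lambda. In the sketch below, to_snake and the sample labels are hypothetical names chosen for illustration. It also shows that rename() accepts a function for its columns parameter, which gives the same names without assigning to df.columns.

```python
import pandas as pd

# Hypothetical column labels
df = pd.DataFrame(columns=['Full Name', 'AGE', 'City'])

def to_snake(name):
    """Trim, lowercase, and replace spaces with underscores."""
    return name.strip().lower().replace(' ', '_')

# Option 1: map the function over the column Index
mapped = df.columns.map(to_snake)
print(list(mapped))

# Option 2: pass the function to rename(), which returns a new DataFrame
renamed = df.rename(columns=to_snake)
print(list(renamed.columns))
```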