Pandas is a popular data manipulation library in Python that provides powerful tools for data analysis and transformation. Two commonly used functions in Pandas are melt and pivot, which allow users to reshape their data. In this blog post, we will explore the differences between these two functions and provide simple examples to illustrate their usage.
Melt:
The melt function in Pandas transforms a dataset from a wide format to a long format, an operation also known as unpivoting. It gathers a set of columns and “melts” them into two columns: one holding the original column names and one holding their values. The identifier columns are kept as they are, and the new DataFrame has one row for each combination of an original row and a melted column.
Example:
Let’s consider a dataset with information about students and their scores in different subjects:
import pandas as pd
Output:
Name Maths Physics Chemistry
Now, let’s use the melt function to reshape the data:
melted_df = df.melt(id_vars='Name', var_name='Subject', value_name='Score')
Output:
Name Subject Score
As shown in the example, the melt function transformed the wide-format DataFrame into a long-format DataFrame, where each row represents a unique combination of the identifier (Name) and the melted column (Subject), with the corresponding values in the Score column.
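For readers who want to run the reshape end to end, here is a minimal self-contained sketch. The student names and scores are hypothetical values chosen for illustration, not the data behind the output above; only the column names and the melt call follow the example.

```python
import pandas as pd

# Hypothetical sample data; the names and scores are illustrative only.
df = pd.DataFrame({
    "Name": ["Alice", "Bob"],
    "Maths": [90, 75],
    "Physics": [85, 80],
    "Chemistry": [70, 95],
})

# Name stays as the identifier; every other column is melted into
# a Subject column (old column names) and a Score column (old values).
melted_df = df.melt(id_vars="Name", var_name="Subject", value_name="Score")

# 2 students x 3 subjects gives 6 rows, grouped subject by subject.
print(melted_df)
```

Because no value_vars argument is passed, melt treats every column other than Name as a column to unpivot.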
Pivot:
The pivot function in Pandas performs the reverse of melt. It transforms a long-format DataFrame into a wide format by spreading the unique values of one column into separate columns.
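One caveat is worth knowing before relying on pivot as an inverse: it needs every index/columns pair to be unique, and it raises a ValueError when the same pair appears twice. The sketch below uses hypothetical data (a student with two Maths scores) to show the failure and one possible workaround, pivot_table, which aggregates the duplicates instead.

```python
import pandas as pd

# Hypothetical long-format data in which Alice has two Maths scores.
long_df = pd.DataFrame({
    "Name": ["Alice", "Alice", "Bob"],
    "Subject": ["Maths", "Maths", "Maths"],
    "Score": [90, 70, 75],
})

try:
    long_df.pivot(index="Name", columns="Subject", values="Score")
    pivot_failed = False
except ValueError:
    # pivot cannot decide which of Alice's two scores belongs in the cell.
    pivot_failed = True

# pivot_table resolves the duplicates by aggregating them (here, the mean).
averaged = long_df.pivot_table(
    index="Name", columns="Subject", values="Score", aggfunc="mean"
)
print(pivot_failed)
print(averaged)
```

Whether the mean is the right aggregation depends on the data; it is used here only to show that an aggregation has to be chosen.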
Example:
Let’s use the melted DataFrame from the previous example and apply the pivot function:
pivoted_df = melted_df.pivot(index='Name', columns='Subject', values='Score')
Output:
Subject Chemistry Maths Physics
In this example, the pivot function reshaped the long-format DataFrame back into a wide-format DataFrame. The unique values in the Subject column became the columns of the pivoted DataFrame, sorted alphabetically, and the corresponding values in the Score column were spread across those columns. Name is now the index, so each row represents one student, and the label Subject remains as the name of the column axis.
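The pivoted result is not laid out exactly like a typical wide table, because Name sits in the index rather than in a column. The following sketch, built on hypothetical long-format data rather than the post's own values, shows the pivot and one way to flatten the result back into ordinary columns.

```python
import pandas as pd

# Hypothetical long-format data; the names and scores are illustrative only.
melted_df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Alice", "Bob", "Alice", "Bob"],
    "Subject": ["Maths", "Maths", "Physics", "Physics", "Chemistry", "Chemistry"],
    "Score": [90, 75, 85, 80, 70, 95],
})

# Each unique Subject becomes a column; Name becomes the index.
pivoted_df = melted_df.pivot(index="Name", columns="Subject", values="Score")

# Move Name back into a regular column and drop the "Subject" label
# that pivot leaves on the column axis.
flat_df = pivoted_df.reset_index()
flat_df.columns.name = None

print(pivoted_df)
print(flat_df)
```

The subject columns come back in alphabetical order, so reselect them explicitly if the original column order matters.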