How to Use the Python statistics.geometric_mean() Function

Introduction

The geometric mean is a measure of central tendency that is particularly useful for sets of positive numbers that are multiplicatively related. Unlike the arithmetic mean, which sums values and divides by the count, the geometric mean multiplies the values together and takes the nth root, where n is the number of values. For two numbers, this is the side length of a square that has the same area as a rectangle whose sides are those two numbers. The geometric mean is especially useful when dealing with rates of growth or ratios.

Python provides a built-in statistics module that contains a variety of functions for statistical calculations, including the geometric mean. The statistics.geometric_mean() function allows users to easily compute the geometric mean of a dataset, making it a valuable tool for data analysis. Let’s take a closer look.

Using the statistics.geometric_mean() Function

The function is available in Python 3.8 and later. To use it, first import the statistics module into your Python script:

import statistics

The statistics.geometric_mean() function accepts an iterable, such as a list or tuple, containing positive numerical values. It raises a StatisticsError if it encounters a negative value, as the geometric mean is not defined for negative numbers. The handling of zeros has differed between Python versions, so it is safest to pass strictly positive values.
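As a small illustration of the input requirement, the sketch below passes the same values as a list, a tuple, and a generator expression. The sample values are made up for this example, and the results are compared with a tolerance because they are floats.

```python
import math
import statistics

# Hypothetical sample values chosen so the result is easy to check:
# the geometric mean of 2 and 8 is sqrt(2 * 8) = 4.
as_list = statistics.geometric_mean([2, 8])
as_tuple = statistics.geometric_mean((2, 8))
as_generator = statistics.geometric_mean(x for x in (2, 8))

# Compare with a tolerance rather than with ==, since these are floats.
print(math.isclose(as_list, 4.0))
print(math.isclose(as_tuple, as_list))
print(math.isclose(as_generator, as_list))
```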

The syntax for using the geometric_mean() function is straightforward:

statistics.geometric_mean(data)

Here’s a simple example of calculating the geometric mean with a dataset:

data = [1, 3, 9, 27]
result = statistics.geometric_mean(data)
print(result)

# Output: 5.196152422706632

To verify, you could perform the calculation manually:

import math

manual_result = math.pow(1 * 3 * 9 * 27, 1/4)
print(manual_result)

# Output: 5.196152422706632
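An equivalent way to write the same calculation is the exponential of the average of the logarithms, which avoids building a very large product for long datasets. The sketch below only illustrates that identity; it is not a description of how the statistics module works internally.

```python
import math
import statistics

data = [1, 3, 9, 27]

# Geometric mean as the exponential of the mean of the logarithms.
log_based = math.exp(sum(math.log(x) for x in data) / len(data))

# Direct definition: nth root of the product (math.prod needs Python 3.8+).
root_based = math.prod(data) ** (1 / len(data))

# Both forms should agree with each other and with the library function,
# up to floating-point rounding.
print(math.isclose(log_based, root_based))
print(math.isclose(log_based, statistics.geometric_mean(data)))
```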

Let’s have a look at a growth rate example, as the geometric mean is often used to calculate average growth rates. If revenue grows by 2%, 3%, 4%, and 5% over four years, the geometric mean can be calculated as follows:

# 2%, 3%, 4%, 5% growth
growth_rates = [1.02, 1.03, 1.04, 1.05]
geometric_growth = statistics.geometric_mean(growth_rates)
print(geometric_growth)

# Output: 1.0349396095096088

The result, about 1.0349, is the average growth factor per year: revenue grew at an average compound rate of roughly 3.49% annually over the period.
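One way to see why the geometric mean is the right average here is to compound it over the four years and compare with the actual total growth. The following sketch does that and, for contrast, shows that the arithmetic mean of the same factors slightly overstates growth.

```python
import math
import statistics

growth_rates = [1.02, 1.03, 1.04, 1.05]

geometric_growth = statistics.geometric_mean(growth_rates)
arithmetic_growth = statistics.mean(growth_rates)

# Total growth factor over the four years.
total_growth = math.prod(growth_rates)

# Compounding the geometric mean for four years matches the total growth.
print(math.isclose(geometric_growth ** 4, total_growth))

# The arithmetic mean is slightly higher, so compounding it overstates growth.
print(arithmetic_growth > geometric_growth)
print(arithmetic_growth ** 4 > total_growth)
```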

If you attempt to calculate the geometric mean of an empty dataset, you will encounter a StatisticsError. Here is an example:

empty_data = []
result = statistics.geometric_mean(empty_data)

# Raises StatisticsError (the exact message text may differ between Python versions), for example:
# StatisticsError: geometric mean requires a non-empty dataset containing positive numbers

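Negative values also raise a StatisticsError. If your dataset contains negative numbers, either filter them out or otherwise ensure you are working exclusively with positive values before passing the data to the function. Keep in mind that filtering changes the dataset, so check that the remaining values still represent what you want to measure.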
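A minimal sketch of both approaches follows: catching the error, and filtering before the call. The helper name positive_geometric_mean is hypothetical and not part of the statistics module, and returning None for an empty result is just one possible design choice.

```python
import statistics


def positive_geometric_mean(values):
    """Hypothetical helper: geometric mean of the strictly positive values only.

    Returns None when no positive values remain after filtering.
    """
    positives = [v for v in values if v > 0]
    if not positives:
        return None
    return statistics.geometric_mean(positives)


# A negative value raises StatisticsError when passed directly.
try:
    statistics.geometric_mean([4, -1, 9])
except statistics.StatisticsError as exc:
    print("Rejected:", exc)

# After filtering, only 4 and 9 remain, so the result is close to 6.
print(positive_geometric_mean([4, -1, 9]))

# Nothing left to average, so the helper returns None instead of raising.
print(positive_geometric_mean([]))
```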

Wrapping Up

In summary, the statistics.geometric_mean() function is a straightforward and effective tool for calculating the geometric mean in Python. For further reading, consider exploring the official Python documentation on the statistics module.
