Mastering NumPy’s Universal Functions for Fast Array Computation
Master element-wise operations, comparisons, logic, aggregation, and broadcasting using NumPy ufuncs for high-performance array processing.

Performance is everything, and not just in coding or data science. If you work with larger datasets, this small optimization can save you hours.
In NumPy, universal functions are your go-to tool when you are chasing speed in numerical computing.
That's why, in this article, we will cover how ufuncs work and how they turn real data into insights efficiently. As we always do, we are going to use a real-life dataset from our platform, so let's explore it first.
Predicting Price: Real Dataset for Applying NumPy ufuncs
Here is the link to this data project: https://platform.stratascratch.com/data-projects/predicting-price
Haensel AMS gave this data project as a take-home assignment in the recruitment process for a data science position. Let's first read the dataset.
import pandas as pd
df = pd.read_csv("sample.csv")
df.head()
Here is the output.
Now let’s see the columns. Here is the code.
df.info()
Here is the output.
Understanding NumPy’s Universal Functions (ufuncs)
Let’s go through it one step at a time.
NumPy provides universal functions, or ufuncs for short.
These functions operate element-wise on entire arrays.
You don't write a loop; you call the function once, and NumPy applies it to every element.
Under the hood, ufuncs run compiled C code, so execution carries very little Python overhead.
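To make the loop-versus-ufunc contrast concrete, here is a minimal sketch with made-up values (the arrays `a` and `b` are hypothetical, not from the dataset): both approaches compute the same result, but the ufunc does it in one call.

```python
import numpy as np

# Hypothetical example: the same addition, written two ways.
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Python loop: one element at a time, with interpreter overhead per step.
looped = [x + y for x, y in zip(a, b)]

# ufunc: one call, the loop runs in compiled C code.
vectorized = np.add(a, b)

print(looped)      # [11.0, 22.0, 33.0]
print(vectorized)  # [11. 22. 33.]
```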
Let’s start with a simple operation.
Step 1: Element-wise Addition
You have two columns — para2 and para3. You want their sum row by row. Here is the code:
import numpy as np
sum_array = np.add(df["para2"], df["para3"])
print(sum_array[:5])
Here is the output.
Each value in sum_array is the sum of para2 and para3 for that row. This works across all rows in a single operation.
Step 2: Element-wise Multiplication
You now have a sum array.
Suppose you want to scale that array — perhaps apply a rate or multiplier.
This is where np.multiply comes in.
It multiplies every element in an array by a number (or by another array). Here is the code.
scaled_sum = np.multiply(sum_array, 1.1)
print(scaled_sum[:5])
Here is the output.
Now, each value is increased by 10 percent.
You didn't have to loop through anything; NumPy took care of it. The same pattern holds for all arithmetic ufuncs: you feed them arrays or scalars, and they process everything element-wise.
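For illustration, here is a hedged sketch of a few other arithmetic ufuncs following the same pattern (the `values` array is a hypothetical stand-in for column data):

```python
import numpy as np

# Hypothetical values standing in for a numeric column.
values = np.array([100.0, 200.0, 300.0])

diff = np.subtract(values, 50)   # element-wise subtraction
ratio = np.divide(values, 100)   # element-wise division
power = np.power(values, 2)      # element-wise exponentiation

print(diff)   # [ 50. 150. 250.]
print(ratio)  # [1. 2. 3.]
print(power)  # [10000. 40000. 90000.]
```

Each call returns a new array of the same shape, with the operation applied to every element.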
Step 3: Element-wise Comparison
Let’s say you want to find which rows have para3 above a certain value.
Instead of checking each row manually, use a comparison function.
np.greater checks if each element in an array is greater than another value or array.
It returns a boolean array — True if the condition is met, False otherwise.
mask = np.greater(df["para3"], 2500)
print(mask[:5])
Here is the output.
You can use this mask to filter the original DataFrame:
filtered = df[mask]
print(filtered.head())
Here is the output.
Now, you’ve created a condition and applied it directly to the dataset. No need for if statements or iterations. For a deeper look, see how array slicing in Python makes this possible.
Step 4: Logical Operations
You’ve filtered with one condition.
Now let’s say you want rows where para3 > 2500 and para1 < 500.
You combine boolean arrays using logical functions.
np.logical_and handles this directly. Let’s see.
condition = np.logical_and(df["para3"] > 2500, df["para1"] < 500)
filtered_rows = df[condition]
print(filtered_rows.head())
Here is the output.
This gives you only the rows where both conditions are true.
No nested loops, no .apply(), just one call.
Logical operations also include np.logical_or, np.logical_not, and np.logical_xor.
Each one of them processes arrays the same way — element by element.
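As a quick sketch of those three functions in action (the `para1` and `para3` arrays below are hypothetical stand-ins for the dataset's columns):

```python
import numpy as np

# Hypothetical stand-ins for para1 and para3 values.
para1 = np.array([100, 600, 300, 800])
para3 = np.array([3000, 2600, 2000, 1000])

either = np.logical_or(para3 > 2500, para1 < 500)       # True if at least one holds
neither = np.logical_not(either)                        # flips every boolean
exactly_one = np.logical_xor(para3 > 2500, para1 < 500) # True if exactly one holds

print(either)       # [ True  True  True False]
print(exactly_one)  # [False  True  True False]
```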
Step 5: Aggregation ufuncs like np.sum, np.mean
You’ve worked with individual elements.
Now, let’s look at functions that reduce arrays into single values.
np.sum, np.mean, and np.max are a few examples.
They compute total, average, or maximum values across an axis.
Let’s see the code.
total_para2 = np.sum(df["para2"])
mean_para3 = np.mean(df["para3"])
max_para1 = np.max(df["para1"])
print(total_para2, mean_para3, max_para1)
Here is the output.
Each function returns a scalar.
No loops, no column-wise iteration — just one line for each calculation.
You can also aggregate across rows using axis=1.
row_sums = np.sum(df[["para1", "para2"]], axis=1)
print(row_sums[:5])
Here is the output.
Aggregation helps condense the dataset into insights. You now have totals, averages, or extremes — ready for use.
Step 6: Broadcasting
There are times when arrays do not have the same shape. You just want to perform operations across them.
Broadcasting allows you to do so without reshaping. NumPy automatically stretches the smaller array (or scalar) to match the larger array's shape. Here is the code:
centered_para2 = df["para2"] - np.mean(df["para2"])
print(centered_para2[:5])
Here is the output.
Now, each row contains the difference between its value and the average of that column. The mean was a scalar, but NumPy expanded it across the full column.
Broadcasting works across arrays, scalars, and multi-dimensional shapes. No extra code is required; NumPy applies the alignment rules for you.
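The same idea extends beyond scalars. Here is a small sketch, using hypothetical 2D data rather than the project's DataFrame, in which a `(2,)`-shaped vector of column means is stretched across every row of a `(3, 2)` array:

```python
import numpy as np

# Hypothetical 2D array: three rows, two feature columns.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Column means have shape (2,); NumPy stretches them across all rows.
col_means = np.mean(data, axis=0)   # [ 2. 20.]
centered = data - col_means         # result has shape (3, 2), no reshaping needed

print(centered)
# [[-1. -10.]
#  [ 0.   0.]
#  [ 1.  10.]]
```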
You can explore more on how NumPy for Data Science interviews comes into play here.
Performance Gains with NumPy ufuncs
Every operation above, from adding two columns to filtering rows on conditions, could also have been written with explicit loops.
But loops are slow, particularly on large datasets.
Universal functions avoid that bottleneck.
They run directly on memory-level structures without the overhead of Python-level looping.
When you added para2 and para3, NumPy performed the full operation in a single compiled pass.
A loop would have visited each row, computed a new value, and stored it, again and again.
That is why you got results immediately when you checked whether the values in para3 were greater than 2500.
In a loop, the same check would run row by row, which is slower and uses more resources.
Even simple scaling with multiplication, or subtracting the mean, shows a clear difference.
Ufuncs handle those operations without bloating your codebase or slowing your workflow.
As your data grows, this performance edge becomes more significant.
On data of real size, thousands or millions of rows, they are much faster and more memory-efficient than a Python for loop, which frees you up to iterate faster.
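You can verify the gap yourself. The following is a rough benchmark sketch on synthetic data (array sizes and timing approach are my own choices, not from the article); exact numbers will vary by machine, but the ufunc version is typically orders of magnitude faster.

```python
import time
import numpy as np

# Hypothetical benchmark: add two 1-million-element arrays both ways.
a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

start = time.perf_counter()
looped = [a[i] + b[i] for i in range(len(a))]   # Python-level loop
loop_time = time.perf_counter() - start

start = time.perf_counter()
vectorized = np.add(a, b)                       # single compiled pass
ufunc_time = time.perf_counter() - start

print(f"loop:  {loop_time:.4f}s")
print(f"ufunc: {ufunc_time:.4f}s")
```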
Final Thoughts
In this article, we explored new ways of performing operations on arrays. These features will save you significant time and resources, especially with the large datasets you likely encounter on the job.
Nate Rosidi is a data scientist in product strategy. He's also an adjunct professor teaching analytics, and is the founder of StrataScratch, a platform helping data scientists prepare for their interviews with real interview questions from top companies. Nate writes on the latest trends in the career market, gives interview advice, shares data science projects, and covers everything SQL.