itamarst Profile

Contributions

Article

The wrong way to speed up your code with Numba

Numba can make your numeric code faster, but only if you use it right.

Article

Not just NVIDIA: GPU programming that runs everywhere

If you’re doing computations on a GPU, NVIDIA is the default, alongside its CUDA libraries. But NVIDIA-specific software won't run on Macs, in CI, or on other GPUs. What can you do if you want to use GPUs in a portable manner? In this article we’ll cover one option, the wgpu-py library.

Article

Profiling your Numba code

Learn how to use the Profila profiler to find performance bottlenecks in your Numba code.

Article

Beware of misleading GPU vs CPU benchmarks

Do you use NumPy, Pandas, or scikit-learn and want to get faster results? NVIDIA has created GPU-based replacements for each of these with the shared promise of extra speed. Unfortunately, while those speed-ups are impressive, they are also misleading. GPU-based libraries might be the answer to your performance problems… or they might be an unnecessary and expensive distraction.

Article

NumPy 2 is coming: preventing breakage, updating your code

NumPy 2 is coming, and it’s got some backwards incompatible changes. Learn how to keep your code from breaking, and how to upgrade.

Article

How many CPU cores can you actually use in parallel?

Figuring out how much parallelism your program can use is surprisingly tricky.

Article

Using Polars in a Pandas world

Pandas has far more third-party integrations than Polars. Learn how to use those libraries with Polars dataframes.

Article

Two kinds of thread pools, and why you need both

When you’re doing large scale data processing with Python, threads are a good way to achieve parallelism. This is especially true if you’re doing numeric processing, where the global interpreter lock (GIL) is typically not an issue. And if you’re using threading, thread pools are a good way to make sure you don’t use too many resources.

But how many threads should your thread pool have? And do you need just one thread pool, or more than one?

Tutorial

Speeding up Cython with SIMD

SIMD is a CPU feature that lets you speed up numeric processing; learn how to use it with Cython.

Article

The easiest way to speed up Python with Rust

Rust can make your Python code much faster; here’s how to start using it as quickly as possible.

Article

When NumPy is too slow

What do you do when your NumPy code isn’t fast enough? We’ll discuss the options, from Numba to JAX to manual optimizations.

Article

Understanding CPUs can help speed up Numba and NumPy code

With a little understanding of how CPUs and compilers work, you can speed up NumPy with faster Numba code.

Article

Choosing a good file format for Pandas

CSV, JSON, Parquet—which file format should you use for data being processed by Pandas?

Article

“Externally managed environments”: when PEP 668 breaks pip

You’re on a new version of Linux, you try a pip install, and it errors out, talking about “externally managed environments” and “PEP 668”. What’s going on? How do you solve this?

Article

Goodbye to Flake8 and PyLint: faster linting with Ruff

Ruff is a new, much faster linter for Python, to help you catch bugs without waiting forever for CI.

Article

Polars for initial data analysis, Polars for production

Initial and exploratory data analysis have different requirements than production data processing; Polars supports both.

Article

Python’s multiprocessing performance problem

While multiprocessing allows Python to scale to multiple CPUs, it has some performance overhead compared to threading.

Article

Don’t bother trying to estimate Pandas memory usage

Estimating Pandas memory usage from the data file size is surprisingly difficult. Learn why, and some alternative approaches that don’t require estimation.

Article

float64 to float32: Saving memory without losing precision

Switching from float64 (double-precision) to float32 (single-precision) can cut memory usage in half. But how do you deal with data that doesn’t fit?

Article

Some reasons to avoid Cython

If you need to speed up Python, Cython is a very useful tool. It lets you seamlessly merge Python syntax with calls into C or C++ code, making it easy to write high-performance extensions with rich Python interfaces.

That being said, Cython is not the best tool in all circumstances. So in this article I’ll go over some of the limitations and problems with Cython, and suggest some alternatives.

Article

Why Polars uses less memory than Pandas

While Polars is mostly known for running faster than Pandas, if you use it right it can sometimes also significantly reduce memory usage compared to Pandas. In particular, certain techniques that you need to do manually in Pandas can be done automatically in Polars, allowing you to process large datasets without using as much memory—and with less work on your side!

Article

It's time to stop using Python 3.7

Python 3.7 end of life is in 6 months; after that there will be no more security updates. So the time to upgrade is now.

Article

Who controls parallelism? A disagreement that leads to slower code

The libraries you’re using might be running more threads than you realize—and that can mean slower execution.

Article

When should you upgrade to Python 3.11?

Python 3.11 is out now–but should you switch to it immediately? And if you shouldn’t upgrade just yet, when should you?

Article

Find slow data processing tasks (before your customers do)

Your data processing jobs are fast… most of the time. Next, find the slow runs so you can speed them up.

Article

Invasive procedures: Python affordances for performance measurement

Learn a variety of—sometimes horrible—ways to instrument and measure performance in Python.

Article

The limits of Python vectorization as a performance technique

Vectorization is a great way to speed up your Python code, but you’re limited to specific operations on bulk data. Learn how to get past these limitations.

Article

Finding performance bottlenecks in Celery tasks

Learn how to speed up your Celery tasks by identifying slow tasks, and then finding the performance bottleneck using a profiler.

Article

Pandas vectorization: faster code, slower code, bloated memory

Vectorization in Pandas can make your code faster—except when it will make your code slower.

Article

Making pip installs a little less slow

Installing packages with pip, Poetry, and Pipenv can be slow. Learn how to ensure it’s not even slower, and a potential speed-up.
