Contributions

Article
Your data processing jobs are fast… most of the time. Next, find the slow runs so you can speed them up.
Article
Learn a variety of—sometimes horrible—ways to instrument and measure performance in Python.
Article
Vectorization is a great way to speed up your Python code, but you’re limited to specific operations on bulk data. Learn how to get past these limitations.
Article
Learn how to speed up your Celery tasks by identifying slow tasks, and then finding the performance bottleneck using a profiler.
Article
Vectorization in Pandas can make your code faster—except when it will make your code slower.
Article
Installing packages with pip, Poetry, and Pipenv can be slow. Learn how to keep it from getting even slower, and about a potential speed-up.
Article
msgspec is a schema-based JSON encoder/decoder, which allows you to process large files with lower memory and CPU usage.
Article
Python’s Global Interpreter Lock (GIL) stops threads from running in parallel or concurrently. Learn how to determine the impact of the GIL on your code.
Tutorial
Learn how to read CSVs in Pandas much faster.
Article
Vectorization allows you to speed up processing of homogeneous data in Python. Learn what it means, when it applies, and how to do it.
Article
Python 3.6 will stop getting security updates in December 2021. Given the existence of 3.7, 3.8, 3.9, and 3.10, you really should upgrade.
Article
Conda installs are very slow, but you can speed them up with a much faster Conda reimplementation called Mamba.
Article
You can write Python extensions with Cython, Rust, and many other tools. Learn which one you should use, depending on your particular needs.
Article
Python has two packaging systems, pip and Conda. Learn the differences between them so you can pick the right one for you.
Article
Python 3.10 is out now, but you won't be able to switch for a while: there are missing packages, missing toolchain updates, and more.
Article
Learn how to scan your Conda package dependencies for security vulnerabilities.
Tutorial
NumPy provides memory views transparently, as a way to save memory. But you need to understand how they work, because if you’re not careful you can also leak memory, or even modify data in ways you didn’t expect.
Article
When you’re loading many strings into Pandas, you’re going to use a lot of memory. If you have only a limited number of strings, you can save memory with categoricals, but that’s only helpful in a limited number of situations.

With Pandas 1.3, there’s a new option that can save memory on a large number of strings as well, simply by changing to a new column type.
Article
Learn how to accurately measure memory usage of your Pandas DataFrame or Series.
Article
Measuring your Python program’s memory usage is not as straightforward as you might think. Learn two techniques, and the tradeoffs between them.
Article
Learn the variety of techniques you can use to make your Python application’s Docker image a whole lot smaller.
Article
A compiled language like Rust or C is a lot faster than Python, but it won’t always make your Python code faster. Learn about the hidden overhead you’ll need to overcome.
Tutorial
Pandas can easily load data using a SQL query, but the resulting dataframe may use too much memory. Learn how to process data in batches, and then how to reduce memory usage even further.
Article
Using old versions of pip can result in installing old packages, or needing to recompile packages from scratch. So make sure you upgrade pip before using it.
Library
A Python memory profiler for data processing and scientific computing applications
Article
Every time you change your pip requirements and rebuild your Docker image, you’re going to have to download all your packages. Learn how to prevent this with Docker BuildKit’s new caching feature.
Article
There are many ways out-of-memory problems can manifest in Python. Learn how to identify them, as a first step to fixing the problem.
Tutorial
You want your application packaging to be reproducible, but you also want to be able to change dependencies easily without conflicts. Conda doesn’t make this easy, so learn how to do it with a third-party tool: conda-lock.
Article
To make your Python code faster, you should often start with optimizing single-threaded versions, then consider multiprocessing, and only then think about a cluster.
Article
Python 3.9 is out now, but when should you switch? Learn the problems you'll encounter, and when it's time to try it out.

Showing the last 30 only...