Speed Up Python Code with @lru_cache

Leandro Hirt

Updated on: May 19, 2026

Published on: May 19, 2026

Reading time: 7 minutes

Have you ever noticed that your program spends an eternity recalculating the same results over and over? Python ships with a tool that stores function outputs and returns them almost instantly on repeat calls, so work that took seconds the first time can take microseconds on later calls. If you want to learn how to speed up your Python code with @lru_cache in 2 minutes, this guide is a good starting point. Slowness often comes not from hardware but from redundant computation, and that is exactly what memoization eliminates.

The @lru_cache decorator is part of Python's built-in functools module. It implements a technique called memoization: the first time a function is called with a given set of arguments, Python stores the result. Every subsequent call with identical arguments returns the saved value without running the function body again. The official Python documentation notes that it can save time when an expensive or I/O-bound function is periodically called with the same arguments; recursive algorithms with overlapping subproblems are a classic example.

What Exactly Is @lru_cache?

The name stands for Least Recently Used Cache. Imagine a small shelf where you keep books. When the shelf is full and you need to add a new one, you remove the book you have not touched in the longest time. That is precisely how this decorator manages memory: it keeps the most recently used results and, when the cache reaches its size limit, discards the entry that has gone unused the longest, which is not necessarily the one that was stored first.
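The eviction rule is easier to see with a deliberately tiny cache. The following is a minimal sketch, assuming a hypothetical square function and maxsize=2 (both chosen only for illustration):

```python
from functools import lru_cache

calls = []  # records every time the function body actually runs

@lru_cache(maxsize=2)
def square(n):
    calls.append(n)
    return n * n

square(1)  # miss: cache holds 1
square(2)  # miss: cache holds 1 and 2
square(1)  # hit: 1 becomes the most recently used entry
square(3)  # miss: cache is full, so 2 (least recently used) is evicted
square(1)  # hit: 1 survived the eviction
square(2)  # miss: 2 was evicted and has to be recomputed

print(calls)                # [1, 2, 3, 2]
print(square.cache_info())  # CacheInfo(hits=2, misses=4, maxsize=2, currsize=2)
```

Note that 2 was evicted even though it was stored after 1, because 1 had been used more recently.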

When applied to a pure function (one that always returns the same output for the same inputs), the speed gain can be dramatic. A recursive Fibonacci calculation that takes several seconds without caching becomes instantaneous. This is one of the most useful Python decorators available for performance optimization.

Why Code Gets Slow Without a Cache

Many developers blame Python's nature when their scripts run slowly, but the real cause is usually redundant work. Take recursive Fibonacci as a classic example. Without a cache, computing fib(35) requires calculating fib(34) and fib(33). But fib(34) itself also calculates fib(33), and so on. The call tree grows exponentially: fib(35) alone triggers roughly 30 million function calls, even though there are only 36 distinct inputs (0 through 35). This is the problem that Python optimization techniques like memoization solve directly.
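To make the redundancy concrete, here is a small sketch that counts calls. The counter dictionary and the name fib_counted are illustrative additions, and n=20 is used so it runs quickly:

```python
counter = {"calls": 0}  # mutable counter shared with the function

def fib_counted(n):
    counter["calls"] += 1
    if n < 2:
        return n
    return fib_counted(n - 1) + fib_counted(n - 2)

result = fib_counted(20)
print(result)            # 6765
print(counter["calls"])  # 21891 calls for only 21 distinct inputs (0 through 20)
```

Each step up in n multiplies the number of calls by roughly 1.6, which is why the uncached version becomes impractical so quickly.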

Setting Up: No Installation Required

The best part about lru_cache is that you do not need to install anything. It is part of Python’s standard library. If you already have Python installed, you are ready. Just import it at the top of your script. If you want a comfortable editing setup, the guide on installing and configuring VS Code will get your environment ready in minutes.

from functools import lru_cache

Step by Step: Applying @lru_cache

Step 1: The Problem Without Optimization

Here is a standard recursive Fibonacci function. Without caching, this visibly slows down for values above 30 because Python recalculates every intermediate value from scratch on every call. This pattern is also common in any Python recursion problem where the same subproblems repeat:

def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# This will take noticeably long for n > 30
print(fibonacci(35))

Step 2: Adding @lru_cache

Adding the decorator is a one-line change. The maxsize parameter sets how many unique results Python should remember. Once that limit is reached, the least recently used entry is discarded to make room for new ones:

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(35))  # Instantaneous!

If you use maxsize=None, the cache grows without bounds. Use that setting only when you are certain the number of unique inputs is small and predictable, because an unbounded cache can eventually cause a MemoryError in Python.

Measuring the Performance Gain

To confirm the improvement, you can use the Python timeit module or time.perf_counter() from the built-in time module. Exact numbers depend on your machine, but the pattern is consistent: the uncached fib(35) typically takes on the order of seconds, while the cached version finishes in a small fraction of a millisecond. For Fibonacci, the number of function calls drops from exponential in n to linear in n, which is the difference between a script that is impractical and one that scales.
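One possible way to take that measurement with timeit is sketched below. The names fib_plain, fib_cached, and time_cached_cold are illustrative, and n=25 keeps the uncached runs short:

```python
import timeit
from functools import lru_cache

def fib_plain(n):
    if n < 2:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)

@lru_cache(maxsize=128)
def fib_cached(n):
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

def time_cached_cold():
    # Clear first so every run measures a full computation, not one lookup
    fib_cached.cache_clear()
    return fib_cached(25)

plain_seconds = timeit.timeit(lambda: fib_plain(25), number=5)
cached_seconds = timeit.timeit(time_cached_cold, number=5)

print(f"plain:  {plain_seconds:.4f} s for 5 runs")
print(f"cached: {cached_seconds:.4f} s for 5 runs")
```

Clearing the cache inside the timed function is a deliberate choice: without it, every run after the first would measure only a single cache lookup, which overstates the gain.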

Monitoring and Clearing the Cache

The decorator adds two useful methods to any wrapped function. cache_info() shows statistics about how effectively the cache is working, including hits (results returned from cache), misses (results that required actual computation), current size, and maximum size. cache_clear() wipes all stored results, forcing fresh computation on the next call:

# See cache statistics
print(fibonacci.cache_info())
# CacheInfo(hits=33, misses=36, maxsize=128, currsize=36)

# Clear all cached results
fibonacci.cache_clear()

Practical Use Case: Caching API Calls

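One of the most impactful real-world applications of lru_cache is caching network requests. If you are consuming REST APIs with Python and the data does not change frequently (such as currency exchange rates you check every hour), caching the response avoids redundant network calls, reduces latency, and helps you stay under API rate limits. Keep in mind that lru_cache has no expiry: a cached response is returned until it is evicted or you call cache_clear(), so a long-running program must clear the cache itself when the data should be refreshed: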

import requests
from functools import lru_cache

@lru_cache(maxsize=10)
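def fetch_api_data(url):
    print(f"Making real network request to {url}")
    # A timeout keeps a stalled server from hanging the script
    response = requests.get(url, timeout=10)
    # Raise on HTTP errors; a call that raises is not stored in the cache
    response.raise_for_status()
    return response.json()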

# First call hits the network
data = fetch_api_data("https://api.example.com/v1/prices")

# Second call returns instantly from cache
data_again = fetch_api_data("https://api.example.com/v1/prices")
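Because lru_cache itself never expires entries, data that should refresh every hour needs some extra help. One common workaround (not a functools feature) is to add an argument that changes once per time window, so the cache key changes with it. The sketch below assumes hypothetical names (fetch_rates, _get_rates_cached, get_rates) and replaces the network call with a stand-in so it runs offline:

```python
import time
from functools import lru_cache

fetch_log = []  # records each simulated network request

def fetch_rates(base):
    # Hypothetical stand-in for a real network request
    fetch_log.append(base)
    return {"base": base, "fetch_number": len(fetch_log)}

@lru_cache(maxsize=32)
def _get_rates_cached(base, time_bucket):
    # time_bucket is unused in the body; it only makes the cache key change
    return fetch_rates(base)

def get_rates(base, ttl_seconds=3600, now=None):
    # now can be injected so the behavior can be checked without waiting
    current = time.time() if now is None else now
    return _get_rates_cached(base, int(current // ttl_seconds))

first = get_rates("USD", now=0)
second = get_rates("USD", now=1800)  # same hour: served from the cache
third = get_rates("USD", now=3600)   # next hour: fetched again

print(len(fetch_log))  # 2
```

With this approach, entries expire at fixed window boundaries rather than exactly ttl_seconds after they were fetched. Stale entries also remain in memory until LRU eviction removes them, so keep maxsize bounded.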

The Difference Between @lru_cache and @cache

Python 3.9 introduced the @cache decorator as a simpler equivalent of @lru_cache(maxsize=None). The two behave identically: neither one ever discards entries. Use @lru_cache with an explicit maxsize when memory is a concern. Use @cache when you know the input space is finite and small. For anyone doing data-intensive work with NumPy or Pandas inside Docker containers, understanding this memory tradeoff is essential for keeping scripts within their resource limits.
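A quick way to check the equivalence yourself is to compare the statistics of both decorators. This is a small sketch with illustrative function names, and it requires Python 3.9 or later:

```python
from functools import cache, lru_cache  # cache requires Python 3.9+

@cache
def double(n):
    return n * 2

@lru_cache(maxsize=None)
def triple(n):
    return n * 3

for i in range(1000):
    double(i)
    triple(i)

# Both report maxsize=None and keep all 1000 entries
print(double.cache_info())  # CacheInfo(hits=0, misses=1000, maxsize=None, currsize=1000)
print(triple.cache_info())  # CacheInfo(hits=0, misses=1000, maxsize=None, currsize=1000)
```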

When NOT to Use @lru_cache

Despite its power, lru_cache has important limitations. Avoid it on functions that return different values for the same inputs, such as those that read the current time, generate random numbers, or depend on changing global state. Never use it on functions with side effects like writing to a database or sending an email, because the side effect would only happen on the first call and be silently skipped on all cached calls.
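The skipped side effect is easy to reproduce. In this sketch, the sent_messages list stands in for an outbox, database table, or log, and notify is a hypothetical function:

```python
from functools import lru_cache

sent_messages = []  # stands in for an outbox, database table, or log

@lru_cache(maxsize=None)
def notify(user):
    sent_messages.append(user)  # the side effect
    return f"notified {user}"

notify("ana")
notify("ana")  # cache hit: the function body does not run
notify("ana")  # cache hit again

print(sent_messages)  # ['ana']
```

Three calls were made, but the message was "sent" only once.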

Also, all arguments must be hashable. This means you cannot pass a Python list or dictionary directly. If you need to cache a function that processes a list, convert it to a Python tuple before passing it in, since tuples are immutable and therefore hashable.
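A short sketch of both the failure and the fix, using a hypothetical total function:

```python
from functools import lru_cache

@lru_cache(maxsize=64)
def total(values):
    return sum(values)

numbers = [3, 5, 8]

try:
    total(numbers)  # a list cannot be used as a cache key
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")

result = total(tuple(numbers))  # a tuple of hashable items works
print(result)  # 16
```

The conversion only helps if every element is hashable too: a tuple that contains a list still raises TypeError.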

Complete Project Code: Live Benchmark

Here is the full script you can copy and run immediately to see the performance difference on your own machine:

import time
from functools import lru_cache

# Slow version (no cache)
def fib_slow(n):
    if n < 2:
        return n
    return fib_slow(n-1) + fib_slow(n-2)

# Fast version (with lru_cache)
@lru_cache(maxsize=None)
def fib_fast(n):
    if n < 2:
        return n
    return fib_fast(n-1) + fib_fast(n-2)

# Testing the fast version
print("--- Testing Fib(35) WITH lru_cache ---")
start = time.perf_counter()  # perf_counter is the clock intended for timing
print(f"Result: {fib_fast(35)}")
end = time.perf_counter()
print(f"Time elapsed: {end - start:.6f} seconds")

# Testing the slow version
print("\n--- Testing Fib(35) WITHOUT lru_cache ---")
start = time.perf_counter()
print(f"Result: {fib_slow(35)}")
end = time.perf_counter()
print(f"Time elapsed: {end - start:.6f} seconds")

# Checking cache statistics
print(f"\nCache statistics: {fib_fast.cache_info()}")

When you run this, the difference will be immediately obvious. The cached version completes in a tiny fraction of the time. This optimization approach pairs naturally with other techniques: if your script deals with large data files, combining lru_cache with the patterns for reading giant files without freezing Python gives you a dual performance advantage at both the I/O and computation layers.

Frequently Asked Questions

Does @lru_cache use a lot of memory?

It depends on maxsize and on how large each cached result is. With maxsize=None, the cache grows indefinitely. With a fixed number like 128, Python keeps only the 128 most recently used results, which is safe for most applications as long as the individual results are not very large objects.

Can I use @lru_cache on class methods?

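Yes, but with caution. When applied to an instance method, self becomes part of the cache key, so the cache holds a reference to the object and keeps it from being garbage-collected until the entry is evicted or the cache is cleared. The instance must also be hashable. For methods on objects that are created and discarded often, consider a specialized library or a manual caching dictionary instead.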
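The retained reference can be observed with a weak reference. This is a sketch using a hypothetical Report class; gc.collect() is called only to make the result independent of when the interpreter frees memory:

```python
import gc
import weakref
from functools import lru_cache

class Report:
    def __init__(self, rows):
        self.rows = rows

    @lru_cache(maxsize=32)
    def total(self, factor):
        return sum(self.rows) * factor

report = Report((1, 2, 3))
report.total(2)

ref = weakref.ref(report)  # lets us observe the object without keeping it alive
del report
gc.collect()
alive_before_clear = ref() is not None  # True: the cache key still holds self

Report.total.cache_clear()
gc.collect()
alive_after_clear = ref() is not None   # False: nothing references it anymore

print(alive_before_clear, alive_after_clear)  # True False
```

The cache here belongs to the function object stored on the class, so it is shared by all instances and outlives each of them.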

Does the cache persist between script runs?

No. @lru_cache is in-memory only and lives inside a single process. When the program ends, all cached data is lost. For persistent caching across runs, use an external store such as Redis or save results to a local file.
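As a rough illustration of the local-file option, the sketch below keeps results in a JSON file. The helper names, the file name, and the use of the system temp directory are all assumptions made for this example, and it only works for results that JSON can represent:

```python
import json
import tempfile
from pathlib import Path

def load_cache(path):
    # Return the stored results, or an empty dict on the first run
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}

def save_cache(path, store):
    Path(path).write_text(json.dumps(store), encoding="utf-8")

def cached_square(n, store):
    key = str(n)  # JSON object keys are always strings
    if key not in store:
        store[key] = n * n  # placeholder for an expensive computation
    return store[key]

# Hypothetical cache file location
cache_path = Path(tempfile.gettempdir()) / "square_cache.json"

store = load_cache(cache_path)
value = cached_square(12, store)
save_cache(cache_path, store)

print(value)  # 144
```

On the next run, load_cache would return the saved entry and the computation would be skipped. Unlike lru_cache, this sketch has no size limit and no eviction.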

Can I pass a list as an argument to a cached function?

Not directly, because lists are mutable and therefore not hashable. Convert the list to a tuple before passing it to the function, since tuples are immutable and work perfectly as cache keys.

What is the difference between @lru_cache and manual memoization?

Manual memoization typically uses a Python dictionary to store results. @lru_cache saves you from writing that bookkeeping yourself and adds a size limit, LRU eviction, and hit/miss statistics. In CPython the wrapper is implemented in C, so it is usually faster than an equivalent LRU cache written in pure Python.
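For comparison, a minimal hand-written version might look like the sketch below (memo and fib_manual are illustrative names):

```python
memo = {}  # maps n to its already computed Fibonacci value

def fib_manual(n):
    if n in memo:
        return memo[n]
    result = n if n < 2 else fib_manual(n - 1) + fib_manual(n - 2)
    memo[n] = result
    return result

print(fib_manual(35))  # 9227465
print(len(memo))       # 36
```

This works, but the dictionary grows without limit and offers no statistics or eviction, which is the bookkeeping the decorator handles for you.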

Does lru_cache work with async functions (asyncio)?

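The standard lru_cache is not designed for coroutines. Calling an async def function returns a coroutine object, so lru_cache stores that object rather than its result, and a coroutine can only be awaited once. For async functions, use the third-party async-lru library, which provides similar caching behavior for asyncio code.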
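The failure mode can be reproduced with the standard library alone. The sketch below uses hypothetical function names and also shows a simple dictionary-based alternative that caches the awaited result (it does not use async-lru):

```python
import asyncio
from functools import lru_cache

@lru_cache(maxsize=None)
async def load_broken(key):
    await asyncio.sleep(0)
    return key.upper()

_results = {}  # manual cache of awaited results

async def load_uncached(key):
    await asyncio.sleep(0)  # placeholder for real async I/O
    return key.upper()

async def load_memoized(key):
    if key not in _results:
        _results[key] = await load_uncached(key)
    return _results[key]

async def main():
    first = await load_broken("a")
    try:
        await load_broken("a")  # the cache returns the same, already used coroutine
        error = None
    except RuntimeError as exc:
        error = str(exc)
    values = [await load_memoized("a"), await load_memoized("a")]
    return first, error, values

first, error, values = asyncio.run(main())
print(first)   # A
print(error)   # cannot reuse already awaited coroutine
print(values)  # ['A', 'A']
```

The manual version is intentionally simple: if two tasks request the same key at the same time, both may run the underlying call before either stores a result.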

How do I know if the cache is actually working?

Call your_function.cache_info(). If the "hits" count increases as you call the function with repeated arguments, the cache is working. A high ratio of hits to misses means the cache is delivering strong performance benefits.
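A small sketch of computing the hit ratio from those statistics, with a hypothetical slow_lookup function standing in for your own:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def slow_lookup(key):
    return key * 2  # placeholder for expensive work

for key in [1, 2, 1, 1, 3, 2]:
    slow_lookup(key)

info = slow_lookup.cache_info()
total_calls = info.hits + info.misses
hit_ratio = info.hits / total_calls if total_calls else 0.0

print(info)                           # CacheInfo(hits=3, misses=3, maxsize=128, currsize=3)
print(f"hit ratio: {hit_ratio:.0%}")  # hit ratio: 50%
```

If the ratio stays near zero under realistic input, the function is rarely called with repeated arguments and the cache is mostly costing memory.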

Can caching hide bugs?

Yes. If your function depends on changing global variables or external state, the cache may return a stale result instead of a fresh one. Always apply caching only to pure functions where the output depends exclusively on the input arguments.
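Here is a minimal sketch of such a stale result, assuming a hypothetical settings dictionary that the function reads without receiving it as an argument:

```python
from functools import lru_cache

settings = {"tax_rate": 0.10}  # external state the function quietly depends on

@lru_cache(maxsize=None)
def price_with_tax(price):
    return round(price * (1 + settings["tax_rate"]), 2)

before = price_with_tax(100)
settings["tax_rate"] = 0.20
stale = price_with_tax(100)    # cache hit: still computed with the old rate

price_with_tax.cache_clear()
fresh = price_with_tax(100)    # recomputed with the new rate

print(before, stale, fresh)  # 110.0 110.0 120.0
```

Passing the rate as an explicit argument would make it part of the cache key and avoid the problem without any manual clearing.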