Make your app fast again: caching function calls in Python
It’s the usual pattern: you start a new webapp project and the app is responsive and fast. But after a while it gets slow and messy. There are many reasons why this can happen. Setting aside the obvious ones such as »technical debt«, messy code and a lack of concurrency/parallelism, performance bottlenecks can also arise from slow functions that are called frequently with the same (or a small set of different) parameters.
This is where caching can be a big plus, because the return value does not need to be recalculated. However, this only works if the function in question is free of side effects. In particular, for functions that retrieve data from a database you need to make sure that the underlying data has not changed in the meantime. The Python documentation describes it best:
… it doesn’t make sense to cache functions with side-effects, functions that need to create distinct mutable objects on each call, or impure functions such as time() or random().
Later we will touch upon options to mitigate this problem.
Python has various caching libraries, some of them tailored to Flask or other frameworks. Often adapted to specific needs, these solutions might be limiting if used as a general approach to caching.
- To start simple, you can use off-the-shelf caching such as cached_property or lru_cache from the functools module.
- One level up would be cachetools.
- A more sophisticated solution could be Redis-based caching with the redis or hiredis-py packages.
Feel free to combine the different approaches, but here we will focus on the first two points.
(n.b. @cached_property is only available in Python 3.8 or later, but you can just copy the code from the 3.8 functools and remove the copy once you switch to 3.8.)
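If you need to support both versions during the transition, a small compatibility shim keeps the later clean-up to one line (app.compat is a made-up module name holding the copied 3.8 implementation):

try:
    from functools import cached_property  # Python 3.8+
except ImportError:
    from app.compat import cached_property  # copy of the 3.8 implementation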
Python off-the-shelf caching
The example from the functools module looks like this:
import statistics
from functools import cached_property

class DataSet:
    def __init__(self, sequence_of_numbers):
        self._data = sequence_of_numbers

    @cached_property
    def stdev(self):
        return statistics.stdev(self._data)

    @cached_property
    def variance(self):
        return statistics.variance(self._data)
The implementation is straightforward: you just use @cached_property instead of @property. Take care with mutable data:
class DataSetMutable:
    def __init__(self, sequence_of_numbers):
        self.data = sequence_of_numbers

    @cached_property
    def stdev(self):
        return statistics.stdev(self.data)

    @cached_property
    def variance(self):
        return statistics.variance(self.data)

dsm = DataSetMutable(range(10))
assert abs(dsm.stdev - statistics.stdev(range(10))) < 0.00001
dsm.data = range(30, 100)
assert abs(dsm.stdev - statistics.stdev(range(30, 100))) < 0.00001
The second assert will fail: dsm.stdev returns the initially cached value, not one based on the updated data. If you work with functions rather than properties, @cached_property will not help you further. In that case you can use @lru_cache from functools:
from functools import lru_cache

@lru_cache
def count_vowels(sentence):
    sentence = sentence.casefold()
    return sum(sentence.count(vowel) for vowel in 'aeiou')
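You can verify that the cache is doing its job with cache_info(), which every lru_cache-decorated function provides (the example sentence is arbitrary):

count_vowels("the quick brown fox jumps over the lazy dog")
count_vowels("the quick brown fox jumps over the lazy dog")
# the second call is answered from the cache
print(count_vowels.cache_info())
# CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)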
Once you want to get serious with @lru_cache, you will inevitably run into the following problem: @lru_cache can only cache calls with hash()-able, i.e. immutable, arguments (the short example after this list shows the error you get otherwise). How can we still cache functions that take lists and dictionaries as arguments? There are two ways out of the misery:
- only use hashable arguments in your functions
- wrap @lru_cache to deal with mutable data structures
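A minimal sketch of the limitation (the function total is made up for this example):

from functools import lru_cache

@lru_cache
def total(numbers):
    return sum(numbers)

total((1, 2, 3))  # fine, tuples are hashable
total([1, 2, 3])  # TypeError: unhashable type: 'list'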
Freeze everything
One low-hanging fruit is to use immutable data structures. Lists, dictionaries and sets can easily be converted to their immutable counterparts. Luckily the immutable versions behave mostly the same as their mutable counterparts.
def to_immutable(v):
    if isinstance(v, dict):
        return frozendict(v)
    if isinstance(v, list):
        return tuple(v)
    if isinstance(v, set):
        return frozenset(v)
    raise TypeError(f"cannot convert {type(v)} to immutable counterpart")
As an experienced Python developer you might have spotted that frozendict does not exist in vanilla Python. Unfortunately dictionaries do not have an immutable counterpart. In my code I use the following definition to create a frozen dictionary:
class frozendict(dict):
    def __hash__(self):
        return hash(frozenset(self.items()))
By using immutable structures as much as possible (if you know you need to cache them), you can get quite sophisticated caching out of the box.
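As a small sketch of the pattern, freezing a dictionary before passing it along makes the call cacheable with @lru_cache (count_keys is a made-up example function):

from functools import lru_cache

@lru_cache
def count_keys(mapping):
    # mapping must be hashable here, e.g. a frozendict
    return len(mapping)

count_keys(to_immutable({"a": 1, "b": 2}))  # computes and caches
count_keys(to_immutable({"a": 1, "b": 2}))  # equal frozendict, cache hit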
Extending @lru_cache to deal with mutable data structures
Let’s have a look at the second option. Other people have had this problem already (here or here). It usually boils down to prepending some form of serialisation, or a conversion to immutable types.
I found that serialisation with pickle works quite well, so I adapted a previous solution based on JSON:
import functools
import pickle

def pickled_lru_cache(cache):
    def pickled_lru_cache_decorator(func):
        def func_with_pickled_args(pickled_args, pickled_kwargs):
            args = pickle.loads(pickled_args)
            kwargs = pickle.loads(pickled_kwargs)
            return func(*args, **kwargs)

        cached_func = cache(func_with_pickled_args)

        @functools.wraps(func)
        def pickled_cached_func(*args, **kwargs):
            pickled_args = pickle.dumps(args)
            pickled_kwargs = pickle.dumps(kwargs)
            return cached_func(pickled_args, pickled_kwargs)

        pickled_cached_func.cache_info = cached_func.cache_info
        pickled_cached_func.cache_clear = cached_func.cache_clear
        return pickled_cached_func

    return pickled_lru_cache_decorator

@pickled_lru_cache(functools.lru_cache(maxsize=3))
def function_that_takes_long(*args, **kwargs):
    pass
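A quick usage sketch with arbitrary arguments; because the decorator forwards cache_info from the wrapped lru_cache, you can check that the second call with the same mutable arguments is a cache hit:

function_that_takes_long([1, 2, 3], options={"verbose": True})
function_that_takes_long([1, 2, 3], options={"verbose": True})
print(function_that_takes_long.cache_info())
# CacheInfo(hits=1, misses=1, maxsize=3, currsize=1)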
With this approach it is also possible to cache calls with mutable arguments. The extra functionality comes with a trade-off in time/performance: the extra steps will of course take longer than the plain @lru_cache. How big is the difference? A quick check with timethese gives:
# immutable data structures as args
                      Rate  run_py_lru  run_hashable_lru  run_pickled_lru
run_py_lru         55952/s           .              566%             633%
run_hashable_lru    8401/s        -85%                 .              10%
run_pickled_lru     7630/s        -86%               -9%                .

# mutable data structures as args (dict & list)
                       Rate  run_pickled_lru2  run_hashable_lru2
run_pickled_lru2    4332/s                  .               800%
run_hashable_lru2    482/s               -89%                  .
Here run_py_lru is the plain @lru_cache, run_hashable_lru the previous JSON-based solution and run_pickled_lru the adapted version using pickle instead of json.
run_pickled_lru and run_hashable_lru are an order of magnitude (if not more) slower than the plain @lru_cache. So think twice before reaching for a cache that can deal with lists and dictionaries. The performance impact can have two causes: (1) wrapping everything in an extra decorator and (2) the overhead created by more complex hashing. In my case it was mostly (1).
Making lru_cache self-less
A second issue that arises frequently is using @lru_cache in a class/object context. This is tricky: first, the self object may not be hashable, and second, caching self can introduce a memory leak, because the cache holds a reference to the object and may prevent it from being garbage collected. How do we deal with this? And more importantly: should we deal with it?
Based on a stackoverflow post, I played around with extending @lru_cache to omit self from the cache key. It is definitely possible, as you can see in the code.
But should we use this kind of “trick”? My opinion is: probably not. If you start wrapping @lru_cache with another decorator that prevents self from being included in the cache entry hash, you will create code that is hard to maintain and reason about (see also the comment on the omit_self decorator).
What if the object with the cached function changes and this change needs to be incorporated? What if I want a self-aware TTL LRU cache or a totally different cache algorithm? I suggest making it explicit and using e.g. cachetools for that:
import cachetools

class Foobar:
    def __init__(self):
        self.cache = cachetools.LRUCache(maxsize=10)

    @property
    @cachetools.cachedmethod(lambda self: self.cache)
    def cached_method(self):
        # complex calculation or query here
        return ["some", "important", "result"]
This code snippet is basically a slow version of @cached_property. Beware: if you share a cachetools cache across methods, there might be key conflicts leading to wrong results. It is better to use multiple caches:
import cachetools

class Foobar2:
    def __init__(self):
        self.cache1 = cachetools.LRUCache(maxsize=10)
        self.cache2 = cachetools.LRUCache(maxsize=10)

    @cachetools.cachedmethod(lambda self: self.cache1)
    def cached_method1(self, a, b, c):
        # complex calculation or query here
        return ["some", "important", "result", "1"]

    @cachetools.cachedmethod(lambda self: self.cache2)
    def cached_method2(self, a, b, c):
        # complex calculation or query here
        return ["some", "important", "result", "2"]
or tag your cache entries:
import cachetools
from cachetools.keys import hashkey
from functools import partial

class Foobar3:
    def __init__(self):
        self.cache = cachetools.LRUCache(maxsize=10)

    @cachetools.cachedmethod(
        lambda self: self.cache, key=partial(hashkey, "cached_method1")
    )
    def cached_method1(self, a, b, c):
        # complex calculation or query here
        return ["some", "important", "result", "1"]

    @cachetools.cachedmethod(
        lambda self: self.cache, key=partial(hashkey, "cached_method2")
    )
    def cached_method2(self, a, b, c):
        # complex calculation or query here
        return ["some", "important", "result", "2"]
cachetools provides a versatile interface and the possibility to supply your own key functions:
import pickle
import cachetools

def pickled_hashkey(*args, **kwargs):
    pickled_args = pickle.dumps(args)
    pickled_kwargs = pickle.dumps(kwargs)
    return cachetools.keys.hashkey(pickled_args, pickled_kwargs)

cache = cachetools.LRUCache(maxsize=10)

@cachetools.cached(cache, key=pickled_hashkey)
def foobar(a, b, c):
    pass
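With the pickled key function, mutable arguments no longer raise a TypeError. A quick sketch with arbitrary values:

foobar([1, 2], {"x": 3}, c=[4, 5])
foobar([1, 2], {"x": 3}, c=[4, 5])  # served from the cache, no recomputation
print(cache.currsize)  # 1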
The solution with cachetools is more developer-friendly, but introduces additional complexity behind the scenes. In my performance evaluation it was 50% slower than the custom pickled_lru_cache.
On/Off button
For performance evaluation it can be handy to disable caching globally. I usually pack all caching-related functions and classes into one module, from which I import throughout my app/code. This construct gives you the advantage that implementation details only need to be changed in one place:
# app/caching.py
import uuid
import cachetools
import pickle
import functools

CACHING_DISABLED = False

def pickled_hashkey(*args, **kwargs):
    if CACHING_DISABLED:
        # generate a cache miss every time
        return uuid.uuid4()
    pickled_args = pickle.dumps(args)
    pickled_kwargs = pickle.dumps(kwargs)
    return cachetools.keys.hashkey(pickled_args, pickled_kwargs)

if CACHING_DISABLED:
    cached_property = property
else:
    cached_property = functools.cached_property

# app/run.py
from app.caching import cached_property

# instead of
# from functools import cached_property
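A hypothetical consumer then imports everything caching-related from app.caching; the module app/queries.py and the function below are made up for illustration:

# app/queries.py
import cachetools

from app.caching import pickled_hashkey

cache = cachetools.LRUCache(maxsize=100)

@cachetools.cached(cache, key=pickled_hashkey)
def expensive_query(filters):
    # slow database lookup here
    ...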
Conclusion
Caching (in Python) is full of trade-offs and you need to pick the solution that fits your use case. I hope this article helps you find that solution.
Have fun!
See also
- What is memoization and how can I use it in Python? - Stack Overflow
- I also saw python-memoization, but I did not investigate it further.
- Code used in this post