Caching model evaluations
Sherpa contains a rudimentary system for caching the results of model
evaluations on one or more dimensions, in order to reduce the time taken
to evaluate models at the expense of using more memory. The
modelCacher1d() function decorator is applied to the calc() method of
ArithmeticModel models; it then uses the parameter values, evaluation
grid, and integrate setting to look for a value in that model's cache.
If found, the cached value is returned; otherwise the model is evaluated
and the result is added to the cache.
It is hard to predict how effective caching will be, so the defaults are
chosen for typical Sherpa use cases based on some benchmarking, but your
performance may improve with different settings.
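Since the benefit is hard to predict, it can be worth timing your own use case with the cache switched on and off. The following is a minimal sketch, not part of Sherpa itself; it assumes a Gauss1D model and a grid of 10000 points, and compares repeated evaluations with identical parameter values (the best case for the cache):
>>> import timeit
>>> import numpy as np
>>> from sherpa.models.basic import Gauss1D
>>> x = np.linspace(0, 10, 10000)
>>> mdl = Gauss1D()
>>> mdl.cache = 5  # caching on
>>> t_on = timeit.timeit(lambda: mdl(x), number=100)
>>> mdl.cache = 0  # caching off
>>> t_off = timeit.timeit(lambda: mdl(x), number=100)
Comparing t_on and t_off shows whether the cache lookup is cheaper than a full model evaluation for this combination of model and grid.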
What models are cached?
There is unfortunately no easy way to determine whether a model uses the
cache without either viewing the model definition - looking for the
application of @modelCacher1d to the calc method - or by running a test,
as shown below in the example section.
When is the cache useful?
At present most models use the cache by default. It is intended to
improve fit performance, but the actual time saved depends on the model
and the data being fit. Compared to most built-in Sherpa models, models
in the optional XSPEC model library (sherpa.astro.xspec) tend to be more
complex and thus benefit more from caching.
By default, the cache is switched off (mdl.cache=0) for simple models
where the model evaluation is fast, sometimes even faster than the
hashing needed to look up a value in the cache. An obvious example is a
scale or constant model (sherpa.models.basic.Scale1D or
sherpa.models.basic.Const1D), but this also applies to some fast
analytical models such as the XSPEC black body
(sherpa.astro.xspec.XSbbody).
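You can check the default for any model instance by inspecting its cache attribute. The exact defaults may differ between Sherpa versions, but following the description above a constant model starts with the cache switched off while a typical model uses the default size of 5:
>>> from sherpa.models.basic import Const1D, Gauss1D
>>> print(Const1D().cache)
0
>>> print(Gauss1D().cache)
5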
The cache is also switched off for models in 2D, because in 2D we often
have much larger arrays and thus the cache would use more memory. On the
other hand, there is also the potential to save more time, given the
longer computations needed for 2D models. So, for 2D models, the user
needs to explicitly set the cache size to a positive number (mdl.cache=3
for example) to turn on the cache.
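For example, to turn on the cache for a 2D model instance (here a Gauss2D, chosen purely for illustration):
>>> from sherpa.models.basic import Gauss2D
>>> mdl2d = Gauss2D()
>>> mdl2d.cache = 3  # allow up to 3 cached evaluations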
Can I turn off this behavior for other models?
The size of the cache for a specific model component called mdl can be
set to zero (mdl.cache=0) to turn off the cache behavior. This may be
useful if you are evaluating models over a large grid, to save memory.
For a composite model (e.g. a sum of models) you need to set the cache
for each component, as shown in the sketch below.
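For example, a composite model exposes its components through its parts attribute, so the cache can be switched off for every component in a loop; a minimal sketch:
>>> from sherpa.models.basic import Const1D, Gauss1D
>>> combined = Const1D() + Gauss1D()
>>> for part in combined.parts:
...     part.cache = 0  # disable the cache for this component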
When we use the fit method of the UI or the fit method of an optimizer,
it is possible to turn off caching for all models with the cache
parameter, e.g. ui.fit(..., cache=False). This works by iterating over
all the model components and setting the cache attribute to zero. Note
that the opposite is not true: setting cache=True does not turn on the
cache for all model components; it simply leaves each component at its
previous setting. This is because some models may not work with caching
at all and need to stay at cache=0 at all times. To allow caching again,
the cache attribute has to be manually set to a positive number for
every model that should use the cache, as in the sketch below.
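For example, caching could be re-enabled after such a fit by looping over the components again and restoring a positive cache size. This is a minimal sketch, reusing the combined model from above and assuming the default size of 5 is wanted:
>>> for part in combined.parts:
...     part.cache = 5  # restore a positive cache size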
How do I set the default in my own models to use or not use the cache?
In order to be cacheable at all, a model must have the @modelCacher1d
decorator applied to its calc method. The default for an ArithmeticModel
is a cache size of 5, but this can be changed by setting the cache
attribute in the model's constructor. For example, here we deactivate
the cache by default by setting the value to 0, but we still decorate
the calc method so that a user can switch it back on for individual
instances of the model:
>>> from sherpa.models.model import ArithmeticModel, modelCacher1d, Parameter
>>> class MyModel(ArithmeticModel):
...     def __init__(self, name='mymodel'):
...         self.offset = Parameter(name, 'offset', 0)
...         super().__init__(name, (self.offset,))
...         self.cache = 0  # caching is off by default for this model
...
...     @modelCacher1d
...     def calc(self, p, *args, **kwargs):
...         # do something (here: add the offset to the grid values)
...         return p[0] + args[0]
>>> m = MyModel()
>>> m.cache = 3  # use the cache in model instance m
Do not set cache = 0 as a class attribute. If you use modelCacher1d,
cache is actually a property that does other things when it is set
(e.g. resetting the cache contents). Setting a class attribute will lead
to errors when the decorated calc method is called, i.e. the following
will not work:
>>> class MyModel(ArithmeticModel):
...     cache = 0  # DO NOT DO THIS!
How does the cache work?
The parameter values, integrate setting, and grid values are used to
create a unique token - the SHA256 hash of the values - which is used to
look up a value in the _cache dictionary. If it exists then the stored
value is returned, otherwise the model is evaluated and the result is
added to the _cache dictionary. In order to keep the cache size small,
the oldest element in the cache is removed when the number of entries
becomes larger than the cache attribute (the default value for this
attribute is 5).
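The following sketch illustrates the idea behind such a token; the exact byte layout used by Sherpa may differ, but conceptually the parameter values, integrate setting, and grid are hashed together into a single key:
>>> import hashlib
>>> import numpy as np
>>> pars = np.asarray([1.0, 2.0])
>>> integrate = np.asarray([1])
>>> grid = np.linspace(0, 10, 100)
>>> token = hashlib.sha256(pars.tobytes() + integrate.tobytes() + grid.tobytes()).digest()
>>> len(token)  # a SHA256 digest is 32 bytes long
32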
Examples
Checking the cache
In the following example we evaluate a model and check the _cache
attribute, and see that it has been updated by the model evaluation.
>>> from sherpa.models.basic import Box1D
>>> m = Box1D()
>>> m.xlow = 1.5
>>> m.xhi = 4.5
>>> print(m._cache)
{}
>>> print(m([1, 2, 3, 4, 5, 6]))
[0. 1. 1. 1. 0. 0.]
>>> print(m._cache)
{b'<random byte string>': array([0., 1., 1., 1., 0., 0.])}
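Evaluating the model again, with the same parameter values and on the same grid, produces a cache hit, so the result is returned from the cache and the number of entries does not grow:
>>> print(m([1, 2, 3, 4, 5, 6]))
[0. 1. 1. 1. 0. 0.]
>>> print(len(m._cache))
1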
Fit and the startup method
The fit method can also be seen to use the cache (although in this case it isn’t worth it!). First we set up the data:
>>> import numpy as np
>>> from sherpa.data import Data1D
>>> x = np.arange(0, 3)
>>> y = [2, 0.3, 0.02]
>>> data = Data1D('example', x, y)
A simple model is used:
>>> from sherpa.models.basic import Exp10
>>> mdl = Exp10()
>>> mdl.offset.frozen = True
>>> mdl.offset = 1.0
>>> mdl.coeff.frozen = True
>>> mdl.coeff = -1.0
>>> print(mdl.ampl.val)
1.0
>>> print(mdl._cache)
{}
The fit only takes 4 iterations, so the cache doesn't help here! Note
that the startup and teardown methods are called automatically by
fit():
>>> from sherpa.fit import Fit
>>> f = Fit(data, mdl)
>>> result = f.fit()
>>> print(result.format())
Method = levmar
Statistic = chi2gehrels
Initial fit statistic = 9.178
Final fit statistic = 0.00239806 at function evaluation 4
Data points = 3
Degrees of freedom = 2
Probability [Q-value] = 0.998802
Reduced statistic = 0.00119903
Change in statistic = 9.1756
exp10.ampl 0.201694 +/- 0.263543
The cache contains 4 elements, which we can display:
>>> print(len(mdl._cache))
4
>>> for v in mdl._cache.values():
...     print(v)
...
[10. 1. 0.1]
[10.00345267 1.00034527 0.10003453]
[2.01694277 0.20169428 0.02016943]
[2.01763916 0.20176392 0.02017639]
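As noted above, assigning to the cache attribute also resets the stored entries, so the cache can be cleared (and resized) in one step:
>>> mdl.cache = 5
>>> print(len(mdl._cache))
0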