Merge pull request #213 from khaeru/caching-backend
Add CachingBackend, address performance of JDBCBackend.item_get_elements
khaeru authored Nov 22, 2019
2 parents 7a9c52e + c49c416 commit f68bbeb
Showing 12 changed files with 285 additions and 161 deletions.
3 changes: 2 additions & 1 deletion RELEASE_NOTES.md
@@ -20,7 +20,8 @@ Configuration for ixmp and its storage backends has been streamlined.

- [#189](https://github.com/iiasa/ixmp/pull/189): Identify and load Scenarios using URLs.
- [#182](https://github.com/iiasa/ixmp/pull/182),
[#200](https://github.com/iiasa/ixmp/pull/200): Add new Backend, Model APIs and JDBCBackend, GAMSModel classes.
[#200](https://github.com/iiasa/ixmp/pull/200),
[#213](https://github.com/iiasa/ixmp/pull/213): Add new Backend, Model APIs and CachingBackend, JDBCBackend, GAMSModel classes.
- [#188](https://github.com/iiasa/ixmp/pull/188),
[#195](https://github.com/iiasa/ixmp/pull/195): Enhance reporting.
- [#177](https://github.com/iiasa/ixmp/pull/177): add ability to pass `gams_args` through `Scenario.solve()`
43 changes: 34 additions & 9 deletions doc/source/api-backend.rst
@@ -22,13 +22,8 @@ Provided backends

JDBCBackend supports:

- ``dbtype='HSQLDB'``: HyperSQL databases in local files.
- Remote databases. This is accomplished by creating a :class:`ixmp.Platform` with the ``dbprops`` argument pointing to a file that specifies JDBC information. For instance::

jdbc.driver = oracle.jdbc.driver.OracleDriver
jdbc.url = jdbc:oracle:thin:@database-server.example.com:1234:SCHEMA
jdbc.user = USER
jdbc.pwd = PASSWORD
- Databases in local files (HyperSQL) using ``driver='hsqldb'`` and the *path* argument.
- Remote, Oracle databases using ``driver='oracle'`` and the *url*, *username* and *password* arguments.

It has the following methods that are not part of the overall :class:`Backend` API:

@@ -38,11 +33,30 @@ Provided backends
read_gdx
write_gdx

JDBCBackend caches values in memory to improve performance when repeatedly reading data from the same items with :meth:`.par`, :meth:`.equ`, or :meth:`.var`.

.. tip:: If repeatedly accessing the same item with different *filters*:

1. First, access the item by calling e.g. :meth:`.par` *without* any filters.
This causes the full contents of the item to be loaded into cache.
2. Then, access by making multiple :meth:`.par` calls with different *filters* arguments.
The cached value is filtered and returned without further access to the database.

.. tip:: Modifying an item by adding or deleting elements invalidates its cache.
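
The access pattern in the tips above can be illustrated with a minimal, self-contained sketch. It does not use ixmp: `ToyStore`, `fetch_all`, `db_reads`, and the list-of-dicts rows are hypothetical stand-ins for a backend and its database, and the real JDBCBackend may organize its cache differently.

```python
# Hypothetical stand-in for a backend with a slow data source; not ixmp code.
class ToyStore:
    def __init__(self, rows):
        self._rows = rows   # pretend this lives in a database
        self._cache = {}    # item name -> full, unfiltered contents
        self.db_reads = 0   # number of (slow) database accesses

    def fetch_all(self, name):
        """Simulate a slow read of the full item from the database."""
        self.db_reads += 1
        return list(self._rows[name])

    def par(self, name, filters=None):
        """Return rows of item *name*, optionally filtered by dimension."""
        if name not in self._cache:
            self._cache[name] = self.fetch_all(name)
        rows = self._cache[name]
        if filters:
            rows = [
                r for r in rows
                if all(r[dim] in allowed for dim, allowed in filters.items())
            ]
        return list(rows)

    def add_par(self, name, row):
        """Modify the item; this invalidates its cached contents."""
        self._rows[name].append(row)
        self._cache.pop(name, None)


store = ToyStore({"demand": [
    {"node": "AT", "year": 2020, "value": 1.0},
    {"node": "AT", "year": 2030, "value": 1.5},
    {"node": "DE", "year": 2020, "value": 4.0},
]})

store.par("demand")                                     # 1. load the full item once
at = store.par("demand", filters={"node": ["AT"]})      # 2. filtered from the cache
y2020 = store.par("demand", filters={"year": [2020]})   #    filtered from the cache
```

In this toy, `store.db_reads` stays at 1 across all three calls; calling `store.add_par(...)` drops the cached item, so the next `par()` call reads from the simulated database again.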

.. automethod:: ixmp.backend.jdbc.start_jvm

Backend API
-----------

.. currentmodule:: ixmp.backend.base

.. autosummary::

ixmp.backend.FIELDS
ixmp.backend.base.Backend
ixmp.backend.base.CachingBackend

- :class:`ixmp.Platform` implements a *user-friendly* API for scientific programming.
This means its methods can take many types of arguments, then check and transform them in a way that provides modeler-users with easy, intuitive workflows.
- In contrast, :class:`Backend` has a *very simple* API that accepts arguments and returns values in basic Python data types and structures.
@@ -51,9 +65,11 @@ Backend API
- :class:`Platform <ixmp.Platform>` code is not affected by where and how data is stored; it merely handles user arguments and then usually makes a single :class:`Backend` call.
- :class:`Backend` code does not need to perform argument checking; it only needs to store and retrieve data reliably.

.. autodata:: ixmp.backend.FIELDS
- Additional Backends may inherit from :class:`Backend` or
:class:`CachingBackend`.

.. currentmodule:: ixmp.backend.base

.. autodata:: ixmp.backend.FIELDS

.. autoclass:: ixmp.backend.base.Backend
:members:
@@ -143,3 +159,12 @@ Backend API
cat_get_elements
cat_list
cat_set_elements


.. autoclass:: ixmp.backend.base.CachingBackend
:members:
:private-members:

CachingBackend stores cached values for multiple :class:`.TimeSeries`/:class:`Scenario` objects, and for multiple values of a *filters* argument.

Subclasses **must** call :meth:`cache`, :meth:`cache_get`, and :meth:`cache_invalidate` as appropriate to manage the cache; CachingBackend does not enforce any such logic.
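
A minimal sketch of what this contract could look like in a subclass follows. It is self-contained and does not import ixmp: `MiniCachingBackend` is a simplified stand-in for the three cache methods named above, while `DictBackend`, `get_elements`, `add_element`, and `reads` are hypothetical names chosen only for illustration.

```python
from copy import copy
import json


class MiniCachingBackend:
    """Simplified stand-in for the cache methods; not the ixmp class."""

    def __init__(self):
        self._cache = {}

    @staticmethod
    def _cache_key(ts, ix_type, name, filters=None):
        if not filters:
            return (id(ts), ix_type, name)
        return (id(ts), ix_type, name, json.dumps(sorted(filters.items())))

    def cache(self, ts, ix_type, name, filters, value):
        self._cache[self._cache_key(ts, ix_type, name, filters)] = value

    def cache_get(self, ts, ix_type, name, filters):
        # Raises KeyError on a cache miss
        return copy(self._cache[self._cache_key(ts, ix_type, name, filters)])

    def cache_invalidate(self, ts, ix_type, name):
        # Drop the filtered and unfiltered entries for one item
        prefix = self._cache_key(ts, ix_type, name)
        for k in [k for k in self._cache if k[:3] == prefix]:
            del self._cache[k]


class DictBackend(MiniCachingBackend):
    """Hypothetical subclass; data lives in a dict instead of a database."""

    def __init__(self):
        super().__init__()
        self._data = {}   # (id(ts), name) -> list of rows
        self.reads = 0    # number of slow-path reads

    def get_elements(self, ts, name, filters=None):
        try:
            return self.cache_get(ts, "par", name, filters)  # 1. try the cache
        except KeyError:
            pass
        self.reads += 1                                       # 2. slow path
        rows = self._data.get((id(ts), name), [])
        if filters:
            rows = [r for r in rows
                    if all(r[d] in v for d, v in filters.items())]
        rows = list(rows)
        self.cache(ts, "par", name, filters, rows)            # 3. store result
        return copy(rows)

    def add_element(self, ts, name, row):
        self._data.setdefault((id(ts), name), []).append(row)
        self.cache_invalidate(ts, "par", name)                # 4. invalidate


backend = DictBackend()
ts = object()  # any object can stand in for a TimeSeries here
backend.add_element(ts, "demand", {"node": "AT"})
first = backend.get_elements(ts, "demand")    # slow path, then cached
second = backend.get_elements(ts, "demand")   # served from the cache
```

The point of the sketch is the division of labour: the base class only stores and looks up values, and the subclass decides when to read, store, and invalidate.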
1 change: 0 additions & 1 deletion doc/source/api-python.rst
@@ -124,7 +124,6 @@ Scenario
add_par
add_set
change_scalar
clear_cache
clone
equ
equ_list
26 changes: 12 additions & 14 deletions doc/source/reporting.rst
@@ -136,24 +136,22 @@ Others:
>>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
True

Notes
-----
A Key has the same hash, and compares equal to its ``str()``. ``repr(key)``
prints the Key in angle brackets ('<>') to signify it is a Key object.
Some notes:

>>> repr(k1)
<foo:a-b-c>
- A Key has the same hash, and compares equal to its ``str()``.
``repr(key)`` prints the Key in angle brackets ('<>') to signify it is a Key object.

Keys are *immutable*: the properties :attr:`name`, :attr:`dims`, and
:attr:`tag` are read-only, and the methods :meth:`append`, :meth:`drop`, and
:meth:`add_tag` return *new* Key objects.
>>> repr(k1)
<foo:a-b-c>

Keys may be generated concisely by defining a convenience method:
- Keys are *immutable*: the properties :attr:`name`, :attr:`dims`, and :attr:`tag` are read-only, and the methods :meth:`append`, :meth:`drop`, and :meth:`add_tag` return *new* Key objects.

>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b c')
foo:a-b-c
- Keys may be generated concisely by defining a convenience method:

>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b c')
foo:a-b-c


Computations
107 changes: 107 additions & 0 deletions ixmp/backend/base.py
@@ -1,4 +1,6 @@
from abc import ABC, abstractmethod
from copy import copy
import json

from ixmp.core import TimeSeries, Scenario

@@ -759,3 +761,108 @@ def cat_set_elements(self, ms: Scenario, name, cat, keys, is_unique):
-------
None
"""


class CachingBackend(Backend):
    """Backend with additional features for caching data."""

    #: Cache of values. Keys are given by :meth:`_cache_key`; values depend on
    #: the subclass' usage of the cache.
    _cache = {}

    #: Count of number of times a value was retrieved from cache successfully
    #: using :meth:`cache_get`.
    _cache_hit = {}

    def __init__(self):
        super().__init__()

        # Empty the cache
        self._cache = {}
        self._cache_hit = {}

    @classmethod
    def _cache_key(cls, ts, ix_type, name, filters=None):
        """Return a hashable cache key.

        ixmp *filters* (a :class:`dict` of :class:`list`) are converted to a
        unique id that is hashable.

        Parameters
        ----------
        ts : .TimeSeries
        ix_type : str
        name : str
        filters : dict

        Returns
        -------
        tuple
            A hashable key with 3 elements for *ts*, *ix_type*, and *name*,
            plus a 4th element for *filters* if any are given.
        """
        ts = id(ts)
        if filters is None or len(filters) == 0:
            return (ts, ix_type, name)
        else:
            # Convert filters into a hashable object
            filters = hash(json.dumps(sorted(filters.items())))
            return (ts, ix_type, name, filters)

    def cache_get(self, ts, ix_type, name, filters):
        """Retrieve value from cache.

        The value in :attr:`_cache` is copied to avoid cached values being
        modified by user code. :attr:`_cache_hit` is incremented.

        Raises
        ------
        KeyError
            If the key for *ts*, *ix_type*, *name* and *filters* is not in the
            cache.
        """
        key = self._cache_key(ts, ix_type, name, filters)

        if key in self._cache:
            self._cache_hit[key] = self._cache_hit.setdefault(key, 0) + 1
            return copy(self._cache[key])
        else:
            raise KeyError(ts, ix_type, name, filters)

    def cache(self, ts, ix_type, name, filters, value):
        """Store *value* in cache.

        Returns
        -------
        bool
            :obj:`True` if the key was already in the cache and its value was
            overwritten.
        """
        key = self._cache_key(ts, ix_type, name, filters)

        refreshed = key in self._cache
        self._cache[key] = value

        return refreshed

    def cache_invalidate(self, ts, ix_type=None, name=None, filters=None):
        """Invalidate cached values.

        With all arguments given, a single key/value is removed from the
        cache. Otherwise, multiple keys/values are removed:

        - *ts* only: all cached values associated with the :class:`.TimeSeries`
          or :class:`.Scenario` object.
        - *ts*, *ix_type*, and *name*: all cached values associated with the
          ixmp item, whether filtered or unfiltered.
        """
        key = self._cache_key(ts, ix_type, name, filters)

        if filters is None:
            i = slice(1) if (ix_type is name is None) else slice(3)
            to_remove = filter(lambda k: k[i] == key[i], self._cache.keys())
        else:
            to_remove = [key]

        # Materialize the keys first so the dict is not modified while being
        # iterated; tolerate a key that is not (or no longer) in the cache.
        for k in list(to_remove):
            self._cache.pop(k, None)
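
To make the key scheme easier to follow, here is a small standalone sketch that mirrors the logic of `_cache_key` outside the class. The free function `cache_key` and the sample item names are illustrative assumptions, not part of ixmp.

```python
import json


def cache_key(ts, ix_type, name, filters=None):
    """Standalone copy of the key logic, for illustration only."""
    ts_id = id(ts)
    if not filters:
        return (ts_id, ix_type, name)
    # Sorting the items makes the key independent of dict insertion order
    return (ts_id, ix_type, name, hash(json.dumps(sorted(filters.items()))))


ts = object()  # any object can stand in for a TimeSeries here

k_unfiltered = cache_key(ts, "par", "demand")
k_a = cache_key(ts, "par", "demand", {"node": ["AT"], "year": [2020]})
k_b = cache_key(ts, "par", "demand", {"year": [2020], "node": ["AT"]})
k_other = cache_key(ts, "var", "ACT")

# The same filters in a different insertion order give the same key
same_key = k_a == k_b

# Prefix comparison as used for invalidation: the first 3 elements identify
# the item, the first element alone identifies the TimeSeries/Scenario
same_item = k_a[slice(3)] == k_unfiltered[slice(3)]
same_ts = k_other[slice(1)] == k_a[slice(1)]
other_item = k_other[slice(3)] != k_a[slice(3)]
```

Note that only the order of the filter dimensions is normalized. The order of values within each list is preserved by `json.dumps`, so `{"node": ["AT", "DE"]}` and `{"node": ["DE", "AT"]}` produce different keys and are cached separately.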