Add CachingBackend, address performance of JDBCBackend.item_get_elements #213

Merged: 13 commits, Nov 22, 2019
3 changes: 2 additions & 1 deletion RELEASE_NOTES.md
@@ -20,7 +20,8 @@ Configuration for ixmp and its storage backends has been streamlined.

- [#189](https://github.com/iiasa/ixmp/pull/189): Identify and load Scenarios using URLs.
- [#182](https://github.com/iiasa/ixmp/pull/182),
[#200](https://github.com/iiasa/ixmp/pull/200): Add new Backend, Model APIs and JDBCBackend, GAMSModel classes.
[#200](https://github.com/iiasa/ixmp/pull/200),
[#213](https://github.com/iiasa/ixmp/pull/213): Add new Backend, Model APIs and CachingBackend, JDBCBackend, GAMSModel classes.
- [#188](https://github.com/iiasa/ixmp/pull/188),
[#195](https://github.com/iiasa/ixmp/pull/195): Enhance reporting.
- [#177](https://github.com/iiasa/ixmp/pull/177): add ability to pass `gams_args` through `Scenario.solve()`
43 changes: 34 additions & 9 deletions doc/source/api-backend.rst
@@ -22,13 +22,8 @@ Provided backends

JDBCBackend supports:

- ``dbtype='HSQLDB'``: HyperSQL databases in local files.
- Remote databases. This is accomplished by creating a :class:`ixmp.Platform` with the ``dbprops`` argument pointing to a file that specifies JDBC information. For instance::

jdbc.driver = oracle.jdbc.driver.OracleDriver
jdbc.url = jdbc:oracle:thin:@database-server.example.com:1234:SCHEMA
jdbc.user = USER
jdbc.pwd = PASSWORD
- Databases in local files (HyperSQL) using ``driver='hsqldb'`` and the *path* argument.
- Remote Oracle databases using ``driver='oracle'`` and the *url*, *username* and *password* arguments.

It has the following methods that are not part of the overall :class:`Backend` API:

@@ -38,11 +33,30 @@ Provided backends
read_gdx
write_gdx

JDBCBackend caches values in memory to improve performance when repeatedly reading data from the same items with :meth:`.par`, :meth:`.equ`, or :meth:`.var`.

.. tip:: If repeatedly accessing the same item with different *filters*:

   1. First, access the item by calling e.g. :meth:`.par` *without* any filters.
      This causes the full contents of the item to be loaded into cache.
   2. Then, access by making multiple :meth:`.par` calls with different *filters* arguments.
      The cached value is filtered and returned without further access to the database.

.. tip:: Modifying an item by adding or deleting elements invalidates its cache.
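
The access pattern in the first tip can be illustrated with a small, self-contained sketch. It does not use ixmp: ``fetch_from_database``, ``par`` and the sample rows are hypothetical stand-ins, and the real JDBCBackend may filter its cached data differently.

```python
# Toy illustration of "load once, then filter the cached value".
# All names here are hypothetical; this is not ixmp code.

DB_CALLS = 0  # counts simulated database accesses

ROWS = [
    {"node": "A", "year": 2020, "value": 1.0},
    {"node": "A", "year": 2030, "value": 2.0},
    {"node": "B", "year": 2020, "value": 3.0},
]

_cache = {}


def fetch_from_database(name):
    """Simulate an expensive read of the full contents of an item."""
    global DB_CALLS
    DB_CALLS += 1
    return [dict(row) for row in ROWS]


def par(name, filters=None):
    """Return rows of item *name*, optionally restricted by *filters*."""
    if name not in _cache:
        # Only reached on the first access to this item
        _cache[name] = fetch_from_database(name)
    rows = _cache[name]
    for dim, allowed in (filters or {}).items():
        rows = [r for r in rows if r[dim] in allowed]
    # Copy so callers cannot modify the cached rows
    return [dict(r) for r in rows]


par("demand")  # 1. unfiltered access fills the cache
only_a = par("demand", {"node": ["A"]})  # 2. served from the cache
a_2020 = par("demand", {"node": ["A"], "year": [2020]})
```

In this sketch the simulated database is read once, however many filtered calls follow.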

.. automethod:: ixmp.backend.jdbc.start_jvm

Backend API
-----------

.. currentmodule:: ixmp.backend.base

.. autosummary::

ixmp.backend.FIELDS
ixmp.backend.base.Backend
ixmp.backend.base.CachingBackend

- :class:`ixmp.Platform` implements a *user-friendly* API for scientific programming.
This means its methods can take many types of arguments, then check and transform them in a way that provides modeler-users with easy, intuitive workflows.
- In contrast, :class:`Backend` has a *very simple* API that accepts arguments and returns values in basic Python data types and structures.
@@ -51,9 +65,11 @@ Backend API
- :class:`Platform <ixmp.Platform>` code is not affected by where and how data is stored; it merely handles user arguments and then makes, usually, a single :class:`Backend` call.
- :class:`Backend` code does not need to perform argument checking; it only needs to store and retrieve data reliably.

.. autodata:: ixmp.backend.FIELDS
- Additional Backends may inherit from :class:`Backend` or
:class:`CachingBackend`.

.. currentmodule:: ixmp.backend.base

.. autodata:: ixmp.backend.FIELDS

.. autoclass:: ixmp.backend.base.Backend
:members:
@@ -143,3 +159,12 @@ Backend API
cat_get_elements
cat_list
cat_set_elements


.. autoclass:: ixmp.backend.base.CachingBackend
:members:
:private-members:

CachingBackend stores cache values for multiple :class:`.TimeSeries`/:class:`Scenario` objects, and for multiple values of a *filters* argument.

Subclasses **must** call :meth:`cache`, :meth:`cache_get`, and :meth:`cache_invalidate` as appropriate to manage the cache; CachingBackend does not enforce any such logic.
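
A minimal sketch of how a subclass might honour this contract. ``MiniCachingBackend`` is a simplified stand-in written for this example (it keys the cache on ``(id(ts), ix_type, name)`` only and ignores filters), and ``ToyBackend``, ``get_elements`` and ``add_element`` are hypothetical names; none of this is the JDBCBackend implementation.

```python
from copy import copy


class MiniCachingBackend:
    """Simplified stand-in for CachingBackend (unfiltered keys only)."""

    def __init__(self):
        self._cache = {}

    def cache(self, ts, ix_type, name, value):
        self._cache[(id(ts), ix_type, name)] = value

    def cache_get(self, ts, ix_type, name):
        # Raises KeyError on a cache miss
        return copy(self._cache[(id(ts), ix_type, name)])

    def cache_invalidate(self, ts, ix_type, name):
        self._cache.pop((id(ts), ix_type, name), None)


class ToyBackend(MiniCachingBackend):
    """Hypothetical subclass that manages the cache itself."""

    def __init__(self):
        super().__init__()
        self.db = {}  # simulated storage: (ix_type, name) -> list
        self.db_reads = 0  # number of simulated database reads

    def get_elements(self, ts, ix_type, name):
        try:
            return self.cache_get(ts, ix_type, name)
        except KeyError:
            pass  # cache miss: fall through to the database
        self.db_reads += 1
        value = list(self.db.get((ix_type, name), []))
        self.cache(ts, ix_type, name, value)
        return copy(value)

    def add_element(self, ts, ix_type, name, element):
        self.db.setdefault((ix_type, name), []).append(element)
        # Stored data changed, so the cached value is stale
        self.cache_invalidate(ts, ix_type, name)


backend = ToyBackend()
scenario = object()  # placeholder for a Scenario

backend.add_element(scenario, "par", "demand", ("A", 1.0))
first = backend.get_elements(scenario, "par", "demand")  # reads the database
second = backend.get_elements(scenario, "par", "demand")  # served from cache
backend.add_element(scenario, "par", "demand", ("B", 2.0))
third = backend.get_elements(scenario, "par", "demand")  # reads again
```

The read path tries the cache first and stores what it fetched on a miss; the write path invalidates. Forgetting the invalidation step would leave ``third`` equal to the stale first result.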
1 change: 0 additions & 1 deletion doc/source/api-python.rst
@@ -124,7 +124,6 @@ Scenario
add_par
add_set
change_scalar
clear_cache
clone
equ
equ_list
26 changes: 12 additions & 14 deletions doc/source/reporting.rst
@@ -136,24 +136,22 @@ Others:
>>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
True

Notes
-----
A Key has the same hash, and compares equal to its ``str()``. ``repr(key)``
prints the Key in angle brackets ('<>') to signify it is a Key object.
Some notes:

>>> repr(k1)
<foo:a-b-c>
- A Key has the same hash as, and compares equal to, its ``str()``.
``repr(key)`` prints the Key in angle brackets ('<>') to signify it is a Key object.

Keys are *immutable*: the properties :attr:`name`, :attr:`dims`, and
:attr:`tag` are read-only, and the methods :meth:`append`, :meth:`drop`, and
:meth:`add_tag` return *new* Key objects.
>>> repr(k1)
<foo:a-b-c>

Keys may be generated concisely by defining a convenience method:
- Keys are *immutable*: the properties :attr:`name`, :attr:`dims`, and :attr:`tag` are read-only, and the methods :meth:`append`, :meth:`drop`, and :meth:`add_tag` return *new* Key objects.

>>> def foo(dims):
>>> return Key('foo', dims.split())
>>> foo('a b c')
foo:a-b-c
- Keys may be generated concisely by defining a convenience method:

  >>> def foo(dims):
  ...     return Key('foo', dims.split())
  >>> foo('a b c')
  <foo:a-b-c>
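
The properties in these notes can be illustrated with a toy class. ``MiniKey`` below is a hypothetical stand-in, not the ixmp ``Key`` implementation; it has no tag support and only mimics the behaviours listed above.

```python
class MiniKey:
    """Toy immutable key with a name and dimensions (no tag support)."""

    def __init__(self, name, dims=()):
        self._name = name
        self._dims = tuple(dims)

    @property
    def name(self):
        return self._name  # read-only: no setter is defined

    @property
    def dims(self):
        return self._dims

    def drop(self, *dims):
        # Return a new object instead of modifying this one
        return MiniKey(self._name, [d for d in self._dims if d not in dims])

    def __str__(self):
        return f"{self._name}:{'-'.join(self._dims)}"

    def __repr__(self):
        return f"<{self}>"

    def __hash__(self):
        return hash(str(self))

    def __eq__(self, other):
        return str(self) == str(other)


k1 = MiniKey("foo", "a b c".split())
k3 = k1.drop("a", "c")

lookup = {"foo:a-b-c": "found"}
value = lookup[k1]  # works because hash and equality match the str
```

Because hash and equality both defer to the string form, such an object and its string can be used interchangeably as dictionary keys.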


Computations
107 changes: 107 additions & 0 deletions ixmp/backend/base.py
@@ -1,4 +1,6 @@
from abc import ABC, abstractmethod
from copy import copy
import json

from ixmp.core import TimeSeries, Scenario

@@ -759,3 +761,108 @@ def cat_set_elements(self, ms: Scenario, name, cat, keys, is_unique):
-------
None
"""


class CachingBackend(Backend):
    """Backend with additional features for caching data."""

    #: Cache of values. Keys are given by :meth:`_cache_key`; values depend on
    #: the subclass' usage of the cache.
    _cache = {}

    #: Count, per cache key, of the times a value was successfully retrieved
    #: using :meth:`cache_get`.
    _cache_hit = {}

    def __init__(self):
        super().__init__()

        # Start with empty, per-instance dicts so that instances do not share
        # the class-level ones
        self._cache = {}
        self._cache_hit = {}

    @classmethod
    def _cache_key(cls, ts, ix_type, name, filters=None):
        """Return a hashable cache key.

        ixmp *filters* (a :class:`dict` of :class:`list`) are converted to a
        hashable id.

        Parameters
        ----------
        ts : .TimeSeries
        ix_type : str
        name : str
        filters : dict

        Returns
        -------
        tuple
            A hashable key with 3 elements for *ts*, *ix_type*, and *name*,
            plus a 4th element for *filters* if any are given.
        """
        # Identify the TimeSeries by object identity
        ts = id(ts)
        if filters is None or len(filters) == 0:
            return (ts, ix_type, name)
        else:
            # Convert filters into a hashable object
            filters = hash(json.dumps(sorted(filters.items())))
            return (ts, ix_type, name, filters)

    def cache_get(self, ts, ix_type, name, filters):
        """Retrieve value from cache.

        The value in :attr:`_cache` is copied to avoid cached values being
        modified by user code. :attr:`_cache_hit` is incremented.

        Raises
        ------
        KeyError
            If the key for *ts*, *ix_type*, *name* and *filters* is not in the
            cache.
        """
        key = self._cache_key(ts, ix_type, name, filters)

        if key in self._cache:
            self._cache_hit[key] = self._cache_hit.setdefault(key, 0) + 1
            return copy(self._cache[key])
        else:
            raise KeyError(ts, ix_type, name, filters)

    def cache(self, ts, ix_type, name, filters, value):
        """Store *value* in cache.

        Returns
        -------
        bool
            :obj:`True` if the key was already in the cache and its value was
            overwritten.
        """
        key = self._cache_key(ts, ix_type, name, filters)

        refreshed = key in self._cache
        self._cache[key] = value

        return refreshed

    def cache_invalidate(self, ts, ix_type=None, name=None, filters=None):
        """Invalidate cached values.

        With all arguments given, a single key/value is removed from the
        cache. Otherwise, multiple keys/values are removed:

        - *ts* only: all cached values associated with the :class:`.TimeSeries`
          or :class:`.Scenario` object.
        - *ts*, *ix_type*, and *name*: all cached values associated with the
          ixmp item, whether filtered or unfiltered.
        """
        key = self._cache_key(ts, ix_type, name, filters)

        if filters is None:
            # Match on the first element (ts) only, or on the first three
            # (ts, ix_type, name), so that filtered entries are removed too
            i = slice(1) if (ix_type is name is None) else slice(3)
            to_remove = [k for k in self._cache if k[i] == key[i]]
        else:
            to_remove = [key]

        for k in to_remove:
            self._cache.pop(k)
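
To see how the filters element of the key behaves, the following standalone sketch repeats the hashing step from `_cache_key` outside the class. `filters_id` is a name introduced here for illustration only.

```python
import json


def filters_id(filters):
    """Hash *filters* the same way as the last branch of _cache_key."""
    return hash(json.dumps(sorted(filters.items())))


# The order of dimensions in the dict does not matter, because the items are
# sorted before serialising...
same = filters_id({"node": ["A"], "year": [2020]}) == filters_id(
    {"year": [2020], "node": ["A"]}
)

# ...but the order of labels within one dimension is preserved in the JSON
# string, so these two filters serialise differently and would almost
# certainly be cached under different keys, although they select the same data.
json_ab = json.dumps(sorted({"node": ["A", "B"]}.items()))
json_ba = json.dumps(sorted({"node": ["B", "A"]}.items()))
```

A caller that wants repeated filtered reads to hit the cache would therefore need to pass label lists in a consistent order.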