Butler API
A butler object gets passed to all application and algorithm functions to provide the following capabilities:
- Access to stored data
- Access to target information
- Make calls to other algorithm and application functions, either synchronously or asynchronously.
Note: (mostly for algorithm developers) The butler can store arbitrary variables using (for example) butler.algorithms.set(key='foo', value='bar') and get these variables with foo = butler.algorithms.get(key='foo').
Note: Butler can store NumPy ndarrays only by converting them to a list. This means that butler.algorithms.get(key='some-numpy-array') will return a list, not the original array.
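Because the stored value comes back as a plain list, algorithm code usually has to rebuild the array itself. A minimal sketch of that round trip, assuming NumPy is available; the variable names are illustrative, and the butler calls appear only as comments because they need a running experiment:

```python
import numpy as np

weights = np.array([[1.0, 2.0], [3.0, 4.0]])

# Convert to a nested list before storing, for example:
# butler.algorithms.set(key='weights', value=weights.tolist())
stored = weights.tolist()

# What butler.algorithms.get(key='weights') hands back is a list,
# so rebuild the array (and its dtype) explicitly before using it.
restored = np.asarray(stored, dtype=float)
```

Rebuilding with an explicit dtype avoids surprises when the list happens to contain only integers.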
Stored data is divided into collections, each of which stores dictionaries. Each dictionary has a unique id that can be used to access it, and certain default behaviour:
The experiment collection is where the information about the currently running experiment is stored (for example, the information passed in when the experiment is initialised).
Default behaviours:
- When MyApp.initExp is called, the value it returns is stored in this collection, in the args key.
The algorithms collection stores information about the various algorithms being tested in a given experiment (for instance, the type of each algorithm).
Default behaviours:
- When MyApp.initExp is called, it returns a dictionary describing the experiment, including information about the algorithms involved. If this dictionary is called exp_dict, then it is assumed that exp_dict.args.alg_list contains a list of dictionaries describing the algorithms. These are stored in the algorithms collection when MyApp.initExp returns.
The queries collection contains information about each query that has been sent out in response to a getQuery call.
Default behaviours:
- Every time MyApp.getQuery is called, its return value is stored into the queries collection under a unique ID called query_uid.
- Every time MyApp.processAnswer is called, its return value is a dictionary whose query_update key contains a dictionary whose entries are added to the query dictionary for the corresponding query_uid.
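The two default behaviours for queries can be pictured with plain dictionaries. This is only a sketch: the queries dict and both helper functions are hypothetical stand-ins rather than NEXT code, and the field names target_indices and target_winner are made up for illustration.

```python
# Hypothetical in-memory stand-in for the queries collection.
queries = {}


def store_query(query_uid, query):
    """Mimic the default behaviour after getQuery: store the returned query."""
    queries[query_uid] = dict(query)


def apply_answer(query_uid, answer_result):
    """Mimic the default behaviour after processAnswer: merge query_update."""
    queries[query_uid].update(answer_result.get('query_update', {}))


store_query('q1', {'target_indices': [3, 7]})
apply_answer('q1', {'query_update': {'target_winner': 7}})
```

After both calls, the entry for q1 holds the original query fields plus the entries from query_update.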
The participants collection stores information about participants, such as how many answers each participant has provided.
Default behaviours:
- If a getQuery call is made with a participant_uid that has not been seen before, an entry in this collection is created for that participant and they are assigned an algorithm that will be used to supply them with queries according to the experiment specification.
- Every time MyApp.processAnswer is called, the participant providing the answer has their num_reported_answers value incremented.
The other collection is for anything else that needs storing. It is not used in the default setup.
Each collection can be accessed from app and algorithm functions as butler.experiment, butler.queries, etc. They all have the following functions:
Note: by default the Butler does some formatting for the parameter uid. In the use cases I have seen, uid is never specified.
Note: the keys and values described below are key-value pairs, similar to Python's {key: value} dictionaries.
- get(uid, key, pattern): Get an object from the collection (possibly by pattern), or an entry (or entries) from an object in the collection.
  - key == None and pattern == None: return collection[uid]
  - key != None and pattern == None and type(key) != list: return collection[uid][key]
  - key != None and pattern == None and type(key) == list: return [collection[uid][k] for k in key]
  - pattern != None: return collection[uid] matching pattern
- set(uid, key, value): Set an object in the collection, or an entry in an object in the collection.
  - key == None: collection[uid] = value
  - key != None: collection[uid][key] = value
- exists(uid): Check if an object with the specified uid exists.
- increment(uid, key): Increment a value (or values) in the collection.
  - type(key) != list: increment collection[uid][key]
  - type(key) == list: increment collection[uid][k] for k in key
- append(uid, key, value): Append a value to collection[uid][key] (which is assumed to be a list).
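The rules above can be sketched with an in-memory class. InMemoryCollection is a hypothetical stand-in, not the real Butler collection: it assumes uid defaults to an empty string, that missing objects and counters are created on demand (counters starting at 0), and it leaves pattern matching out entirely.

```python
class InMemoryCollection:
    """Hypothetical stand-in that mirrors the rules listed above.

    Not the real Butler collection; pattern matching is left out.
    """

    def __init__(self):
        self._docs = {}  # uid -> dictionary

    def get(self, uid='', key=None):
        doc = self._docs[uid]
        if key is None:
            return doc
        if isinstance(key, list):
            return [doc[k] for k in key]
        return doc[key]

    def set(self, uid='', key=None, value=None):
        if key is None:
            self._docs[uid] = value
        else:
            self._docs.setdefault(uid, {})[key] = value

    def exists(self, uid=''):
        return uid in self._docs

    def increment(self, uid='', key=None):
        # Accept either a single key or a list of keys.
        keys = key if isinstance(key, list) else [key]
        doc = self._docs.setdefault(uid, {})
        for k in keys:
            doc[k] = doc.get(k, 0) + 1

    def append(self, uid='', key=None, value=None):
        self._docs.setdefault(uid, {}).setdefault(key, []).append(value)


algorithms = InMemoryCollection()
algorithms.set(key='foo', value='bar')
algorithms.increment(key=['n_answers', 'n_queries'])
algorithms.append(key='history', value=0.5)
```

Calling the methods with keyword arguments only, as in the usage lines, matches the note that uid is normally left unspecified.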
The butler has a reference to the target manager, called targets. Each application may use a custom target manager; the SimpleTargetManager has the following functions:
- set_targetset(exp_uid, targetset): Adds targetset to the targets database associated with the experiment with id exp_uid.
- get_target_item(exp_uid, target_id): Get the item with id target_id from the experiment's targets.
- get_targetset(exp_uid): Get the entire target set for the specified experiment.
- get_target_mapping(exp_uid): Get the whole list of targets, ordered by target_id.
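To make the four functions concrete, here is a rough in-memory sketch. InMemoryTargetManager is a hypothetical stand-in, not the SimpleTargetManager implementation; it assumes each target is a dictionary carrying a target_id entry, and the primary_description field is used purely for illustration.

```python
class InMemoryTargetManager:
    """Hypothetical stand-in with the four functions named above."""

    def __init__(self):
        self._targets = {}  # exp_uid -> list of target dictionaries

    def set_targetset(self, exp_uid, targetset):
        self._targets[exp_uid] = list(targetset)

    def get_targetset(self, exp_uid):
        return self._targets[exp_uid]

    def get_target_item(self, exp_uid, target_id):
        for target in self._targets[exp_uid]:
            if target['target_id'] == target_id:
                return target
        raise KeyError(target_id)

    def get_target_mapping(self, exp_uid):
        # Return the targets ordered by target_id.
        return sorted(self._targets[exp_uid], key=lambda t: t['target_id'])


targets = InMemoryTargetManager()
targets.set_targetset('exp1', [
    {'target_id': 1, 'primary_description': 'b.png'},
    {'target_id': 0, 'primary_description': 'a.png'},
])
```

With a real butler, the same calls would be made on butler.targets rather than on a locally constructed object.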
The butler also has a function for starting jobs: job(task_name, task_args, ignore_result=True, time_limit=0). This function will run the function named in task_name with arguments task_args. More precisely:
- task_name is the string name of the function to be called. If job is invoked from an application function, then task_name must be the name of another application function. If from an algorithm function, then task_name must be the name of another algorithm function.
- task_args is a string containing the JSON serialisation of a dictionary of arguments for the function named in task_name.
- ignore_result is a boolean specifying whether the result of the function call should be discarded. Setting ignore_result=False makes job return the result, which turns a non-blocking call into a blocking call.
- time_limit is an int specifying how many seconds the function is allowed to run for, if the call is blocking. If it does not complete within the time limit, the process running the function raises an exception and is terminated shortly thereafter, at whatever stage of execution it has reached.
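Since task_args is a JSON string rather than a dictionary, the arguments need to be serialised before the call. A small sketch using the standard json module; the function name update_model and its arguments are hypothetical, and the butler.job call is shown only as a comment because it needs a running experiment:

```python
import json

# Arguments for the (hypothetical) function 'update_model'.
args = {'n_iterations': 10, 'learning_rate': 0.1}

# task_args must be a JSON string, not the dictionary itself.
task_args = json.dumps(args)

# With a real butler this would be passed as, for example:
# butler.job('update_model', task_args, time_limit=30)

# Decoding the string gives back the original dictionary.
decoded = json.loads(task_args)
```

Everything in the dictionary has to be JSON-serialisable, so values such as NumPy arrays would need converting to lists first.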
The choice between blocking/non-blocking behaviour is simple:
- If you call butler.job from an application function, the call will be non-blocking unless ignore_result=False is specified.
- If you call butler.job from an algorithm function, the call will be blocking.
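The two rules can be written down as a small decision function. This is only a restatement of the rules for illustration; job_is_blocking and the labels 'application' and 'algorithm' are invented for this sketch and are not part of the Butler API.

```python
def job_is_blocking(called_from, ignore_result=True):
    """Return True if butler.job would block, per the two rules above.

    called_from is 'application' or 'algorithm' (labels invented here).
    """
    if called_from == 'algorithm':
        # Calls from algorithm functions always block.
        return True
    if called_from == 'application':
        # Calls from application functions block only when the result is wanted.
        return not ignore_result
    raise ValueError("called_from must be 'application' or 'algorithm'")
```

For example, job_is_blocking('application') is False with the default ignore_result=True, while job_is_blocking('application', ignore_result=False) is True.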
To write logs, use butler.log. For example, butler.log("LogName", "This is a log about something").
These logs are visible through [next-url]:8000/experiment/<string:exp_uid>/logs and [next-url]:8000/experiment/<string:exp_uid>/logs/<log_type>.
They should be used instead of writing logs to disk every time getQuery or processAnswer is called.