
Butler API

Liam Marshall edited this page Jul 17, 2017 · 2 revisions

A butler object gets passed to all application and algorithm functions to provide the following capabilities:

  • Access to stored data
  • Access to target information
  • Calls to other algorithm and application functions, either synchronous or asynchronous

Note: (mostly for algorithm developers) The butler can store arbitrary variables using (for example) butler.algorithms.set(key='foo', value='bar') and get these variables with foo = butler.algorithms.get(key='foo').

Note: Butler can store NumPy ndarrays only by converting them to a list. This means that butler.algorithms.get(key='some-numpy-array') will return a list, not the original array.
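The round trip described in this note can be sketched as follows, assuming NumPy is installed. The `stored` dictionary is a hypothetical stand-in for the butler's storage, used only so the example runs on its own; with a real butler the calls would be butler.algorithms.set and butler.algorithms.get as shown above.

```python
import numpy as np

# Hypothetical stand-in for butler storage; not part of NEXT.
stored = {}

weights = np.array([[1.0, 2.0], [3.0, 4.0]])

# Convert the array to a plain (nested) list before storing it.
stored['weights'] = weights.tolist()

# What comes back is a list, so rebuild the array explicitly.
restored = np.asarray(stored['weights'])
```

Rebuilding with np.asarray right after the get keeps the rest of the algorithm code working with arrays rather than nested lists.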

Access to stored data – Theory

Stored data is divided into collections, each of which stores dictionaries. Each dictionary has a unique id that can be used to access it. The collections, and the default behaviour attached to each, are as follows:

experiment

Where the information about the currently running experiment is stored (for example, the information passed in when the experiment is initialised)

Default behaviours:

  • When MyApp.initExp is called, the value it returns is stored in this collection, in the args key.

algorithms

Information about the various algorithms being tested in a given experiment (for instance, the type of each algorithm)

Default behaviours:

  • When MyApp.initExp is called, it returns a dictionary describing the experiment, including information about the algorithms involved. If this dictionary is called exp_dict, then exp_dict['args']['alg_list'] is assumed to contain a list of dictionaries describing the algorithms. These are stored in the algorithms collection when MyApp.initExp returns.
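As an illustration of the expected shape, here is a minimal sketch of such a dictionary. Only the args and alg_list nesting comes from the description above; the per-algorithm fields (alg_id, alg_label) and their values are assumptions made for the example.

```python
# Hypothetical return value of MyApp.initExp. The per-algorithm
# fields (alg_id, alg_label) are illustrative assumptions.
exp_dict = {
    'args': {
        'alg_list': [
            {'alg_id': 'RandomSampling', 'alg_label': 'random'},
            {'alg_id': 'UncertaintySampling', 'alg_label': 'uncertainty'},
        ],
    },
}

# The list of algorithm descriptions lives under args -> alg_list.
alg_list = exp_dict['args']['alg_list']
alg_labels = [alg['alg_label'] for alg in alg_list]
```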

queries

The queries collection contains information about each query that has been sent out in response to a getQuery call.

Default behaviours:

  • Every time MyApp.getQuery is called, its return value is stored into the queries collection under a unique ID called query_uid.
  • Every time MyApp.processAnswer is called, it returns a dictionary with a query_update key. The entries of the dictionary stored under that key are added to the query dictionary for the corresponding query_uid.
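The second behaviour amounts to a dictionary merge. A minimal sketch, using a plain dictionary as a hypothetical stand-in for the queries collection; the field names target_indices and target_winner and the id 'query-1' are assumptions chosen for the example.

```python
# Hypothetical in-memory stand-in for the queries collection,
# keyed by query_uid. Field names here are illustrative only.
queries = {
    'query-1': {'target_indices': [3, 7]},  # as stored after getQuery
}

# Hypothetical return value of MyApp.processAnswer; only the
# query_update key comes from the description above.
process_answer_result = {
    'query_update': {'target_winner': 7},
}

# The entries under query_update are added to the stored query.
queries['query-1'].update(process_answer_result['query_update'])
```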

participants

Information about participants, such as how many answers each participant has provided.

Default behaviours:

  • If a getQuery call is made with a participant_uid that has not been seen before, an entry in this collection is created for that participant and they are assigned an algorithm that will be used to supply them with queries according to the experiment specification.
  • Every time MyApp.processAnswer is called, the participant providing the answer has their num_reported_answers value incremented.

other

This collection is for anything else that needs storing. It is not used in the default setup.

Access to stored data – Practice

Each collection can be accessed from app and algorithm functions as butler.experiment, butler.queries, etc. They all have the following functions:

Note: by default the Butler does some formatting for the parameter uid. In the use cases I have seen, uid is never specified.

Note: the keys and values described below are key-value pairs, similar to Python's {key: value} dictionaries.

  • get(uid, key, pattern) Get an object from the collection (possibly by pattern), or an entry (or entries) from an object in the collection.
    • key == None and pattern == None: return collection[uid]
    • key != None and pattern == None and type(key) != list: return collection[uid][key]
    • key != None and pattern == None and type(key) == list: return [collection[uid][k] for k in key]
    • pattern != None: return collection[uid] matching pattern
  • set(uid, key, value) Set an object in the collection, or an entry in an object in the collection.
    • key == None: collection[uid] = value
    • key != None: collection[uid][key] = value
  • exists(uid) Check if an object with the specified uid exists
  • increment(uid, key) Increment a value (or values) in the collection.
    • type(key) != list: increment collection[uid][key]
    • type(key) == list: increment collection[uid][k] for k in key
  • append(uid, key, value) Append a value to collection[uid][key] (which is assumed to be a list)
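The rules above can be sketched with an in-memory stand-in. InMemoryCollection is a hypothetical class written for this page, not NEXT's implementation. It assumes an empty-string default for uid, starts missing counters at 0, and leaves out pattern lookups because their matching rules are not described here.

```python
class InMemoryCollection:
    """Hypothetical stand-in mirroring the rules listed above; not NEXT code."""

    def __init__(self):
        self._docs = {}  # uid -> dictionary

    def get(self, uid='', key=None, pattern=None):
        if pattern is not None:
            # Pattern semantics are not specified above, so they are omitted.
            raise NotImplementedError('pattern lookup is not sketched here')
        doc = self._docs[uid]
        if key is None:
            return doc
        if isinstance(key, list):
            return [doc[k] for k in key]
        return doc[key]

    def set(self, uid='', key=None, value=None):
        if key is None:
            self._docs[uid] = value
        else:
            self._docs.setdefault(uid, {})[key] = value

    def exists(self, uid=''):
        return uid in self._docs

    def increment(self, uid='', key=None):
        keys = key if isinstance(key, list) else [key]
        doc = self._docs.setdefault(uid, {})
        for k in keys:
            # Assumption: a missing counter starts at 0.
            doc[k] = doc.get(k, 0) + 1

    def append(self, uid='', key=None, value=None):
        self._docs.setdefault(uid, {}).setdefault(key, []).append(value)


# Usage, with uid left unspecified as in the notes above.
algs = InMemoryCollection()
algs.set(key='foo', value='bar')
algs.set(key='n', value=0)
algs.increment(key='n')
algs.append(key='history', value=1)
algs.append(key='history', value=2)

foo = algs.get(key='foo')           # single key
pair = algs.get(key=['foo', 'n'])   # list of keys
```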

Access to target information

The butler has a reference to the target manager, called targets. Each application may use a custom target manager. SimpleTargetManager, for example, has the following functions:

  • set_targetset(exp_uid, targetset): Adds targetset to the targets database associated with the experiment with id exp_uid.
  • get_target_item(exp_uid, target_id): Get a particular item from the experiment's targets with id target_id.
  • get_targetset(exp_uid): Get the entire target set for the specified experiment.
  • get_target_mapping(exp_uid): Get the whole list of targets, ordered by target_id
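To make the four functions concrete, here is a rough in-memory sketch. InMemoryTargetManager is a hypothetical stand-in, not the real SimpleTargetManager. It assumes each target is a dictionary carrying a target_id key; the primary_description field is likewise an assumption for the example.

```python
class InMemoryTargetManager:
    """Hypothetical stand-in with the four functions listed above; not NEXT code."""

    def __init__(self):
        self._targets = {}  # exp_uid -> list of target dictionaries

    def set_targetset(self, exp_uid, targetset):
        self._targets.setdefault(exp_uid, []).extend(targetset)

    def get_targetset(self, exp_uid):
        return self._targets[exp_uid]

    def get_target_item(self, exp_uid, target_id):
        # Assumption: every target dictionary has a 'target_id' key.
        for target in self._targets[exp_uid]:
            if target['target_id'] == target_id:
                return target
        raise KeyError(target_id)

    def get_target_mapping(self, exp_uid):
        return sorted(self._targets[exp_uid], key=lambda t: t['target_id'])


targets = InMemoryTargetManager()
targets.set_targetset('exp-1', [
    {'target_id': 1, 'primary_description': 'b'},
    {'target_id': 0, 'primary_description': 'a'},
])
mapping = targets.get_target_mapping('exp-1')
item = targets.get_target_item('exp-1', 1)
```

Inside an application or algorithm function, the equivalent calls would go through butler.targets.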

Make sync/async calls

The butler also has a function for starting jobs: job(task_name, task_args, ignore_result=True, time_limit=0). This function will run the function named in task_name with arguments task_args. More precisely:

  • task_name is the string name of the function to be called. If job is invoked from an application function, then task_name must be the name of another application function. If from an algorithm function, then task_name must be the name of another algorithm function.
  • task_args is a string which is the JSON-serialisation of a dictionary containing the arguments to the function named in task_name
  • ignore_result is a boolean specifying whether the result of the function call should be discarded. Setting ignore_result=False makes job wait for the result and return it, which turns a non-blocking call into a blocking one.
  • time_limit is an int specifying how many seconds the function should be allowed to run for, if the call is blocking. If it does not complete in the specified time limit, then the process running the function will raise an exception, and shortly thereafter be terminated at whatever stage of execution it happened to be in.

The choice between blocking/non-blocking behaviour is simple:

  • If you call butler.job from an application function, the call will be non-blocking unless ignore_result=False is specified.
  • If you call butler.job from an algorithm function, the call will be blocking.
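A minimal sketch of a call follows, mainly to show that task_args is a JSON string rather than a dictionary. RecordingButler is a hypothetical stand-in that only records what it was asked to run, and the task name update_model and its argument are assumptions for the example.

```python
import json


class RecordingButler:
    """Hypothetical stand-in that records job calls; not NEXT code."""

    def __init__(self):
        self.jobs = []

    def job(self, task_name, task_args, ignore_result=True, time_limit=0):
        self.jobs.append((task_name, task_args, ignore_result, time_limit))


butler = RecordingButler()

# task_args must be the JSON serialisation of the argument dictionary.
# 'update_model' and 'n_iterations' are hypothetical names.
task_args = json.dumps({'n_iterations': 10})

# From an application function, ignore_result=False requests a blocking call.
butler.job('update_model', task_args, ignore_result=False, time_limit=30)
```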

Logging

Use butler.log. For example, butler.log("LogName", "This is a log about something").

These are visible through [next-url]:8000/experiment/<string:exp_uid>/logs and [next-url]:8000/experiment/<string:exp_uid>/logs/<log_type>.

Use these logs instead of writing to disk every time getQuery or processAnswer is called.
