
Basic Algorithms

Scott Sievert edited this page Feb 17, 2017 · 1 revision

Basics

Recall that an algorithm consists of 4 basic functions:

  • initExp: Receive the initial parameters with which to run an experiment
  • getQuery: Get the next question to ask
  • processAnswer: Receive an answer to a question
  • getModel: Provide its guess for the answer using the information it has received thus far

When we're writing an algorithm,

  • getQuery will be the function in which we have the code to cleverly select the next query to ask about (e.g. to maximise the amount of information acquired from the query).
  • processAnswer will be the function that is called when an answer comes back from the user. In this function, we will store the answer and may possibly perform some computation to help improve our guesses about future classifications.
  • getModel will return the classifier (appropriately specified) as best we can compute it from the data obtained thus far
  • initExp will be called at the start of the experiment.

If a script exists on your own machine that simulates a user responding, it may very roughly look like

# Arguments to all functions are very rough; not exact at all
# init the algorithm. Set/compute all variables
initExp()

# the user will answer 100 questions
for n in range(100):
    query = getQuery()  # ask the user a question. can use all previous queries.
    reward = random_reward(query)  # this is handled by the "app" (e.g., DuelingBandits)
    processAnswer(reward)  # maybe update variables, etc

# view the results (in NEXT, this is on the dashboard)
getModel()

Converting simulation code like this into NEXT code is the usual path, so write the simulation first.
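As a concrete version of the rough script above, here is a minimal sketch that runs entirely on your own machine with no NEXT dependencies. Everything in it is an assumption made for illustration: the ToyAlg class, its "fewest answers so far" query rule, the random_reward helper and the TRUE_MEANS values are hypothetical, and state is kept on the object instead of in a butler.

```python
import random


class ToyAlg(object):
    """Hypothetical stand-in algorithm; state lives on the object, not in a butler."""

    def initExp(self, n):
        self.n = n
        self.last_query = None
        self.pulls = [0] * n
        self.reward_sums = [0.0] * n
        return True

    def getQuery(self):
        # Naive rule: ask about the target with the fewest answers so far.
        self.last_query = self.pulls.index(min(self.pulls))
        return self.last_query

    def processAnswer(self, reward):
        # Record the answer against the target we last asked about.
        self.pulls[self.last_query] += 1
        self.reward_sums[self.last_query] += reward
        return True

    def getModel(self):
        means = [s / p if p else 0.0
                 for s, p in zip(self.reward_sums, self.pulls)]
        return {'means': means}


TRUE_MEANS = [0.2, 0.5, 0.8]  # made-up probabilities for the simulated user


def random_reward(query, rng):
    # Simulated user: answers 1 with probability TRUE_MEANS[query], else 0.
    return 1 if rng.random() < TRUE_MEANS[query] else 0


def simulate(num_questions=100, seed=0):
    rng = random.Random(seed)
    alg = ToyAlg()
    alg.initExp(n=len(TRUE_MEANS))
    for _ in range(num_questions):
        query = alg.getQuery()
        alg.processAnswer(random_reward(query, rng))
    return alg, alg.getModel()


alg, model = simulate()
print(model['means'])
```

Once a loop like this behaves sensibly, the four methods are what get moved into NEXT, with the object attributes replaced by values stored through the butler.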

Debugging note

You can use utils.debug_print to print while docker_login'd. It prints in yellow by default, and optionally in any other color.
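When running simulation code outside of Docker, utils.debug_print isn't available. The sketch below is a hypothetical local stand-in, not NEXT's implementation: the name local_debug_print, the color keyword and the small ANSI table are all assumptions chosen to mirror the "yellow by default, optionally another color" behavior described above.

```python
# Standard ANSI escape codes; only a handful of colors for this sketch.
ANSI = {
    'yellow': '\033[93m',
    'red': '\033[91m',
    'green': '\033[92m',
    'reset': '\033[0m',
}


def local_debug_print(*args, **kwargs):
    """Hypothetical stand-in: print the arguments in color, yellow by default."""
    color = kwargs.get('color', 'yellow')
    text = ' '.join(str(a) for a in args)
    line = ANSI[color] + text + ANSI['reset']
    print(line)
    return line  # returned only so the output is easy to inspect


message = local_debug_print('n =', 5)
local_debug_print('something failed', color='red')
```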

Arguments

Note: use the other algorithms inside your app as examples!

The algorithm code is as straightforward as possible, and can simply represent the core mathematical functions that are being tested or compared. This means that:

  • algorithms only ever see target indices, not targets (i.e., algorithms receive that user i pulled arm j with reward k, not that user Apq56avba thought "shoe43.png" was "80% chance of purchase"). This depends on the app developer, but it is the case for the default apps.
  • the inputs and outputs are specified exactly in Algs.yaml (but optional parameters can exist). This means that you only need to make some mapping from inputs to outputs -- NEXT doesn't care how you do it.
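To make the "indices, not targets" point concrete, here is a small sketch. The IndexOnlyAlg class, its method signatures and the targets list are hypothetical; the real translation layer lives in the app and is not shown here.

```python
class IndexOnlyAlg(object):
    """Hypothetical algorithm: it only ever sees integer indices and numeric rewards."""

    def __init__(self, n):
        self.n = n
        self.next_index = 0
        self.seen = []

    def getQuery(self):
        # Cycle through the indices 0..n-1; no knowledge of what they refer to.
        index = self.next_index
        self.next_index = (self.next_index + 1) % self.n
        return index

    def processAnswer(self, target_index, reward):
        self.seen.append((target_index, reward))
        return True


# The app-side layer (hypothetical here) owns the real targets and translates.
targets = ['shoe43.png', 'shoe07.png', 'boot12.png']
alg = IndexOnlyAlg(n=len(targets))

index = alg.getQuery()
shown_to_user = targets[index]  # the user sees the file; the algorithm never does
alg.processAnswer(target_index=index, reward=1)
```

Because the algorithm never touches file names or user IDs, the same code can be reused with any set of targets of the same size.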

Now, there does have to be a layer of code to interface between the relatively unconstrained algorithm code and the NEXT system, but that is discussed in more detail on Framework-Apps.

Initializing with algorithm specific parameters

In the example launch script (e.g., next/examples/cartoon_dueling), in alg_list I have the following code (not exact; some names changed):

if alg_id == 'ValidationSampling':
    alg_item['params'] = {'num_tries': 15}

In myAlg.initExp, I can call butler.algorithms.get() to get the full list of parameters I initialized with in the script; i.e., this works: params = butler.algorithms.get()['params'].
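The sketch below shows that lookup end to end without a running NEXT instance. FakeButler and FakeAlgorithmStore are hypothetical stand-ins, and it is an assumption of this sketch that the stored values contain a 'params' entry exactly as set in the launch script.

```python
class FakeAlgorithmStore(object):
    """Hypothetical stand-in for butler.algorithms, backed by a plain dict."""

    def __init__(self, initial):
        self._data = dict(initial)

    def get(self, key=None):
        # Assumption taken from the text above: get() with no key returns everything.
        if key is None:
            return dict(self._data)
        return self._data[key]


class FakeButler(object):
    def __init__(self, initial):
        self.algorithms = FakeAlgorithmStore(initial)


class MyAlg(object):
    def initExp(self, butler, n, params=None):
        # Read back what the launch script put in alg_item['params'].
        stored = butler.algorithms.get()['params']
        print('num_tries is', stored['num_tries'])
        return True


butler = FakeButler({'params': {'num_tries': 15}})
ok = MyAlg().initExp(butler, n=10)
params = butler.algorithms.get()['params']
```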

Extra parameters not specified in Algs.yaml

The argument checking we use (pijemont) supports many types (dict, str, num, etc). It also supports the types "any", "anything" and "stuff", which all mean an arbitrary type. When init'ing the experiment, you can add some parameter of type "anything" to Algs.yaml and then include it when launching the experiment in an initExp['args']['alg_list'] item.

Of course, some parameters are never meant to be changed by the user of your algorithm, only by you, the algorithm developer. It's up to you how you want to handle this; more defaults in the appropriate functions might be a good call (globals probably aren't).
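One way to keep developer-only defaults out of globals is a small helper that merges whatever the launch script supplied over defaults defined inside the function. This is only a sketch: resolve_params and the step_size default are hypothetical names, not part of NEXT.

```python
def resolve_params(params=None):
    """Merge user-supplied params over developer defaults (all names hypothetical)."""
    # Defaults live inside the function rather than in module-level globals.
    defaults = {'num_tries': 15, 'step_size': 0.1}
    resolved = dict(defaults)
    resolved.update(params or {})  # user values win; None means "no overrides"
    return resolved


untouched = resolve_params()
overridden = resolve_params({'num_tries': 3})
```

Calling this at the top of initExp means the rest of the algorithm can rely on every key being present, whether or not the experiment supplied it.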

Exact arguments

Every algorithm takes in

  1. A butler object. Use butler.algorithms.set(key='foo', value='bar') and butler.algorithms.get(key='foo') to set and get values; a fuller description can be found on Butler-API.
  2. arguments specified in Algs.yaml as keyword arguments.
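For testing an algorithm offline, an in-memory object with the same set/get calls is often enough. The sketch below is a hypothetical stand-in, not the real butler: it mimics only the calls used on this page, including that a list of keys returns a dictionary, and it assumes missing keys come back as None.

```python
class InMemoryAlgorithms(object):
    """Hypothetical offline stand-in for butler.algorithms (not the real Butler)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, key=None):
        if key is None:
            return dict(self._store)
        if isinstance(key, list):
            # A list of keys yields a dictionary, one entry per key.
            return {k: self._store.get(k) for k in key}
        return self._store.get(key)  # assumption: missing keys give None


class InMemoryButler(object):
    def __init__(self):
        self.algorithms = InMemoryAlgorithms()


butler = InMemoryButler()
butler.algorithms.set(key='foo', value='bar')
butler.algorithms.set(key='means', value=[0.1, 0.9])

single = butler.algorithms.get(key='foo')
several = butler.algorithms.get(key=['foo', 'means'])
```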

Example

Let's see a quick example (we'll do a more in-depth one later on this wiki page).

initExp:
  args:
    n:
      description: Number of items the user has to choose from.
      type: num
    params:
      description: Possibly algorithm specific parameters.
      type: any
      optional: true
  returns:
    type: bool
    description: A boolean indicating success.
    values: true

getQuery:
  returns:
    type: num
    description: The index of the target we're about to ask about
    
processAnswer:
  args:
    reward:
      type: num
      description: How much reward the user gave the target (0 or 1)
  returns:
    type: bool
    description: A boolean indicating success.
    values: true

getModel:
  returns:
    type: dict
    values:
      means:
        description: The mean score of each target
        type: list
        values:
          type: num

The corresponding code would look something like the following. The code snippet is meant to highlight

  • how arguments and the YAML work together.
  • use of the butler. I use butler.algorithms.set and butler.algorithms.get freely and with a list of keys.

class FooAlg:
    def initExp(self, butler, n, params=None):
        # `params` is optional in Algs.yaml, so it gets a default here.
        # setup variables needed later. I can set any value with this.
        butler.algorithms.set(key='ask_about_this_target', value=42)
        return True

    def getQuery(self, butler):
        # either approach of defining args is acceptable (Python 2.7 allows either)
        # ...
        return butler.algorithms.get(key='ask_about_this_target')

    def processAnswer(self, butler, reward):
        # I may update variables stored in the butler here
        # ...
        return True

    def getModel(self, butler):
        # when type(key) == list, butler.algorithms.get returns a dictionary
        dict_ = butler.algorithms.get(key=['means', 'precision'])
        # Algs.yaml declares the return value as a dict with a 'means' entry
        return {'means': dict_['means']}