
DevOverview

Scott Sievert edited this page Feb 17, 2017 · 1 revision

NEXT Overview

NEXT is a framework designed for running active learning experiments -- that is, machine learning experiments that cleverly decide what question to ask next. NEXT can also be used to run passive experiments.

NEXT lets you define your interface however you want. It has been applied to bandit problems and triplet problems, and it makes no assumptions about what type of algorithm you're using.

NEXT Architecture

Applications: The fundamental notion in NEXT is an application, which corresponds to a particular machine learning task such as classification or ((TODO: another example)).

Algorithms: The particular mathematical computations that an application uses to choose the next question to ask are contained in that application's various algorithms. For example, an implementation of active SVM could be an algorithm for the NEXT classification application.
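To make the idea of "choosing the next question" concrete, here is a minimal sketch that contrasts passive (random) selection with one common active heuristic, uncertainty sampling. It is not NEXT's algorithm interface: the function names, the `scores` dictionary, and the item labels are all hypothetical and exist only for illustration.

```python
import random


def choose_passive(unlabeled, rng):
    """Passive selection: pick any unlabeled item uniformly at random."""
    return rng.choice(sorted(unlabeled))


def choose_active(unlabeled, predict_proba):
    """Active selection (uncertainty sampling): pick the item whose
    predicted probability of 'yes' is closest to 0.5."""
    return min(sorted(unlabeled), key=lambda item: abs(predict_proba(item) - 0.5))


# Hypothetical model confidences for four unlabeled items.
scores = {"a": 0.95, "b": 0.52, "c": 0.10, "d": 0.70}
unlabeled = set(scores)

active_pick = choose_active(unlabeled, scores.get)
passive_pick = choose_passive(unlabeled, random.Random(0))

print(active_pick)  # b (the item the model is least sure about)
```

The active rule asks about the item whose answer is expected to be most informative, while the passive rule ignores the current model entirely.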

Experiments: Once NEXT is running, you can start an experiment, which is an instance of a particular application that serves queries and receives and processes the responses to those queries. For example, one running experiment might serve pages that look like this:

The five functions: To this end, every application supports five functions:

  • initExp: This function starts a new experiment using supplied initial parameters (e.g. the objects we are classifying and the list of possible classes)

  • getQuery: This function returns the next question that the user should answer

  • processAnswer: This function is called when the user answers a question

  • getModel: This function returns the current best guess for the final answer (whatever that means for a given application) based on the information received so far, e.g. the mathematical representation of the classifier learned up to that point.

  • getStats: This function is used to access any information that was stored during the course of the experiment, such as timing information (to help analyze whether the system is being overloaded with too many queries), the current state of the predictive model generated from the experiment (e.g. the current guess for the classification of all the objects), as well as other information specific to the application.
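The five functions can be pictured as one small interface. The sketch below is a toy, in-memory stand-in and not NEXT's actual base class: the class names, the method signatures (for example, `processAnswer(query, answer)`), and the `"targets"` parameter key are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Application(ABC):
    """Hypothetical base class naming the five functions."""

    @abstractmethod
    def initExp(self, params): ...

    @abstractmethod
    def getQuery(self): ...

    @abstractmethod
    def processAnswer(self, query, answer): ...

    @abstractmethod
    def getModel(self): ...

    @abstractmethod
    def getStats(self): ...


class ToyBinaryClassification(Application):
    """Toy application: asks about the least-answered object and
    labels each object by majority vote."""

    def initExp(self, params):
        self.targets = list(params["targets"])
        self.votes = {t: Counter() for t in self.targets}
        self.num_answers = 0

    def getQuery(self):
        # Ask about the object with the fewest answers (ties: list order).
        return min(self.targets, key=lambda t: sum(self.votes[t].values()))

    def processAnswer(self, query, answer):
        self.votes[query][answer] += 1
        self.num_answers += 1

    def getModel(self):
        # Majority label per object, or None if nobody has answered yet.
        return {t: (c.most_common(1)[0][0] if c else None)
                for t, c in self.votes.items()}

    def getStats(self):
        return {"num_answers": self.num_answers}


app = ToyBinaryClassification()
app.initExp({"targets": ["img1", "img2"]})
first = app.getQuery()
app.processAnswer(first, "yes")
second = app.getQuery()
app.processAnswer(second, "no")
model = app.getModel()
stats = app.getStats()
```

A real application would persist its state and delegate query selection to one of its algorithms; this sketch keeps everything in one object to show how the five calls relate.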

Typically, the running of an experiment will go:

  • Experimenter calls initExp with the appropriate parameters to kick off their experiment.

  • An experiment participant makes a getQuery call (usually by going to a webpage) which gives them a question.

  • They answer the question (usually by clicking something on the webpage) which makes a processAnswer call to send back the answer to NEXT, and then makes another getQuery call.

  • This is repeated across several users (possibly in parallel).

  • When the experiment is over, the experimenter calls getModel to see the result (e.g. the classifier that has been learned from the queries). They may also call getStats, usually by going to a "dashboard" webpage (described elsewhere), which displays statistics about the experiment, including timing information, how many queries were answered, and, for example, how the classifier is classifying all the objects in the experiment.
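The sequence above can be simulated end to end in a few lines. This is a sketch under stated assumptions: `ToyRatingApp`, `run_experiment`, `noisy_rating`, and the `true_quality` data are hypothetical, participants are simulated one after another rather than in parallel, and no web pages or NEXT API calls are involved.

```python
import random
import time


class ToyRatingApp:
    """Hypothetical in-memory application: participants rate objects 1-10."""

    def initExp(self, params):
        self.targets = list(params["targets"])
        self.ratings = {t: [] for t in self.targets}
        self.timings = []

    def getQuery(self):
        # Serve the object with the fewest ratings so far.
        return min(self.targets, key=lambda t: len(self.ratings[t]))

    def processAnswer(self, query, answer):
        start = time.perf_counter()
        self.ratings[query].append(answer)
        self.timings.append(time.perf_counter() - start)

    def getModel(self):
        # Current best guess: the mean rating of each rated object.
        return {t: sum(r) / len(r) for t, r in self.ratings.items() if r}

    def getStats(self):
        return {"num_answers": len(self.timings),
                "max_seconds": max(self.timings, default=0.0)}


def run_experiment(app, targets, num_participants, queries_each, answer_fn):
    app.initExp({"targets": targets})                # experimenter starts the experiment
    for participant in range(num_participants):      # repeated across several users
        for _ in range(queries_each):
            query = app.getQuery()                   # participant receives a question
            answer = answer_fn(participant, query)   # participant answers it
            app.processAnswer(query, answer)         # answer is sent back
    return app.getModel(), app.getStats()            # experimenter inspects the results


true_quality = {"apple": 8, "pear": 5, "fig": 2}     # made-up ground truth
rng = random.Random(0)


def noisy_rating(participant, query):
    # Simulated participant: true quality plus or minus one, clamped to 1-10.
    return min(10, max(1, true_quality[query] + rng.choice([-1, 0, 1])))


model, stats = run_experiment(ToyRatingApp(), list(true_quality), 4, 3, noisy_rating)
```

With four simulated participants answering three queries each, the toy application collects twelve answers, and `getModel` returns a mean rating per object.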

API: Once NEXT is running on a machine, you can use the NEXT API to call its various functions. These include the five functions above for any given application, as well as other functions for checking system status, listing the currently running experiments, retrieving the raw data collected from any given experiment, and more. These are documented in the Interface.

NEXT system overview

In the top-level NEXT directory, you will find five folders:

  • /ec2: This folder contains scripts that will take your Amazon AWS credentials and start a machine on EC2 running NEXT. You can learn how to use these in AMI-launch and EC2-launch.

  • /examples: This folder contains scripts that will take in the hostname of a machine running NEXT (such as a machine started with the scripts mentioned above). We provide the basics of launching in Launch-Basics.

  • /apps: This folder contains all of the applications built into NEXT. If you want to add an application or an algorithm, it goes in a new directory under this folder. Information about how to develop applications is found in Framework-Apps.

  • /next: NEXT internals live here. For example, the NEXT API is defined in /next/api. To learn about the API functions in NEXT, see the API page.

  • /local: This folder contains some experimental scripts for launching NEXT on a non-EC2 machine (with the right permissions, dependencies installed, etc.)

Built-in NEXT applications

As described above, the /apps directory contains all the available applications built into NEXT. These are ((TODO: update descriptions))

  • PoolBasedTripletMDS: "Here is an object (image, text, whatever); which of these other two objects is more similar to it?"

  • PoolBasedBinaryClassification: "Here is an object and a description. Choose 'yes' or 'no' according to whether the object satisfies the description." (For example: "Here are a bunch of pictures; say whether each one has a cat in it.")

  • CardinalBanditsPureExploration: "Here is an object; rate it on a scale of (e.g.) 1-10"

  • DuelingBanditsPureExploration: "Here are two objects; pick one of them (according to some specified rule)."
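As a rough picture of what two of these questions contain, the sketch below builds a triplet query and a dueling query by random (passive) selection from a pool. The dictionary keys (`center`, `left`, `right`), the function names, and the file names are hypothetical and are not NEXT's query format.

```python
import random


def triplet_query(pool, rng):
    """Triplet-style question: one reference object and two alternatives
    to compare against it."""
    center, left, right = rng.sample(pool, 3)
    return {"center": center, "left": left, "right": right}


def dueling_query(pool, rng):
    """Dueling-style question: two distinct objects to choose between."""
    left, right = rng.sample(pool, 2)
    return {"left": left, "right": right}


pool = ["cat.jpg", "dog.jpg", "fox.jpg", "owl.jpg"]  # made-up object names
rng = random.Random(0)

triplet = triplet_query(pool, rng)
duel = dueling_query(pool, rng)
```

An active algorithm would replace the random sampling with a rule that uses the answers collected so far.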

For example, the PoolBasedBinaryClassification application is contained in /apps/PoolBasedBinaryClassification/. The algorithms for this application are contained in /apps/PoolBasedBinaryClassification/algs/.
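The directory convention just described can be written down as a small helper. This is only a sketch of the naming pattern stated above; the helper names `app_dir` and `algs_dir` are hypothetical, and the code does not touch the file system.

```python
from pathlib import PurePosixPath

# Path of the apps folder as written in this overview.
APPS_ROOT = PurePosixPath("/apps")


def app_dir(app_name):
    """Directory that holds one application."""
    return APPS_ROOT / app_name


def algs_dir(app_name):
    """Directory that holds that application's algorithms."""
    return app_dir(app_name) / "algs"


print(algs_dir("PoolBasedBinaryClassification"))
# /apps/PoolBasedBinaryClassification/algs
```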

  • If you want to write a new algorithm for this or another existing application, see Basic-Algorithms. For more detail, see Framework-Algs.
  • If you want to create a brand new application or heavily modify an existing one, see New-Application. For more detail, see Framework-Apps.