Scott Sievert edited this page Jan 16, 2017 · 15 revisions

How do I get many participants for some study?

We have used Mechanical Turk. We set up a machine (and obtain its URL), then direct participants to that URL. There they answer 50 questions without interacting with MTurk. At the end, we ask them to copy and paste their User ID (shown by default at the end of the study) back into MTurk. We use this ID to verify that they responded.
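
The verification step can be sketched as a comparison between the IDs workers pasted into MTurk and the IDs recorded for the experiment. This is a minimal sketch: the function name `check_submissions` and both inputs are hypothetical, and how the recorded IDs are exported from NEXT is not covered here.

```python
def check_submissions(submitted_ids, recorded_ids):
    """Split IDs pasted into MTurk into (verified, unverified) lists.

    Both arguments are hypothetical inputs: `submitted_ids` are the strings
    workers pasted into MTurk, `recorded_ids` are the IDs seen in the study.
    """
    recorded = {pid.strip() for pid in recorded_ids}
    verified, unverified = [], []
    for raw in submitted_ids:
        pid = raw.strip()  # pasted IDs often carry stray whitespace
        if pid in recorded:
            verified.append(pid)
        else:
            unverified.append(pid)
    return verified, unverified
```

Unverified IDs are worth checking by hand before rejecting a submission, since a worker may have pasted only part of the ID.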

How do I specify the participant ID?

Instead of going to [next-url]:8000/.../[exp-uid], go to [next-url]:8000/.../[exp-uid]?participant=[id].

e.g., instead of http://localhost:8000/query/query_page/query_page/368de69569286ce0ba8a3f40b58a2a go to http://localhost:8000/query/query_page/query_page/368de69569286ce0ba8a3f40b58a2a?participant=scott
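
When generating one link per participant, the URL can be built programmatically. This is a minimal Python 3 sketch; the helper name `participant_url` is hypothetical, and it assumes the query URL does not already contain a query string.

```python
from urllib.parse import quote


def participant_url(query_url, participant_id):
    """Append ?participant=<id> to an experiment query URL."""
    # quote() percent-encodes characters (spaces, '&', '/') that would
    # otherwise break the query string.
    return query_url + '?participant=' + quote(str(participant_id), safe='')


base = 'http://localhost:8000/query/query_page/query_page/368de69569286ce0ba8a3f40b58a2a'
print(participant_url(base, 'scott'))
```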

How do I access all the targets?

targets = butler.targets.get_targetset(butler.exp_uid)

How do I restart a machine I stopped on EC2?

  1. On EC2, restart the machine via Actions > Instance State > Start
  2. docker_login to your machine using the next_ec2.py script
  3. Run export NEXT_BACKEND_GLOBAL_HOST=ec2-...amazonaws.com
  4. Run docker-compose up.

How do I time every line in a function?

You add the lines

from decorator import decorator
from line_profiler import LineProfiler

@decorator
def profile_each_line(func, *args, **kwargs):
    profiler = LineProfiler()
    profiled_func = profiler(func)
    retval = None
    try:
        retval = profiled_func(*args, **kwargs)
    finally:
        profiler.print_stats()
    return retval

somewhere alongside your other imports at the beginning of the file. You can then use @profile_each_line as a decorator on any function you wish to profile. It prints statistics on how long each line takes to run and what percentage of the total runtime each line accounts for.

How do I include feature vectors with targets?

There are three options:

  1. NEXT accepts a list of dictionaries as targets. These dictionaries get stored in the butler and are accessible. This requires launching the experiment yourself (and writing any necessary scripts).
  2. Enforcing that feature vectors be passed in to your app in initExp can be done in the YAML. This requires developing your own app.
  3. We have also developed a feature that attaches feature vectors to images in the examples in examples/. The example below illustrates adding feature vectors to an existing application, the primary empirical use case we have seen.

The third option in detail:

The advantage of this approach is that your algorithm runs inside an existing application/framework, so you can easily compare it with other algorithms. Whether a new algorithm pays attention to feature vectors is up to the developer of that algorithm.

We need to modify the file that launches the experiment on NEXT (e.g., examples/strangefruit/experiment_triplet.py). If that file includes a target_features key in the experiment dictionary, examples/launch_experiment.py adds the feature vectors to the targets (note: only for images ending in .png or .jpg).

The dictionary we add maps each filename to its feature vector, i.e., it has the form {filename: feature_vector}. For example:

import zipfile

import numpy as np

# `experiment` is the experiment dictionary already defined in the launch script
experiment['primary_type'] = 'image'
target_zip = 'strangefruit30.zip'
experiment['primary_target_file'] = target_zip
experiment['target_features'] = {filename.split('/')[-1]: np.random.rand(2).tolist()  # tolist() because numpy arrays are not serializable
                                 for filename in zipfile.ZipFile(target_zip).namelist()}
# filename.split above removes 'strangefruit/' from 'strangefruit/image.png'. Required for
# use of launch_experiment.py (which the examples in next/examples use)

Note: This only covers putting features into targets. It does not cover how to load feature vectors (although I would use np.load or scipy.io.loadmat).
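
Once the vectors are loaded, they still have to be keyed by image basename and converted to plain lists. One way to do that is sketched here, using only the standard library. The helper name `build_target_features` and the `vectors` input (a mapping from basename to any sequence of numbers) are hypothetical. The extension filter mirrors the .png/.jpg restriction mentioned above.

```python
import posixpath
import zipfile

IMAGE_EXTENSIONS = ('.png', '.jpg')  # the extensions mentioned in this FAQ


def build_target_features(target_zip, vectors):
    """Return {basename: feature vector as a list} for each image in the zip.

    `vectors` is a hypothetical mapping from image basename to any sequence
    of numbers, e.g. rows of an array you loaded yourself.
    """
    features = {}
    with zipfile.ZipFile(target_zip) as archive:
        for name in archive.namelist():
            if not name.endswith(IMAGE_EXTENSIONS):
                continue  # skip directory entries and non-image files
            basename = posixpath.basename(name)  # 'strangefruit/a.png' -> 'a.png'
            if basename not in vectors:
                raise KeyError('no feature vector for ' + basename)
            # Plain Python floats keep the dictionary serializable.
            features[basename] = [float(x) for x in vectors[basename]]
    return features
```

The result can be assigned to experiment['target_features']. Raising on a missing vector is a design choice in this sketch: it surfaces a mismatch between the zip file and the feature file before the experiment launches.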

Then to access these in myAlg.py, in initExp we include these lines:

import numpy as np

class myAlg:
    def initExp(self, butler, ...):
        targets = butler.targets.get_targetset(butler.exp_uid)
        feature_matrix = [target['feature_vector'] for target in targets]
        feature_matrix = np.array(feature_matrix)
        # ...
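
Because an algorithm may run in experiments launched without feature vectors, it can help to check for them before building the matrix. This is a minimal sketch: the helper name `feature_matrix_or_none` is hypothetical, and it assumes each target is a dictionary that carries a 'feature_vector' key only when features were supplied.

```python
def feature_matrix_or_none(targets):
    """Return a list of feature vectors, or None if any target lacks one."""
    if not targets:
        return None
    if not all('feature_vector' in target for target in targets):
        return None  # features were not supplied for this experiment
    return [list(target['feature_vector']) for target in targets]
```

In initExp, the result can be passed to np.array when it is not None. When it is None, the algorithm can fall back to behavior that ignores features.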

Launching an experiment takes a long time. How do I debug with this?

Do not run docker-compose rm, as it removes your containers. If you run docker-compose stop; docker-compose start, your experiment will remain (docker-compose stop is typically triggered via Ctrl-C). For more detail, see the wiki page on debugging.
