Our model is a Convolutional Neural Network. Each raw (210, 160, 3) Breakout frame is preprocessed into an (84, 84) grayscale, down-sized frame. The network's linear output layer has size 4 (no-op, fire, move left, move right), which we reduce to 3 (no-op, move left, move right) because fire in Breakout is essentially a no-op. The model uses the Adam optimizer with a logcosh, mean squared error, or Huber loss function.
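For reference, here is a minimal sketch of the preprocessing step and a DeepMind-style network matching the description above. The exact architecture lives in DQNAgent.py and may differ; the layer sizes below follow the Nature DQN paper linked under the resources, and 'mse' stands in for the configurable loss (the 'huber' option would need a custom loss function in this Keras version).

```python
# Sketch only: preprocessing and a DeepMind-style Q-network, not the exact
# model defined in DQNAgent.py.
from skimage.color import rgb2gray
from skimage.transform import resize
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense
from keras.optimizers import Adam


def preprocess(frame):
    """(210, 160, 3) RGB Atari frame -> (84, 84) grayscale frame."""
    return resize(rgb2gray(frame), (84, 84), mode='constant')


def build_model(n_actions=3, history_length=4, learning_rate=0.00025):
    """CNN over a stack of preprocessed frames, one Q-value per action."""
    model = Sequential([
        Conv2D(32, (8, 8), strides=4, activation='relu',
               input_shape=(84, 84, history_length)),
        Conv2D(64, (4, 4), strides=2, activation='relu'),
        Conv2D(64, (3, 3), strides=1, activation='relu'),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(n_actions, activation='linear'),  # no-op, move left, move right
    ])
    model.compile(optimizer=Adam(lr=learning_rate), loss='mse')
    return model
```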
python3: need help installing? See the Python installation guide linked at the bottom of this page.
pip (Python's package manager): need help installing? See the pip installation guide linked at the bottom of this page.
numpy==1.13.3
scikit_image==0.13.1
Keras==2.1.3
gym==0.9.5
gym[atari]
h5py==2.7.1
Pillow==5.1.0
or just do:
pip install -r requirements.txt
You will also need to run pip install tensorflow-gpu==1.7.0 if you are training on a GPU.
- breakout.py: The main Breakout game loop. Integrates with all of the components.
- DQNAgent.py: The Deep Q-Network agent that learns to play Breakout.
- ReplayMemory.py: The experience memory that the DQNAgent stores to and replays from in order to learn.
- hyperparameters.py: All of the hyperparameters.
- discrete_frames.py: Passes discrete frame stacks into the model and memory. Larger memory footprint, more backpropagation steps.
- sliding_frames.py: Passes sliding frame stacks into the model and memory. Smaller memory footprint, fewer backpropagation steps (see the sketch after this list).
- utils.py: Utility functions used by numerous components.
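To make the difference concrete, here is one plausible reading of the two framing modes: 'discrete' as non-overlapping frame stacks and 'sliding' as overlapping ones. This is an illustrative sketch, not the actual code in discrete_frames.py or sliding_frames.py, which also handle how the stacks are stored in and sampled from replay memory.

```python
# Illustrative sketch of the two framing strategies (assumed behaviour, not
# the repo's implementation).
from collections import deque

import numpy as np


def discrete_stacks(frames, stack_size=4):
    """Non-overlapping stacks: frames 0-3, 4-7, 8-11, ..."""
    return [np.stack(frames[i:i + stack_size], axis=-1)
            for i in range(0, len(frames) - stack_size + 1, stack_size)]


def sliding_stacks(frames, stack_size=4):
    """Overlapping stacks: frames 0-3, 1-4, 2-5, ..."""
    window = deque(maxlen=stack_size)
    stacks = []
    for frame in frames:
        window.append(frame)
        if len(window) == stack_size:
            stacks.append(np.stack(window, axis=-1))
    return stacks
```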
'GAME' : 'BreakoutDeterministic-v4', # Name of which game to use
# v1-4 Deterministic or Not
'DISCRETE_FRAMING' : True, # 2 discrete sets of frames stored in memory
'LOAD_WEIGHTS' : '', # Loads weights into the model if so desired
# leave '' if starting from a new model
'RENDER_ENV' : False, # shows the screen of the game as it learns
# massively slows the training down when True
# default: False
'HEIGHT' : 84, # height in pixels
'WIDTH' : 84, # and width in pixels that the game window will get downscaled to
# defaults: 84, 84
'FRAME_SKIP_SIZE' : 4, # how many frames we skip: the chosen action is
# repeated consecutively for that many frames
# default: 4
'MAX_EPISODES' : 12000, # an episode is one full cycle from the start of a life
# to losing that life or winning a round
# default: 12,000
'MAX_FRAMES' : 50000000, # max number of frames allowed to pass before stopping
# default: 50,000,000 (the number Google DeepMind used)
'SAVE_MODEL' : 500, # how many episodes should we go through until we save the model?
# default: 500, or however often you want to save the model
'TARGET_UPDATE' : 10000, # update the target network every this many steps
# default: 10000
'WATCH_Q' : False, # watch the Q function and see what decision it picks
# cool to watch
# default: False
'LEARNING_RATE' : 0.00025, # learning rate of the Adam optimizer
# default: 0.00025
'INIT_EXPLORATION' : 1.0, # exploration rate, start at 100%
'EXPLORATION' : 1000000, # number of frames over which the exploration rate decays
'MIN_EXPLORATION' : 0.1, # ending exploration rate
# defaults: 1.0, 1,000,000, 0.1
'OPTIMIZER' : 'Adam', # optimizer used
# options: RMSprop or Adam
'MIN_SQUARED_GRADIENT' : 0.01, # the optimizer's epsilon constant
# default: 0.01
'GRADIENT_MOMENTUM' : 0.95, # momentum used by the optimizer's gradient update
# default: 0.95
'LOSS' : 'huber', # can be 'logcosh' for logarithm of hyperbolic cosine
# or 'mse' for mean squared error
# or 'huber' for huber loss
# options: logcosh, mse, or huber
'NO-OP_MAX' : 30, # how many times no-op can be called at the beginning
# of a single episode; avoids always starting from
# the exact same state and increases the variety of starting states
# default: 30 (don't set this too high or we may lose before acting!)
'SHOW_FIT' : 0, # shows the model's fit output as it trains; set to 0 for off
# default: 0 for off
'REPLAY_START' : 50000, # when to start using replay to update the model
# default: 50000 frames
'MEMORY_SIZE' : 1000000, # size of the memory bank
# default: 1,000,000
'GAMMA' : 0.99, # discount factor:
# preference for present rewards as opposed to future rewards
# default: 0.99
# 4 replay iterations * batch size 8 = 32 samples per replay
'REPLAY_ITERATIONS' : 4, # how many iterations of replay
# default: 4
'BATCH_SIZE' : 8 # batch size used to learn
# default: 8
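For reference, INIT_EXPLORATION, EXPLORATION, and MIN_EXPLORATION describe a linear decay of the exploration rate. A minimal sketch of that schedule (a hypothetical helper; the real schedule is implemented inside the agent):

```python
# Sketch of the linear epsilon schedule implied by the three exploration
# hyperparameters above (hypothetical helper, not the repo's code).
def epsilon_at(frame_number,
               init_exploration=1.0,         # INIT_EXPLORATION
               exploration_frames=1000000,   # EXPLORATION
               min_exploration=0.1):         # MIN_EXPLORATION
    """Anneal epsilon linearly from 1.0 to 0.1 over the first 1M frames,
    then hold it at 0.1."""
    fraction = min(frame_number / float(exploration_frames), 1.0)
    return init_exploration + fraction * (min_exploration - init_exploration)
```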
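Also note that Keras 2.1.3 has no built-in 'huber' loss string ('logcosh' and 'mse' are built in), so the 'huber' option presumably maps to a custom loss function inside the agent. A minimal sketch of a Huber loss that could be passed to model.compile:

```python
# Sketch of a Huber loss usable with Keras (assumed equivalent of the 'huber'
# option, not the repo's actual implementation).
from keras import backend as K


def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    error = y_true - y_pred
    quadratic = K.minimum(K.abs(error), delta)
    linear = K.abs(error) - quadratic
    return K.mean(0.5 * K.square(quadratic) + delta * linear, axis=-1)
```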
To start the Breakout game with the DQN Agent, run python3 breakout.py
To change how the DQN Agent learns, modify hyperparameters.py
To start the demo, run python3 DQN_Testing.py
Alternatively, there is a Python notebook, DQN_Testing.ipynb, which renders every 6 frames.
- http://docs.python-guide.org/en/latest/starting/installation/
- https://www.makeuseof.com/tag/install-pip-for-python/
- https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf
- https://github.com/dennybritz/reinforcement-learning/issues/30
- https://github.com/tokb23/dqn/blob/master/dqn.py
- https://github.com/jcwleo/Reinforcement_Learning/blob/master/Breakout/Breakout_DQN_class.py
- https://medium.com/mlreview/speeding-up-dqn-on-pytorch-solving-pong-in-30-minutes-81a1bd2dff55
- https://becominghuman.ai/beat-atari-with-deep-reinforcement-learning-part-2-dqn-improvements-d3563f665a2c
- https://github.com/keras-rl/keras-rl