Implementing QLearning in Java as part of Artificial Intelligence

mgovilla/QLearning-Java

Repository files navigation

cs4341-a4

Mayank Govilla, Mago Sheehy, Molly Sunray

Group 30

To run the program, either open the project in IntelliJ and pass the arguments through a run configuration, or run the jar file directly:

java -jar qlearn-1.0.jar "boards/sample.txt" 1.3 0.9 -0.05

The arguments are, in order: the file path to the board, the time to learn in seconds, the probability of moving in the desired direction, and the reward for each action.
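As a rough illustration of the argument order described above, the four values could be parsed along these lines. The class name `RunConfig`, its field names, and the range check on the probability are hypothetical and not taken from the project.

```java
/** Hypothetical holder for the four command-line arguments (all names are illustrative). */
final class RunConfig {
    final String boardPath;        // 1st argument: file path to the board
    final double learnSeconds;     // 2nd argument: time to learn, in seconds
    final double moveSuccessProb;  // 3rd argument: probability of moving in the desired direction
    final double actionReward;     // 4th argument: reward for each action

    RunConfig(String boardPath, double learnSeconds, double moveSuccessProb, double actionReward) {
        this.boardPath = boardPath;
        this.learnSeconds = learnSeconds;
        this.moveSuccessProb = moveSuccessProb;
        this.actionReward = actionReward;
    }

    /** Parses the arguments in the order given in the run instructions. */
    static RunConfig parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "expected 4 arguments: <board> <seconds> <probability> <reward>");
        }
        double p = Double.parseDouble(args[2]);
        if (p < 0.0 || p > 1.0) {  // assumed sanity check, not part of the original README
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        return new RunConfig(args[0], Double.parseDouble(args[1]), p, Double.parseDouble(args[3]));
    }
}
```

With the sample command, `RunConfig.parse(args)` would hold the board path `boards/sample.txt`, 1.3 seconds of learning, a 0.9 chance of moving as intended, and a reward of -0.05 per action.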




Pseudocode:

class Position {x: int, y: int}

Enum ACTION: {UP, DOWN, LEFT, RIGHT}

Main:

  • read input parameters

GridWorld

  • World : int[][] // mostly 0s with some non-zero values that represent the terminal states
  • startingLocation: Position
  • getValue(Position): int
  • getRandomPos(): Position
  • getNextState(action, previousAgent) : Agent
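The data side of GridWorld could look roughly like the sketch below. It assumes the grid is indexed as `world[y][x]`, that any non-zero cell is terminal, and that random start positions skip terminal cells; `isTerminal` is a helper added for illustration, and none of these details are confirmed by the project.

```java
import java.util.Random;

/** Simple (x, y) coordinate, mirroring the Position class in the pseudocode. */
final class Position {
    final int x;
    final int y;

    Position(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

/** Sketch of the GridWorld data: 0 marks an ordinary cell, non-zero marks a terminal state. */
final class GridWorld {
    private final int[][] world;              // indexed as world[y][x] (an assumption)
    private final Random rng = new Random();

    GridWorld(int[][] world) {
        this.world = world;
    }

    /** Returns the raw cell value at the given position. */
    int getValue(Position p) {
        return world[p.y][p.x];
    }

    /** Hypothetical helper: a cell is terminal when its value is non-zero. */
    boolean isTerminal(Position p) {
        return getValue(p) != 0;
    }

    /** Picks a uniformly random non-terminal cell; loops forever if every cell is terminal. */
    Position getRandomPos() {
        while (true) {
            int y = rng.nextInt(world.length);
            int x = rng.nextInt(world[y].length);
            Position p = new Position(x, y);
            if (!isTerminal(p)) {
                return p;
            }
        }
    }
}
```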

Agent

  • position: Position
  • chanceOfDeflection: double // probability of deflection
  • getActualDirection (based on probability)
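One way getActualDirection might be realised is sketched below. The README does not say what a deflection looks like, so this sketch assumes the agent turns 90 degrees left or right with equal chance when it does not move as intended. The enum is named `Action` here (Java naming style) and listed clockwise so that a quarter turn is an index shift; it corresponds to ACTION in the pseudocode.

```java
import java.util.Random;

/** Directions in clockwise order so that +1 / -1 steps are quarter turns. */
enum Action {
    UP, RIGHT, DOWN, LEFT;

    Action turnRight() {
        return values()[(ordinal() + 1) % 4];
    }

    Action turnLeft() {
        return values()[(ordinal() + 3) % 4];
    }
}

final class Deflection {
    /**
     * Returns the direction actually taken. With probability pDesired the agent moves as
     * requested; otherwise it is deflected a quarter turn left or right (assumed behaviour).
     */
    static Action getActualDirection(Action desired, double pDesired, Random rng) {
        if (rng.nextDouble() < pDesired) {
            return desired;
        }
        return rng.nextBoolean() ? desired.turnLeft() : desired.turnRight();
    }
}
```

Passing the `Random` in as a parameter keeps the sketch easy to seed, which helps when comparing runs.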

Q Learning Algo

static class QInput: (Position, action)

  • QTable: Map<QInput, double>
  • chooseNextMove(): ACTION
  • train(Board: GridWorld, time: int, rewardFunction: (s)->int)
  • returnPolicy()
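Because QInput is used as a map key, it needs value-based `equals` and `hashCode`, and the map's value type has to be the boxed `Double` since Java generics cannot take a primitive `double`. The sketch below shows one possible shape; the `QTable` wrapper class, the flattened `x`/`y` fields, and the default value of 0.0 for unseen pairs are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

enum Action { UP, DOWN, LEFT, RIGHT }

/** State-action pair used as the Q-table key. */
final class QInput {
    final int x;           // Position is flattened to x and y to keep the sketch short
    final int y;
    final Action action;

    QInput(int x, int y, Action action) {
        this.x = x;
        this.y = y;
        this.action = action;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof QInput)) return false;
        QInput other = (QInput) o;
        return x == other.x && y == other.y && action == other.action;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y, action);
    }
}

/** Hypothetical wrapper around the map; unseen pairs read as 0.0 (assumed initial value). */
final class QTable {
    private final Map<QInput, Double> values = new HashMap<>();

    double get(int x, int y, Action a) {
        return values.getOrDefault(new QInput(x, y, a), 0.0);
    }

    void put(int x, int y, Action a, double q) {
        values.put(new QInput(x, y, a), q);
    }
}
```

Without the `equals`/`hashCode` overrides, two `QInput` objects describing the same cell and action would be treated as different keys, and lookups would always miss.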

chooseNextMove(Position):

  • get the 4 values from the Q table that correspond to [Q(s, UP), Q(s, RIGHT), Q(s, DOWN), Q(s, LEFT)]
  • return best action
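The greedy selection described for chooseNextMove can be sketched as a scan over the four Q-values of one state. The tie-breaking rule (first action in enum order wins) and the default of 0.0 for missing entries are assumptions; the class name `GreedyChoice` is hypothetical.

```java
import java.util.Map;

enum Action { UP, RIGHT, DOWN, LEFT }

final class GreedyChoice {
    /**
     * Returns the action with the largest Q-value for a single state.
     * qForState maps each action to Q(s, action); ties go to the first action in enum order.
     */
    static Action chooseNextMove(Map<Action, Double> qForState) {
        Action best = null;
        double bestQ = 0.0;
        for (Action a : Action.values()) {
            double q = qForState.getOrDefault(a, 0.0);  // missing entries read as 0.0 (assumption)
            if (best == null || q > bestQ) {
                best = a;
                bestQ = q;
            }
        }
        return best;
    }
}
```

A purely greedy rule like this relies on the initial Q-values and the per-action reward to push the agent toward untried moves; it contains no explicit exploration step.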

train:

  • loop while time
      • a = new Agent(randomPos)
      • while not terminal
          • move = chooseNextMove()
          • s_prime = a.getNextState(move)
          • val = getValue(s_prime)
          • r(s) = rewardFunction(val)
          • alpha = 0.1 or something to do with time
          • QTable[{a.position, move}] = QTable[{a.position, move}] + alpha(r(s) + discount*max(QTable[{s_prime.position, m}] for m in moves) - QTable[{a.position, move}])
          • if val != 0: break
          • a = s_prime
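The training loop above can be tried out on a much smaller problem. The sketch below swaps the 2-D GridWorld for a 1-D corridor with deterministic moves, so it is a simplified stand-in rather than the project's implementation. It assumes that only the two end cells are terminal, that the reward is the cell value plus the per-action reward, and that the discount is 0.9 (the pseudocode leaves it unspecified). The class name `CorridorQLearning` is hypothetical.

```java
import java.util.Random;

/** Tabular Q-learning on a 1-D corridor; a simplified stand-in for the 2-D GridWorld. */
final class CorridorQLearning {
    static final int LEFT = 0;
    static final int RIGHT = 1;

    final int[] world;            // 0 = ordinary cell, non-zero = terminal (end cells only)
    final double[][] q;           // q[state][action], initialised to 0.0
    final double alpha = 0.1;     // learning rate, as in the pseudocode
    final double discount = 0.9;  // assumed value
    final double stepReward;      // reward for each action
    final Random rng = new Random(42);

    CorridorQLearning(int[] world, double stepReward) {
        this.world = world;
        this.q = new double[world.length][2];
        this.stepReward = stepReward;
    }

    /** Q(s,a) <- Q(s,a) + alpha * (r + discount * max over a' of Q(s',a') - Q(s,a)). */
    static double update(double qOld, double r, double maxNext, double alpha, double discount) {
        return qOld + alpha * (r + discount * maxNext - qOld);
    }

    /** Greedy choice; ties go to LEFT. */
    int chooseNextMove(int s) {
        return q[s][RIGHT] > q[s][LEFT] ? RIGHT : LEFT;
    }

    /** Runs episodes from random interior cells until the time budget is used up. */
    void train(double seconds) {
        long deadline = System.nanoTime() + (long) (seconds * 1e9);
        while (System.nanoTime() < deadline) {
            int s = 1 + rng.nextInt(world.length - 2);      // random interior start cell
            while (System.nanoTime() < deadline) {
                int move = chooseNextMove(s);
                int sPrime = (move == RIGHT) ? s + 1 : s - 1;
                int val = world[sPrime];
                double r = val + stepReward;                 // assumed form of rewardFunction(val)
                double maxNext = Math.max(q[sPrime][LEFT], q[sPrime][RIGHT]);
                q[s][move] = update(q[s][move], r, maxNext, alpha, discount);
                if (val != 0) break;                         // reached a terminal state
                s = sPrime;
            }
        }
    }
}
```

As a worked step: with Q(s, a) = 0, a reward of 0.95 (terminal value 1 plus -0.05 per action), and a terminal next state whose Q-values are still 0, one update gives 0 + 0.1 * (0.95 + 0.9 * 0 - 0) = 0.095. Terminal cells are never updated in this sketch, so their Q-values stay at 0 and contribute nothing to the max term.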
