# A Deep Reinforcement Learning Parser for UCCA (course final project, COSC-689)

- Run `pip install -r requirement.txt`.
- Run `python setup.py install`.
- Check the README instructions on the data directory page.
You can test our pre-trained model directly, or first complete all of the following steps to train your own model.
Use the name (including the path) of any XML file under the `data/raw/test-xml` directory to run:

```
python policyTester.py <filename>
```
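For example (the filename below is only a placeholder; substitute any file that actually exists under `data/raw/test-xml`):

```
python policyTester.py data/raw/test-xml/example.xml
```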
To produce the reward-function training data, run:

```
python passage2oracles.py
```

It produces and stores a JSON binary file. This takes a long time; you can skip this step and test the rest of the code with `rwdFuncTrainData_smal_sample_set.json`, a small sample of the training data.
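To sanity-check that data before training, you can peek at the sample file. This is only a minimal sketch: it assumes the file parses as ordinary JSON (the actual schema written by `passage2oracles.py` is not documented here).

```python
import json

# Inspect the sample reward-function training data. Since the schema is
# not documented, we only report the container type and size.
with open("rwdFuncTrainData_smal_sample_set.json", "rb") as f:
    data = json.load(f)

print(type(data).__name__)
if hasattr(data, "__len__"):
    print("entries:", len(data))
```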
Train the reward function:

```
python rewardNN.py
```

This stores files containing the final state of the model at the end of the run.
The environment is defined in the `drl_ucca` folder, with the trained and stored reward-function model plugged in. The reinforcement learning part, `policyTrainer.py`, uses this environment as a black box.
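For intuition, here is a minimal sketch of the black-box contract such a trainer relies on: the policy only observes states and rewards and never touches the reward model. The `DummyEnv` class, its `reset()`/`step()` interface, and the action names are illustrative assumptions, not the actual `drl_ucca` API.

```python
import random

# DummyEnv stands in for the real drl_ucca environment; the trainer only
# sees states and rewards, never the reward model behind step().
class DummyEnv:
    def __init__(self, horizon=5):
        self.horizon = horizon  # steps per episode in this toy env
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the "state" here is just a step counter

    def step(self, action):
        self.t += 1
        reward = random.random()  # the real env would score `action` with the reward NN
        done = self.t >= self.horizon
        return self.t, reward, done

env = DummyEnv()
state, done, total = env.reset(), False, 0.0
while not done:
    # TUPA-style transition actions shown for flavor only.
    action = random.choice(["SHIFT", "REDUCE", "NODE"])
    state, reward, done = env.step(action)
    total += reward
print("episode return:", total)
```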
Use:

```
python policyInitializer.py  # initializes the weights of the trainable parameters better than random
python policyTrainer.py      # silent mode
python policyTrainer.py -e   # also outputs all predicted oracles
```
This project uses code from https://github.com/danielhers/tupa.