NLP2021 Course Project
To install requirements:
pip install bert4keras
The pre-trained Chinese BERT model has already been downloaded to ./chinese_L-12_H-768_A-12.
The data in ./data has been split into training, validation, and test sets, stored in JSON format.
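As a quick way to inspect one of the splits, here is a minimal sketch. It assumes each split is a single JSON document; the helper names load_split and describe are hypothetical and not part of this repository, and the file name in the usage note is only a placeholder.

```python
import json
from pathlib import Path


def load_split(path):
    """Parse one data split from a JSON file and return the resulting object."""
    # Assumption: the file holds a single JSON document (not JSON Lines).
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def describe(records):
    """Return a short summary: the container type and the number of entries."""
    return f"{type(records).__name__} with {len(records)} entries"
```

For example, describe(load_split("./data/train.json")) would summarize the training split, if a file with that (assumed) name exists.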
We provide two versions: Correlation_basic and Correlation_mlm.
Correlation_basic uses the pre-trained model directly and computes character similarity. The weights of the three similarity metrics are hyperparameters.
python correlation_basic.py
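To illustrate how three weighted similarity metrics could be combined, here is a minimal sketch. It is not the code in correlation_basic.py: the function names, the normalized weighted sum, and the example scores are all assumptions made for illustration, and the three metrics themselves are left unnamed because the project does not list them here.

```python
def combined_similarity(scores, weights):
    """Weighted average of per-metric similarity scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total weight keeps the result on the same scale
    # as the inputs even when the weights do not sum to 1.
    return sum(s * w for s, w in zip(scores, weights)) / total


def best_candidate(candidates, weights):
    """Return the candidate character with the highest combined score.

    candidates maps each candidate character to a tuple of three scores.
    """
    return max(candidates, key=lambda c: combined_similarity(candidates[c], weights))


# Hypothetical weights (the hyperparameters) and made-up scores for two
# replacement candidates; the numbers are illustrative only.
weights = (0.5, 0.3, 0.2)
candidates = {
    "布": (0.9, 0.4, 0.8),
    "部": (0.9, 0.3, 0.2),
}
print(best_candidate(candidates, weights))
```

Normalizing by the sum of the weights is a design choice in this sketch: it lets the three hyperparameters be tuned independently without forcing them to sum to 1.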
Correlation_mlm fine-tunes the model using train_json.
python correlation_mlm.py
if __name__ == '__main__':
    # Sample input containing a typo: 公步 should be 公布.
    text = '专家公步虎门大桥涡振原因'
    result = text_correction(text)
    print(result)