Node2vec is an algorithmic framework for representation learning on graphs. Given any graph, it learns continuous feature representations for the nodes, which can then be used for various downstream machine learning tasks. Using PGL, we reproduce the node2vec algorithm and reach results comparable to those reported in the paper.
The datasets contain two networks: BlogCatalog and ArXiv.
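To make the idea concrete, here is a minimal pure-Python sketch of the second-order biased random walk that node2vec uses to generate training sequences. It is not the PGL implementation: the function name `node2vec_walk`, the adjacency-set graph format, and the toy graph are assumptions made for illustration. The return parameter `p` and in-out parameter `q` follow the node2vec paper.

```python
import random


def node2vec_walk(adj, start, walk_len, p=1.0, q=1.0, rng=None):
    """Sample one biased random walk of at most `walk_len` nodes.

    adj: dict mapping each node to a set of neighbors (unweighted graph).
    p:   return parameter; small p makes stepping back to the previous node likely.
    q:   in-out parameter; small q pushes the walk away from the previous node.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < walk_len:
        cur = walk[-1]
        neighbors = sorted(adj[cur])  # sorted for reproducibility with a seeded rng
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # distance 0 from prev: return
            elif nxt in adj[prev]:
                weights.append(1.0)      # distance 1 from prev: stay close
            else:
                weights.append(1.0 / q)  # distance 2 from prev: move outward
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy undirected graph (illustrative data only, not one of the datasets).
toy_graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
walk = node2vec_walk(toy_graph, start=0, walk_len=6, p=0.25, q=4.0,
                     rng=random.Random(0))
print(walk)
```

Setting `p = q = 1` makes every step uniform over the neighbors, which corresponds to the DeepWalk-style walk. The sampled walks are then fed to a skip-gram model to learn the node embeddings.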
- paddlepaddle>=1.4
- pgl
For example, to train node2vec on the BlogCatalog and ArXiv datasets with a GPU:
# multi-label classification task example
python node2vec.py --use_cuda --dataset BlogCatalog --save_path ./tmp/node2vec_BlogCatalog/ --offline_learning --epoch 400
python multi_class.py --use_cuda --ckpt_path ./tmp/node2vec_BlogCatalog/paddle_model --epoch 1000
# link prediction task example
python node2vec.py --use_cuda --dataset ArXiv --save_path ./tmp/node2vec_ArXiv --offline_learning --epoch 10
python link_predict.py --use_cuda --ckpt_path ./tmp/node2vec_ArXiv/paddle_model --epoch 400
- dataset: The dataset to use, either "BlogCatalog" or "ArXiv".
- use_cuda: Run on the GPU if this flag is set.
Dataset | Model | Task | Metric | PGL Result | Reported Result |
---|---|---|---|---|---|
BlogCatalog | deepwalk | multi-label classification | MacroF1 | 0.250 | 0.211 |
BlogCatalog | node2vec | multi-label classification | MacroF1 | 0.262 | 0.258 |
ArXiv | deepwalk | link prediction | AUC | 0.9538 | 0.9340 |
ArXiv | node2vec | link prediction | AUC | 0.9541 | 0.9366 |
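
For readers unfamiliar with the two metrics in the table, the sketch below shows how Macro-F1 and AUC can be computed from predictions. It is an illustration only and may differ from what `multi_class.py` and `link_predict.py` do internally; the function names, the label-set input format, the convention of scoring an absent label as 0, and the toy numbers are all assumptions.

```python
def macro_f1(y_true, y_pred, num_labels):
    """Unweighted mean of per-label F1 for multi-label predictions.

    y_true, y_pred: lists of sets of label ids, one set per node.
    """
    f1_scores = []
    for label in range(num_labels):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label that never occurs contributes F1 = 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_labels


def auc(pos_scores, neg_scores):
    """Probability that a true edge scores higher than a non-edge (ties count 0.5)."""
    wins = 0.0
    for ps in pos_scores:
        for ns in neg_scores:
            if ps > ns:
                wins += 1.0
            elif ps == ns:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy inputs (illustrative only, unrelated to the results in the table).
f1 = macro_f1([{0, 1}, {1}, {2}], [{0}, {1, 2}, {2}], num_labels=3)
link_auc = auc(pos_scores=[0.9, 0.8, 0.4], neg_scores=[0.5, 0.3])
print(round(f1, 4), round(link_auc, 4))
```

Macro-F1 weights every label equally regardless of how many nodes carry it, so rare labels influence the score as much as common ones. The pairwise AUC shown here is quadratic in the number of scored pairs and is meant for clarity, not for large evaluation sets.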