I was wondering if there is (algorithmically) any difference between your C++ implementation of Node2Vec and the original reference implementation from Stanford. For instance, I saw something about subsampling frequent nodes in the C++ code.
I'm asking because I tried a few other Node2Vec implementations, including the one in pytorch_geometric, and have had trouble replicating the good performance of your C++ code. So I was wondering if there is anything missing in those.
Can you explicitly link the original implementation you are referring to? I reimplemented the official Python code, which uses the Gensim library underneath. The subsampling actually comes from the word2vec implementation in Gensim (for that matter, it's standard word2vec stuff). I provided some subsampling analysis in Section 3.6 of the VERSE paper. I believe the subsampling is rather key, as it effectively shifts the positive pair distribution.
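For readers who have not looked at that part of word2vec, here is a minimal sketch of frequent-token subsampling applied to random walks. It uses the keep probability from the original word2vec C code, which Gensim exposes through the `sample` parameter of `Word2Vec`. The function names `keep_probability` and `subsample_walks` are made up for illustration, and this is not the code from the C++ implementation discussed in this thread.

```python
import math
import random
from collections import Counter


def keep_probability(count, total, sample=1e-3):
    """Probability of keeping one occurrence of a node seen `count` times.

    Follows the word2vec C code: (sqrt(f / t) + 1) * t / f, where f is the
    node's relative frequency and t is the `sample` threshold.
    """
    threshold = sample * total
    p = (math.sqrt(count / threshold) + 1) * threshold / count
    return min(1.0, p)


def subsample_walks(walks, sample=1e-3, seed=0):
    """Randomly drop occurrences of frequent nodes from each walk."""
    rng = random.Random(seed)
    counts = Counter(node for walk in walks for node in walk)
    total = sum(counts.values())
    keep = {n: keep_probability(c, total, sample) for n, c in counts.items()}
    # Context windows are built after this step, so removing a hub node
    # brings its former neighbours in the walk closer together.
    return [[n for n in walk if rng.random() < keep[n]] for walk in walks]
```

Because the windows are formed on the shortened walks, high-degree nodes end up in fewer positive pairs and more distant nodes get paired, which is one way to read "shifts the positive pair distribution". An implementation that feeds raw walks straight into a skip-gram loss skips this step.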
ok, that helps a lot! I didn't look into the Gensim implementation at all.
The two implementations I tried were this one, which is based on Gensim and so should be fine, and the pytorch_geometric one here, which I'm pretty sure doesn't do any subsampling (unless that is done while sampling the walks).
Yeah, I tried to keep it close to the Python implementation (to be honest, there is not much in the Python code per se, I was just reimplementing Gensim). I believe this is still one of the fastest implementations available, since I generate the walks on-the-fly and store the precomputed walk probabilities efficiently.
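To illustrate what generating walks on-the-fly can look like, here is a rough sketch of a lazy node2vec walk generator. It is not the C++ code: it recomputes the second-order transition weights at every step instead of storing precomputed probabilities, and the name `node2vec_walks` and the adjacency-dict input format are assumptions made for this example.

```python
import random


def node2vec_walks(adj, walk_length, walks_per_node, p=1.0, q=1.0, seed=0):
    """Lazily yield node2vec walks over `adj` (dict: node -> neighbour list).

    Walks are produced one at a time, so the full corpus of walks never
    has to be held in memory.
    """
    rng = random.Random(seed)
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                cur = walk[-1]
                neighbors = adj[cur]
                if not neighbors:
                    break  # dead end: stop this walk early
                if len(walk) == 1:
                    # First step has no previous node, so it is uniform.
                    walk.append(rng.choice(neighbors))
                    continue
                prev = walk[-2]
                # node2vec bias: 1/p to return to prev, 1 to stay within
                # prev's neighbourhood, 1/q to move further away.
                weights = [
                    1.0 / p if nxt == prev
                    else 1.0 if nxt in adj[prev]
                    else 1.0 / q
                    for nxt in neighbors
                ]
                walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
            yield walk
```

A training loop can consume the generator directly, for example `for walk in node2vec_walks(adj, 80, 10, p=1.0, q=0.5): ...`, applying subsampling and windowing to each walk as it arrives.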