Exploring Semantic Spaces (E2) - Hamilton et al 2016 #29

HyunkuKwon opened this issue Apr 7, 2020 · 1 comment

HyunkuKwon commented Apr 7, 2020

Post questions about the following exemplary reading here:

Hamilton, William L., Jure Leskovec, and Dan Jurafsky. 2016. "Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change." arXiv preprint arXiv:1605.09096.

HyunkuKwon changed the title from "Exploring Semantic Spaces (E2) Hamilton et al 2016" to "Exploring Semantic Spaces (E2) - Hamilton et al 2016" on Apr 7, 2020

wanitchayap commented May 13, 2020

This is less of a question about the paper itself (I think it is a fascinating and well-designed study) and more of a question about a possible application of the way they quantified polysemy in this paper.

I asked in this week's orientation paper about how a word embedding model could deal with polysemy using unsupervised methods. Do you think the authors' approach of using PPMI to build a word network and then measuring the clustering coefficient to infer the degree of polysemy is a good solution to the polysemy problem in word embedding models? For example, if we know from the PPMI network that rock is highly polysemous, we could train the embedding model with rock split into three different word tokens (e.g., separate vectors for rock_music, rocking, and rock_geology).
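
To make what I mean concrete, here is a rough sketch of the measure as I understand it. This is my own toy illustration, not the authors' code: the tiny corpus, the sentence-sized co-occurrence windows, and the "edge wherever PPMI > 0" threshold are all my assumptions. It builds a small PPMI word network with networkx and reports the local clustering coefficient, which (if I read the paper right) is the basis of their polysemy score: a lower coefficient means a word's neighbours rarely co-occur with each other, i.e. the word shows up in more diverse contexts.

```python
import math
from collections import Counter
from itertools import combinations

import networkx as nx

# Toy corpus: each inner list is treated as one co-occurrence window.
corpus = [
    "rock music concert guitar".split(),
    "rock band played loud music".split(),
    "sedimentary rock layers geology".split(),
    "volcanic rock formed from lava".split(),
    "guitar music loud concert".split(),
]

# Count co-occurrences of word pairs within each window.
pair_counts = Counter()
for window in corpus:
    for w1, w2 in combinations(sorted(set(window)), 2):
        pair_counts[(w1, w2)] += 1

total_pairs = sum(pair_counts.values())
marginals = Counter()
for (w1, w2), c in pair_counts.items():
    marginals[w1] += c
    marginals[w2] += c

def ppmi(w1, w2):
    """Positive pointwise mutual information estimated from pair counts."""
    key = tuple(sorted((w1, w2)))
    joint = pair_counts[key] / total_pairs
    if joint == 0:
        return 0.0
    # Each pair contributes to two marginals, so marginals sum to 2 * total_pairs.
    p1 = marginals[w1] / (2 * total_pairs)
    p2 = marginals[w2] / (2 * total_pairs)
    return max(0.0, math.log2(joint / (p1 * p2)))

# Build the word network: an edge wherever PPMI is positive.
G = nx.Graph()
G.add_nodes_from(marginals)
for (w1, w2) in pair_counts:
    if ppmi(w1, w2) > 0:
        G.add_edge(w1, w2)

# A word whose neighbours rarely co-occur with each other (low clustering
# coefficient) appears in many unrelated contexts, i.e. it is more polysemous.
for word in ["rock", "guitar"]:
    print(word, "local clustering coefficient:", nx.clustering(G, word))
```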

However, I am not sure how we can know from the PPMI network that there should be three distinct senses of rock. In addition, for the embedding model's training we would need to assign each occurrence of rock in the texts to one of the three senses. Would the PPMI network alone be sufficient for such a task? We would also need to choose a cutoff for the clustering coefficient to decide when a word is polysemous enough to be treated as separate vectors in the model.

In addition, I am not sure how to deal with contextually diverse discourse function words (e.g. also), which the PPMI network would treat as highly polysemous. It makes sense in the context of this paper to treat these function words as highly polysemous, but I don't think we should have different vectors for also in different contexts.

In short, do you think the polysemy measure the authors use could be a good starting point for dealing with polysemy in word embedding models?
