Negative coherence on short texts #29

Open
elbadma opened this issue Mar 17, 2021 · 1 comment

elbadma commented Mar 17, 2021

Hi, I saw that one can use DETM on short texts. I tried ETM on short texts (each text contains only one sentence) and it seemed to work. However, the coherence score came out negative. How should I interpret it? Does lower coherence always mean worse, or do scores closer to 0 mean worse?
Whenever I run ETM on normal-length texts (more than one sentence each), the coherence is always positive, so I assume the negative coherence is caused by the short text length.

@silviatti

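Hi! Coherence is computed as normalized pointwise mutual information (NPMI), which ranges from -1 to 1, so scores below 0 are fine. Higher is better: values near 1 mean a topic's top words almost always appear together, 0 means they co-occur about as often as chance would predict, and negative values mean they co-occur less often than chance. Negative scores usually show up with short texts because the documents are much sparser and words co-occur less frequently. Just make sure not to compare coherence scores computed on two different datasets.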
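
To make the effect of document length concrete, here is a minimal sketch of document-level NPMI averaged over the pairs of a topic's top words. It is an illustration only, not the code this repository uses: the function names `npmi` and `topic_coherence`, the toy corpora, and the convention of scoring a pair that never co-occurs as -1 are all assumptions made for the example.

```python
import math
from itertools import combinations


def npmi(word_i, word_j, docs):
    """Document-level NPMI of two words; `docs` is a list of token sets."""
    n = len(docs)
    d_i = sum(1 for d in docs if word_i in d)
    d_j = sum(1 for d in docs if word_j in d)
    d_ij = sum(1 for d in docs if word_i in d and word_j in d)
    if d_ij == 0:
        return -1.0  # assumed convention: the pair never co-occurs
    if d_ij == n:
        return 1.0  # both words are in every document; avoids dividing by log(1) = 0
    p_i, p_j, p_ij = d_i / n, d_j / n, d_ij / n
    pmi = math.log(p_ij / (p_i * p_j))
    return pmi / -math.log(p_ij)  # normalizing by -log p(i, j) bounds the score to [-1, 1]


def topic_coherence(top_words, docs):
    """Mean NPMI over all unordered pairs of a topic's top words."""
    pairs = list(combinations(top_words, 2))
    return sum(npmi(a, b, docs) for a, b in pairs) / len(pairs)


topic = ["cat", "dog", "pet"]

# Made-up corpora: the same vocabulary spread over longer vs. one-phrase documents.
long_docs = [
    {"cat", "dog", "pet", "vet"},
    {"cat", "dog", "pet", "food"},
    {"car", "road", "fuel"},
    {"car", "engine", "fuel"},
]
short_docs = [
    {"cat", "pet"},
    {"dog", "pet"},
    {"cat", "purrs"},
    {"dog", "barks"},
]

print("long docs :", topic_coherence(topic, long_docs))
print("short docs:", topic_coherence(topic, short_docs))
```

On the longer documents every pair of top words co-occurs and the score is positive. On the short documents "cat" and "dog" never share a document, so the same topic gets a negative score even though the words are just as related. That is why the two numbers should not be compared across datasets.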

2 participants