Does DETM work on short texts like tweets, and does it work on corpora in other languages? #4

Open
joeylige opened this issue Aug 30, 2020 · 10 comments

Comments

@joeylige

Thanks for sharing.
I wish to conduct an analysis of topic changes on Twitter. I wonder whether DETM is suitable for doing this.

@mona-timmermann

mona-timmermann commented Aug 31, 2020

You can certainly try! Maybe run ETM first to see how that performs, as it runs much faster than D-ETM and the code is easier to understand. If that works well and you then have a feel for how to set num_topics, try D-ETM next :)

@mona-timmermann

Other languages should be fine if you use pre-trained embeddings in that language or train Word2Vec yourself.
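
For the pre-trained route, a minimal sketch of picking out the vectors you need could look like the following. It assumes the embeddings come as a plain-text file with one word per line followed by its vector, optionally preceded by a fastText-style header line; the function name `load_text_embeddings` and the sample data are hypothetical and not part of the D-ETM code.

```python
import io


def load_text_embeddings(lines, vocab):
    """Return {word: vector} for the words in `vocab` found in text-format embedding lines."""
    vocab = set(vocab)
    vectors = {}
    for i, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        # Assumption: an optional first line of the form "<num_words> <dim>" is a header.
        if i == 0 and len(parts) == 2:
            continue
        word, values = parts[0], parts[1:]
        if word in vocab:
            vectors[word] = [float(v) for v in values]
    return vectors


# Tiny in-memory stand-in for a Spanish embedding file (made-up numbers).
sample = io.StringIO("3 2\nhola 0.1 0.2\nmundo 0.3 0.4\ngato 0.5 0.6\n")
vocab = ["hola", "gato", "perro"]

vectors = load_text_embeddings(sample, vocab)
# Words in the corpus vocabulary that have no pre-trained vector.
missing = [w for w in vocab if w not in vectors]
```

Checking `missing` is worth doing on tweets in particular, since hashtags and slang are often absent from embeddings trained on other kinds of text.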

@joeylige
Author

> Other languages should be fine if you use pre-trained embeddings in that language or train Word2Vec yourself.

Thanks for the advice!

@Emekaborisama

> Thanks for sharing.
> I wish to conduct an analysis of topic changes on Twitter. I wonder whether DETM is suitable for doing this.

Awesome. Let me know how it goes.

@Emekaborisama

> You can certainly try! Maybe run ETM first to see how that performs, as it runs much faster than D-ETM and the code is easier to understand. If that works well and you then have a feel for how to set num_topics, try D-ETM next :)

Please share the link to ETM.

@mona-timmermann

D-ETM was based on ETM but added the temporal evolution of topics as a feature. ETM: https://github.com/adjidieng/ETM

@mona-timmermann

However, there are also models designed specifically for short texts, which might work better.

@joeylige
Author

> However, there are also models designed specifically for short texts, which might work better.

Thank you for your reply. I found models for short-text analysis, such as the Biterm Topic Model, and models for topic changes, such as the Dynamic Topic Model, but I have failed to find an appropriate method for tracking topic changes in short texts, because I don't have much experience in NLP. If you could give me some advice, I would be very grateful!

@mona-timmermann

mona-timmermann commented Oct 10, 2020

Personally, I don't know of anything similar to D-ETM that works better on short texts. A different way to look at topics over time (e.g., their popularity) is a post-hoc analysis using the topics assigned to documents and their timestamps. Maybe check out the Topics over Time model, too. But these approaches answer different questions than D-ETM. Have you tried D-ETM, and does it not work?
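
A rough sketch of such a post-hoc analysis, assuming you already have one dominant topic per tweet from any static topic model plus a timestamp bucketed by month. The `docs` data and the helper `topic_share_by_period` are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical input: (period, dominant_topic) pairs, one per document.
docs = [
    ("2020-01", 0), ("2020-01", 0), ("2020-01", 1),
    ("2020-02", 1), ("2020-02", 1), ("2020-02", 0), ("2020-02", 2),
]


def topic_share_by_period(docs):
    """Return {period: {topic: share of that period's documents}}."""
    counts = defaultdict(Counter)
    for period, topic in docs:
        counts[period][topic] += 1
    shares = {}
    for period, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[period] = {t: n / total for t, n in sorted(counter.items())}
    return shares


shares = topic_share_by_period(docs)
for period, dist in shares.items():
    print(period, dist)
```

This only tracks how popular each fixed topic is per period. It does not let the topics' word distributions drift over time, which is the part D-ETM models.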

@espoirMur

> However, there are also models designed specifically for short texts, which might work better.

What are those models? Can you share them?


4 participants