Content-based article clustering using scikit-learn
Cluster 1:
Terms: [u'boat', u'strength', u'race', u'rowing', u'peter']
kettlebells.txt
paul-graham-companies.txt
pavel-training.txt
rowing-1.txt
rowing-2.txt
rowing-3.txt
Cluster 2:
Terms: [u's6', u'samsung', u'phone', u'surface', u'galaxy']
alcatel-onetouch-review.txt
galaxy-s6-review.txt
galaxy-s6.txt
htc-one-review.txt
microsoft-surface-review.txt
Cluster 3:
Terms: [u'django', u'author', u'git', u'application', u'environment']
django-rest-framework.txt
starting-django-project.txt
web-best-practices.txt
Cluster 4:
Terms: [u'facebook', u'social', u'news', u'mobile', u'million']
facebook-1.txt
facebook-2.txt
facebook-3.txt
facebook-hello.txt
facebook-whatsapp.txt
google-fi.txt
secret-shutting-down.txt
twitter-earnings.txt
Cluster 5:
Terms: [u'python', u'perl', u'lisp', u'programming', u'language']
fall-of-perl.txt
lisp.txt
pick-up-python.txt
Cluster 6:
Terms: [u'snowden', u'climate', u'capital', u'venture', u'species']
climate-change-1.txt
climate-change-2.txt
karl-marx.txt
marc-andreessen.txt
snowden.txt
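
A listing like the one above (a cluster header, its five highest-weighted terms, then the member files) could be produced along the following lines. This is a minimal sketch, not the script that generated the output: the source does not say which vectorizer or clustering algorithm was used, so TF-IDF features with k-means are an assumption, and the names cluster_documents, print_clusters and SAMPLE_DOCS, along with the sample texts, are hypothetical. The u'...' prefixes in the listing indicate it came from Python 2; on Python 3 the terms print without the prefix.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_documents(docs, n_clusters, n_terms=5, seed=0):
    """Cluster a {filename: text} mapping.

    Returns one (top_terms, filenames) pair per cluster.
    Assumption: TF-IDF features and k-means; the original script may differ.
    """
    names = sorted(docs)
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform([docs[name] for name in names])

    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(tfidf)

    # Invert the term -> column mapping so a column index gives back its term.
    index_to_term = {index: term for term, index in vectorizer.vocabulary_.items()}

    clusters = []
    for k in range(n_clusters):
        # The largest centroid weights mark the most characteristic terms.
        top_columns = model.cluster_centers_[k].argsort()[::-1][:n_terms]
        terms = [index_to_term[int(column)] for column in top_columns]
        members = [name for name, label in zip(names, labels) if label == k]
        clusters.append((terms, members))
    return clusters


def print_clusters(clusters):
    """Print clusters in the same layout as the listing above."""
    for number, (terms, members) in enumerate(clusters, start=1):
        print("Cluster %d:" % number)
        print("Terms: %s" % terms)
        for name in members:
            print(name)


# Hypothetical stand-in texts; a real run would read the .txt files from disk.
SAMPLE_DOCS = {
    "rowing-a.txt": "rowing boat race crew boat rowing",
    "rowing-b.txt": "boat race rowing crew strength",
    "phone-a.txt": "samsung galaxy phone screen phone",
    "phone-b.txt": "galaxy phone samsung battery screen",
}

print_clusters(cluster_documents(SAMPLE_DOCS, n_clusters=2))
```

k-means numbers its clusters arbitrarily, so the order of the clusters can change with the seed even when the groupings stay the same. The number of clusters also has to be chosen up front, which may explain why loosely related articles end up sharing a cluster, as in Cluster 6.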