blwsk/article-clustering

Article clustering

Content-based article clustering using scikit-learn

Example output

Cluster 1:
Terms: [u'boat', u'strength', u'race', u'rowing', u'peter']
	kettlebells.txt
	paul-graham-companies.txt
	pavel-training.txt
	rowing-1.txt
	rowing-2.txt
	rowing-3.txt

Cluster 2:
Terms: [u's6', u'samsung', u'phone', u'surface', u'galaxy']
	alcatel-onetouch-review.txt
	galaxy-s6-review.txt
	galaxy-s6.txt
	htc-one-review.txt
	microsoft-surface-review.txt

Cluster 3:
Terms: [u'django', u'author', u'git', u'application', u'environment']
	django-rest-framework.txt
	starting-django-project.txt
	web-best-practices.txt

Cluster 4:
Terms: [u'facebook', u'social', u'news', u'mobile', u'million']
	facebook-1.txt
	facebook-2.txt
	facebook-3.txt
	facebook-hello.txt
	facebook-whatsapp.txt
	google-fi.txt
	secret-shutting-down.txt
	twitter-earnings.txt

Cluster 5:
Terms: [u'python', u'perl', u'lisp', u'programming', u'language']
	fall-of-perl.txt
	lisp.txt
	pick-up-python.txt

Cluster 6:
Terms: [u'snowden', u'climate', u'capital', u'venture', u'species']
	climate-change-1.txt
	climate-change-2.txt
	karl-marx.txt
	marc-andreessen.txt
	snowden.txt
