kirel/political-affiliation-prediction

political-affiliation-prediction

Setup

Install virtualenv(-wrapper).

mkvirtualenv political-affiliation-prediction
# or
workon political-affiliation-prediction
pip install -r requirements.dev.txt

and for the frontend

Preparation

Get German Parliament discussions

To train a classifier, we download and parse the speeches and discussions of the German Bundestag from http://www.bundestag.de/plenarprotokolle and transform them into a bag-of-words representation.

python downloader.py --download --parse
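As a rough illustration of what a bag-of-words transform does, the sketch below counts word tokens in a piece of text using only the standard library. The function name `bag_of_words` and the tokenization rule are assumptions for illustration; this is not the code in downloader.py, which may use a different tokenizer and weighting.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token.

    Hypothetical helper for illustration only, not taken from the repository.
    """
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


speech = "Die Banken und die Reichen"
counts = bag_of_words(speech)
print(counts)
```

Word order is discarded: only the counts per token remain, which is what makes the representation usable as a fixed feature vector for a classifier.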

Download Newspaper articles

Downloads news articles from the politics landing pages of various German newspapers and computes pairwise distances between them.

python newsreader.py --download --distances

The results are stored in a distances-xxx.json file and also in distances.json, which is served to the web GUI.
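To make the idea of pairwise distances between articles concrete, here is a minimal sketch that computes the cosine distance between word-count vectors for every pair of articles and prints the result as JSON. The distance measure, the helper names, and the JSON layout (`source`, `target`, `distance`) are all assumptions; the README does not document which metric newsreader.py uses or how distances.json is structured.

```python
import json
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Count lowercase word tokens (hypothetical helper)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def pairwise_distances(articles):
    """Compute one distance record per unordered pair of article titles."""
    vectors = {title: bag_of_words(text) for title, text in articles.items()}
    return [
        {"source": s, "target": t, "distance": cosine_distance(vectors[s], vectors[t])}
        for s, t in combinations(sorted(vectors), 2)
    ]


# Made-up example texts, not real article content.
articles = {
    "a": "iran einigung lausanne",
    "b": "iran einigung lausanne",
    "c": "haushalt steuer",
}
distances = pairwise_distances(articles)
print(json.dumps(distances, indent=2))
```

Articles with identical wording end up at a distance of about 0, and articles with no words in common at a distance of 1.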

Run

Start server with pretrained classifier

DEBUG=1 NOJOBS=1 python api.py

Start the frontend development server

cd web
dotenv bundle exec middleman

REMARK: The frontend needs the backend running!

Test

Test with party-specific buzzwords

curl --data "text=angriffskrieg" 127.0.0.1:5000/predict
curl --data "text=reiche banken" 127.0.0.1:5000/predict
curl --data "text=herdprämie" 127.0.0.1:5000/predict
curl --data "text=sicherheit" 127.0.0.1:5000/predict

Test with some newspapers

curl --data "url=http://www.zeit.de/politik/ausland/2015-04/iran-verhandlungen-einigung" 127.0.0.1:5000/predict
curl --data "url=http://www.sueddeutsche.de/politik/atomverhandlungen-in-lausanne-das-sind-die-eckpunkte-der-einigung-mit-iran-1.2421243" 127.0.0.1:5000/predict
curl --data "url=http://www.faz.net/aktuell/politik/ausland/europa/einigung-in-lausanne-durchbruch-bei-verhandlungen-ueber-iranisches-atomprogramm-13520160.html" 127.0.0.1:5000/predict
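The same requests can be sent from Python. The sketch below builds a form-encoded POST like the one `curl --data` sends, using only the standard library. The helper names `build_predict_request` and `predict` are made up for this example, and because the response format of /predict is not described here, the body is returned as a raw string.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Address taken from the curl examples; adjust if the server runs elsewhere.
API_URL = "http://127.0.0.1:5000/predict"


def build_predict_request(text=None, url=None):
    """Build a form-encoded POST carrying either a text or a url field."""
    if (text is None) == (url is None):
        raise ValueError("pass exactly one of text or url")
    field = {"text": text} if text is not None else {"url": url}
    body = urlencode(field).encode("utf-8")
    return Request(API_URL, data=body, method="POST")


def predict(text=None, url=None):
    """Send the request and return the raw response body as a string."""
    with urlopen(build_predict_request(text=text, url=url)) as response:
        return response.read().decode("utf-8")


# Building the request does not need a running server; calling predict() does.
request = build_predict_request(text="reiche banken")
print(request.get_method(), request.full_url, request.data)
```

With the backend running, `predict(text="herdprämie")` or `predict(url="http://www.zeit.de/politik/ausland/2015-04/iran-verhandlungen-einigung")` would correspond to the curl calls above.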

Retrain classifier

If there is no classifier.pickle file in the model folder, the classifier will be retrained. The classifier can also be retrained explicitly, e.g. in a Python shell:

from classifier import Classifier
clf = Classifier(folder='model',train=True)
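The "retrain if no pickle exists" behaviour can be pictured as a load-or-train pattern. The following is a minimal sketch under stated assumptions: `load_or_train`, `train_fn`, and `force_train` are hypothetical names, and a plain dict stands in for the trained model. It does not reproduce the internals of the `Classifier` class.

```python
import os
import pickle
import tempfile


def load_or_train(folder, train_fn, force_train=False):
    """Load classifier.pickle from folder, or train and save a new model.

    train_fn is a hypothetical zero-argument callable returning a trained model.
    """
    path = os.path.join(folder, "classifier.pickle")
    if not force_train and os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    model = train_fn()
    os.makedirs(folder, exist_ok=True)
    with open(path, "wb") as handle:
        pickle.dump(model, handle)
    return model


# Demonstration in a temporary folder with dicts standing in for models.
with tempfile.TemporaryDirectory() as folder:
    first = load_or_train(folder, lambda: {"version": 1})   # no pickle yet: trains
    second = load_or_train(folder, lambda: {"version": 2})  # pickle found: loads
    third = load_or_train(folder, lambda: {"version": 3}, force_train=True)
print(first, second, third)
```

The second call returns the cached model rather than training again, which mirrors why deleting classifier.pickle, or passing train=True, is needed to pick up new training data.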

Check classifier by cross-validation

In python shell:

from example import test_with_nested_CV
test_with_nested_CV(folder='model')
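The function name suggests nested cross-validation. As a general illustration of that technique, and not a description of what example.py does, the sketch below uses an inner loop to choose a hyperparameter on the training portion only and an outer loop to score that choice on held-out data. All names, the toy data, and the threshold "model" are assumptions made for the example.

```python
def k_fold_splits(indices, k):
    """Yield (train, test) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv(n_samples, candidates, score, outer_k=3, inner_k=2):
    """Return the mean outer-fold score of the candidate chosen per outer fold.

    score(candidate, train_idx, test_idx) is a hypothetical callback that
    fits on train_idx and returns a quality measure on test_idx.
    """
    outer_scores = []
    for train, test in k_fold_splits(list(range(n_samples)), outer_k):

        def inner_mean(candidate):
            # Inner loop: evaluate the candidate using training data only.
            scores = [score(candidate, tr, va) for tr, va in k_fold_splits(train, inner_k)]
            return sum(scores) / len(scores)

        best = max(candidates, key=inner_mean)
        # Outer loop: score the selected candidate on data it never saw.
        outer_scores.append(score(best, train, test))
    return sum(outer_scores) / len(outer_scores)


# Toy data: predict label 1 when the value exceeds a threshold.
values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]


def threshold_accuracy(threshold, train_idx, test_idx):
    # The toy model has nothing to fit, so train_idx is unused here.
    hits = sum((values[i] > threshold) == bool(labels[i]) for i in test_idx)
    return hits / len(test_idx)


result = nested_cv(len(values), [0.0, 0.5, 1.0], threshold_accuracy)
print(result)
```

Keeping hyperparameter selection inside the inner loop prevents the outer test folds from influencing model choice, so the reported score is a less optimistic estimate.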

Visualization & Exploration

$ ipython notebook

Then check the notebooks, e.g. visualization.ipynb

Deployment

First build the web project

cd web && dotenv bundle exec middleman build

Then build the container

docker build -t kirel/political-affiliation-prediction .
docker push kirel/political-affiliation-prediction

To test the container locally:

docker run -it --rm -p 5000:5000 kirel/political-affiliation-prediction

To deploy:

pip install -r requirements.dev.txt
ansible-playbook ansible/deploy.yml -i ansible/inventory

When the registry fails

docker save kirel/political-affiliation-prediction | bzip2 | pv | ssh [email protected] 'bunzip2 | sudo docker load'

License

Copyright (c) 2015 Daniel Kirsch, Felix Bießmann, released under the MIT license, see LICENSE
