This project tags the words of sentences in Twitter statuses with their parts of speech. It uses the Penn Treebank tagset (I have also tested it with a simpler 14-tag system), and it trains and predicts with a perceptron algorithm.
An explanatory paper with citations is in writeup. The main perceptron algorithm is in perceptron. The data, along with the files for reading it, is in data.
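For readers who want a feel for the approach before opening the code, here is a minimal sketch of how a perceptron tagger can train and predict. It is not the code in perceptron: the feature templates, the names `features`, `Perceptron`, `train`, and `tag`, the `<START>` marker, and the greedy left-to-right decoding are all assumptions made for illustration, and the plain (unaveraged) update rule may differ from what the project does.

```python
from collections import defaultdict


def features(words, i, prev_tag):
    """Hypothetical feature templates for the word at position i (assumed, not the project's)."""
    word = words[i]
    return [
        "bias",
        "word=" + word.lower(),
        "suffix=" + word[-2:].lower(),
        "prev_tag=" + prev_tag,
        "is_mention=" + str(word.startswith("@")),
        "is_hashtag=" + str(word.startswith("#")),
    ]


class Perceptron:
    """Plain multiclass perceptron: one weight per (feature, tag) pair."""

    def __init__(self, tags):
        self.tags = sorted(tags)
        # weights[feature][tag] -> float; absent entries count as 0.0
        self.weights = defaultdict(lambda: defaultdict(float))

    def predict(self, feats):
        scores = {t: 0.0 for t in self.tags}
        for f in feats:
            if f in self.weights:  # membership test avoids creating empty entries
                for t, w in self.weights[f].items():
                    scores[t] += w
        # Ties go to the alphabetically first tag, so results are deterministic.
        return max(self.tags, key=lambda t: scores[t])

    def update(self, feats, gold, guess):
        """On a mistake, reward the gold tag and penalize the guessed tag."""
        if gold == guess:
            return
        for f in feats:
            self.weights[f][gold] += 1.0
            self.weights[f][guess] -= 1.0


def train(model, sentences, epochs=5):
    """sentences is a list of (words, gold_tags) pairs; gold tags must be in model.tags."""
    for _ in range(epochs):
        for words, gold_tags in sentences:
            prev = "<START>"
            for i, gold in enumerate(gold_tags):
                feats = features(words, i, prev)
                model.update(feats, gold, model.predict(feats))
                prev = gold  # condition on the gold history during training


def tag(model, words):
    """Greedy decoding: each prediction feeds the next word's prev_tag feature."""
    prev, out = "<START>", []
    for i in range(len(words)):
        prev = model.predict(features(words, i, prev))
        out.append(prev)
    return out
```

A call such as `train(model, [(["@bob", "loves", "pizza"], ["NNP", "VBZ", "NN"])])` followed by `tag(model, ["@bob", "loves", "pizza"])` returns one tag per word. The toy sentence and its labels are made up here and are not drawn from the data directory.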