comment-analysis

An implementation of a comment classifier using Machine Learning, presented in the AI discipline of a Computer Science course.
The classifier uses the Naive Bayes algorithm to assign each comment to one of two classes: Positive (1) or Negative (0). For more details, see the blog post (in Portuguese).
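
To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier. It is written in Python for illustration only and is not the project's Clojure code. The function names `train` and `predict`, the whitespace tokenization, the add-one smoothing, and the toy comments are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    doc_counts = Counter()              # label -> number of comments
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        # Log prior plus the sum of smoothed log likelihoods.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only.
toy = [
    ("great product loved it", 1),
    ("really great service", 1),
    ("terrible product hated it", 0),
    ("really terrible service", 0),
]
model = train(toy)
```

Calling `predict(model, "great service")` on this toy model picks the class whose training comments make those words most likely. A real run would train on the files in the 'comments' directory instead of the toy list.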

Usage

First, you must have Clojure and Leiningen installed.

The comments used for training are stored in a directory named 'comments' in the root of the project, split into two files:

CommentsNegative
CommentsPositive

Each comment has the following format:

"Comments are;0"

The ';' is the delimiter between the comment and its class:
0 - Negative class
1 - Positive class
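
As an illustration of the format described above, the following Python sketch splits one line into its comment text and class. It is not the project's Clojure code. The helper name `parse_line` is hypothetical, and the sketch assumes one comment per line, that the surrounding quotes in the example are not part of the file, and that the class is whatever follows the last ';' (so a comment may itself contain ';').

```python
def parse_line(line):
    """Split 'comment text;label' into (text, label) on the last ';'."""
    text, sep, label = line.strip().rpartition(";")
    # Reject lines without a delimiter or with a class other than 0 or 1.
    if not sep or label not in ("0", "1"):
        raise ValueError(f"malformed line: {line!r}")
    return text, int(label)
```

Splitting on the last delimiter rather than the first keeps comments that contain a ';' intact.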

In the tests, performed with 1317 positive and 1317 negative comments split ~70% for training and ~30% for testing (792 test instances), I obtained the following results.

{
    :average-cost 0.0,
    :incorrect 282.0,
    :roc-area {
        :0 0.7298043566982961,
        :1 0.7350174727068667
    },
    :false-positive-rate {
        :0 0.4696969696969697,
        :1 0.24242424242424243
    },
    :unclassified 0.0,
    :sf-entropy-gain -31651.27976575737,
    :kb-mean-information 0.2968905922187232,
    :kb-information 235.1373490372288,
    :percentage-incorrect 35.60606060606061,
    :root-relative-squared-error 105.09300991727571,
    :precision {
        :0 0.6172839506172839,
        :1 0.6862745098039216
    },
    :error-rate 0.3560606060606061,
    :percentage-unclassified 0.0,
    :recall {
        :0 0.7575757575757576,
        :1 0.5303030303030303
    },
    :correlation-coefficient {
        :nan Can't compute correlation coefficient: class is nominal!
    },
    :mean-absolute-error 0.35087054809357776,
    
    :summary 
    Correctly Classified Instances         510               64.3939 %
    Incorrectly Classified Instances       282               35.6061 %
    Kappa statistic                          0.2879
    Mean absolute error                      0.3509
    Root mean squared error                  0.5255
    Relative absolute error                 70.1741 %
    Root relative squared error            105.093  %
    Coverage of cases (0.95 level)          84.0909 %
    Mean rel. region size (0.95 level)      68.8763 %
    Total Number of Instances              792,

    :kb-relative-information 23513.73490372288,
    :false-negative-rate {
        :0 0.24242424242424243,
        :1 0.4696969696969697
    },
    :relative-absolute-error 70.17410961871555,
    :root-mean-squared-error 0.5254650495863785,
    :sf-mean-entropy-gain -39.963737077976475,
    :evaluation-object #object[weka.classifiers.Evaluation 0x687a762c weka.classifiers.Evaluation@687a762c],
    
    :confusion-matrix === Confusion Matrix ===
      a   b   <-- classified as
    300  96 |   a = 0
    186 210 |   b = 1,
    
    :kappa 0.28787878787878785,
    :f-measure {
        :0 0.6802721088435375,
        :1 0.5982905982905983
    },
    :percentage-correct 64.39393939393939,
    :correct 510.0
}
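
Most of the figures above follow directly from the confusion matrix. The Python sketch below (an illustration, not the project's code or Weka's implementation) recomputes accuracy, per-class precision, recall and F-measure, and the kappa statistic from the four counts. The function name `metrics` and the dictionary keys are chosen for this example.

```python
# Confusion matrix from the results above: rows are actual classes,
# columns are predicted classes, both in the order (0, 1).
matrix = [[300, 96],
          [186, 210]]


def metrics(matrix):
    """Derive accuracy, per-class precision/recall/F-measure and kappa."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(row) for row in matrix]                        # actual
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]   # predicted
    accuracy = correct / total
    precision = [matrix[i][i] / col_sums[i] for i in range(n)]
    recall = [matrix[i][i] / row_sums[i] for i in range(n)]
    f_measure = [2 * p * r / (p + r) for p, r in zip(precision, recall)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "kappa": kappa,
    }


result = metrics(matrix)
```

With these counts the sketch reproduces the reported values, for example an accuracy of about 0.6439 and a kappa of about 0.2879. It does not guard against empty rows or columns, which would cause a division by zero.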

Considerations

I observed that some comments labeled "Negative" were actually positive, and vice versa. These mislabeled examples introduce inconsistencies into the training data, which may explain part of the poor performance.
