KSchouten/Heracles

Heracles

The Heracles framework for developing and evaluating text mining algorithms

Heracles is a Java software framework intended to help develop and evaluate text mining algorithms. It is the software that I use to conduct my experiments and it might be of use to other developers.

Key features:

  • Layered structure
  • Very generic data model that should fit any text mining task
  • Wrappers for Stanford CoreNLP
  • Link to Weka toolkit for machine learning algorithms
  • Proper evaluation methods, including cross-validation, testing algorithms side-by-side, and t-tests for statistical significance
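To give an idea of what the evaluation features in the list above involve, here is a minimal sketch of k-fold index splitting and a paired t statistic over per-fold scores. It does not use the Heracles classes: the class name `EvaluationSketch` and the methods `foldIndices` and `pairedT` are hypothetical, chosen only for this illustration, and the framework's own implementation may work differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustration only: hypothetical names, not part of the Heracles API. */
final class EvaluationSketch {

    /** Shuffles the indices 0..n-1 with a fixed seed and deals them round-robin into k folds. */
    static List<List<Integer>> foldIndices(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /**
     * Paired t statistic for two algorithms scored on the same folds:
     * mean of the differences divided by their standard error.
     * If all differences are identical the standard error is zero,
     * so the result is infinite or NaN.
     */
    static double pairedT(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two equally long arrays with at least 2 scores");
        }
        int n = a.length;
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += a[i] - b[i];
        }
        mean /= n;

        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double deviation = (a[i] - b[i]) - mean;
            sumSquares += deviation * deviation;
        }
        double sampleSd = Math.sqrt(sumSquares / (n - 1));
        return mean / (sampleSd / Math.sqrt(n));
    }
}
```

Each fold in turn serves as the test set while the remaining folds form the training set. The resulting t statistic has n - 1 degrees of freedom, where n is the number of folds, and still has to be compared against a t distribution to obtain a p-value.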

Installation

The Heracles framework is developed as an Eclipse Java project, so after cloning, you can import it as an existing project into your Eclipse workspace. Be aware that this project uses Java 8, so make sure that it is installed and that your Eclipse version is recent enough to support it.
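If you are unsure which Java version a given runtime provides, a small stand-alone check along the following lines can help. This is only a sketch and not part of the repository: the class `JavaVersionCheck` and the method `majorVersion` are hypothetical names, and the parsing assumes the usual `java.version` formats (`1.8.0_292` for Java 8 and earlier, `11.0.2` or `17` for later releases).

```java
/** Illustration only: hypothetical helper, not part of the Heracles repository. */
final class JavaVersionCheck {

    /** Extracts the major version from a java.version string such as "1.8.0_292" or "11.0.2". */
    static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Up to Java 8 the version string starts with "1.", e.g. "1.8.0_292".
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        int major = majorVersion(version);
        System.out.println("Running on Java " + major + " (" + version + ")");
        if (major < 8) {
            System.out.println("This runtime is older than Java 8.");
        }
    }
}
```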

You can use the Git functionality from within Eclipse if desired, as Eclipse will detect that this is a Git-based project. Maven is used to automatically link the project to all the necessary external libraries, and Eclipse will start downloading them as soon as you add the project to the workspace. This might take some time, depending on your connection.

Included Algorithms

The framework currently includes the code and resources for "Ontology-Driven Sentiment Analysis of Product and Service Aspects" @ ESWC2018. You can find the ontology here inside the repository and the presentation slides can be downloaded here. The data files that these algorithms work with can be obtained from the SemEval-2015 and SemEval-2016 sites.

You can contact me by sending an email to the address mentioned in the ESWC paper.
