forked from dbpedia-spotlight/dbpedia-spotlight
-
Notifications
You must be signed in to change notification settings - Fork 0
Lucene Architecture
Sandro edited this page Jul 26, 2013
·
1 revision
The DBpedia Spotlight Architecture is composed by the following modules:
- Web application, a demonstration client (HTML/Javascript interface) that allows users to enter/paste text into a Web browser and visualize the resulting annotated text.
- Web Service, a RESTful/SOAP? Web API that exposes the functionality of annotating and/or disambiguating entities in text.
- Annotation Java/Scala API, exposing the underlying logic that performs the annotation/disambiguation.
- Indexing Java/Scala API, executing the data processing necessary to enable the annotation/disambiguation algorithms used.
- Evaluation module, where we test disambiguators, log results and use those to train our system to perform better.
External dependencies:
- DBpedia Extraction Framework, (only for the index module) extracting the necessary data from the Wikipedia dumps.
- Lucene 3.6, providing the low level indexing framework used by DBpedia Spotlight.
- LingPipe 4.0.0, providing the string matching implementation used for the Spotter module.
- Java 1.6+
- Scala 2.9+
- Spotlight JAR
- Spotlight Library JARs
- Lucene disambiguation index
- Spotter dictionary
- large RAM to set the heap size big enough for the Spotter (approx. 8G)
- Maven 3 for the automatic installation of dependencies.
- Indexing Java/Scala API, executing the data processing necessary to enable the annotation/disambiguation algorithms used.
Project
- Introduction
- Glossary
- User's manual
- Web application
- Installation
- Internationalization
- Licenses
- Researcher
- How to cite
- Support and Feedback
- Troubleshooting
- Team
- Acknowledgements
Statistical backend
Lucene backend
- Introduction
- Downloads
- Architecture
- Internationalization
- Web service parameters / API
- Splitting occurrences into topics
Developers