Wikipedia Mining

README in French

Presentation

This project aims to analyse a French Wikipedia dump using two different approaches:

text-mining: building a vector representation of the corpus, using well-known vector space model (VSM) and word embedding methods.
graph-mining: building an atlas based on the cross-references between articles.
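
To make the two approaches more concrete, here is a minimal Java 8 sketch. It is an illustration only and not the project's actual code: the class name WikiMiningSketch and the methods tfIdf and linkGraph are hypothetical, TF-IDF stands in for "a well-known VSM" as one possible weighting scheme, and the cross references are assumed to be wikitext links of the form [[Target]] or [[Target|label]].

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; none of these names come from the project.
final class WikiMiningSketch {

    // Text-mining: one sparse TF-IDF vector per tokenized document.
    // tf = term count / document length, idf = ln(N / document frequency).
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: number of documents that contain each term.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Double> vec = new HashMap<>();
            for (String term : doc) {
                vec.merge(term, 1.0, Double::sum); // raw term counts
            }
            for (Map.Entry<String, Double> e : vec.entrySet()) {
                double tf = e.getValue() / doc.size();
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                e.setValue(tf * idf);
            }
            vectors.add(vec);
        }
        return vectors;
    }

    // Matches the target of a wikitext link, stopping at "|", "#" or "]".
    private static final Pattern LINK = Pattern.compile("\\[\\[([^\\]|#]+)");

    // Graph-mining: adjacency sets (page title -> titles it links to).
    static Map<String, Set<String>> linkGraph(Map<String, String> pages) {
        Map<String, Set<String>> graph = new TreeMap<>();
        for (Map.Entry<String, String> page : pages.entrySet()) {
            Set<String> targets = new TreeSet<>();
            Matcher m = LINK.matcher(page.getValue());
            while (m.find()) {
                targets.add(m.group(1).trim());
            }
            graph.put(page.getKey(), targets);
        }
        return graph;
    }

    public static void main(String[] args) {
        // Toy corpus, already tokenized and lower-cased.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("lyon", "ville", "france"),
                Arrays.asList("paris", "ville", "france"));
        System.out.println(tfIdf(docs));

        // Toy pages with wikitext-style cross references.
        Map<String, String> pages = new HashMap<>();
        pages.put("Lyon", "Ville de [[France]] sur le [[Rhone|fleuve]].");
        pages.put("France", "Voir [[Lyon]].");
        System.out.println(linkGraph(pages));
    }
}
```

With this weighting, a term that appears in every document gets a weight of zero, so only the distinguishing terms contribute to each vector. A real pipeline would also need dump parsing, tokenization and redirect handling, which this sketch leaves out.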

Installation

Prerequisites

Before installing the project, you'll need:

Maven (version > 3.3.*) - Web site
Java (in this case, Java 8) - Web site

You can check the installed versions of both tools with the following Linux commands:

mvn --version
java -version

Building

Build the Maven project:

mvn clean install

Authors

ArcToScience Team, M2 Data Mining, University Lyon 2, France:

Antoine Gourru - GitHub - Web site
Erwan Giry-Fouquet - GitHub - Web site

License

This project is licensed under the MIT License; see the LICENCE.md file for details.
