BayesianBookworm: Unraveling Authorship with Bayesian Analysis

Overview

BayesianBookworm is a text analysis tool that applies Bayes' Theorem to estimate the probable author of a literary text. The project currently covers the works of Jane Austen and Charles Dickens.

📚 Current Functionality

Data Foundation

The program analyzes texts from the following novels, located in the Books/ directory:

Jane Austen: Emma (em), Pride and Prejudice (pp), Persuasion (pe), Sense and Sensibility (ss)
Charles Dickens: Great Expectations (ge), Hard Times (ht), A Tale of Two Cities (tc), Oliver Twist (ot)

📈 Word Frequency Analytics

A dictionary maps each word to its frequency in each author's novels (Austen first, Dickens second) and forms the basis for authorship prediction:

word_frequencies = {
    "officer": [220, 322]  # Austen: 220, Dickens: 322
}
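As a rough illustration of how a dictionary in this shape could be built, the sketch below counts words per author from raw text. The function names (`count_words`, `build_word_frequencies`) and the tokenization rule are assumptions made for this example; the repository's own counting code is not shown here and may differ.

```python
import re
from collections import Counter


def count_words(text):
    """Lowercase the text and count tokens made of letters and apostrophes."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def build_word_frequencies(austen_texts, dickens_texts):
    """Return {word: [austen_count, dickens_count]} over all supplied texts.

    Hypothetical helper: each argument is a list of strings, one per novel.
    """
    austen = Counter()
    for text in austen_texts:
        austen.update(count_words(text))

    dickens = Counter()
    for text in dickens_texts:
        dickens.update(count_words(text))

    # A Counter returns 0 for a word it has never seen, so a word used by
    # only one author still gets a two-element entry.
    vocabulary = sorted(set(austen) | set(dickens))
    return {word: [austen[word], dickens[word]] for word in vocabulary}
```

In practice the input strings would be read from the files in the Books/ directory; short in-memory strings work the same way for experimentation.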

🔍 Identifying the Author

The guess.py script employs this frequency data within a Bayesian framework to estimate the author of a given text passage.
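One common way to apply Bayes' Theorem to word counts like these is a naive Bayes classifier with add-one smoothing. The sketch below is a minimal version under stated assumptions, not a description of what guess.py actually does: the function name `guess_author`, the equal priors, the smoothing, and the choice to skip words missing from the dictionary are all illustrative.

```python
import math
import re


def guess_author(passage, word_frequencies, priors=(0.5, 0.5)):
    """Return (author_name, [P(Austen | passage), P(Dickens | passage)]).

    word_frequencies maps word -> [austen_count, dickens_count].
    Hypothetical sketch: treats words as independent given the author.
    """
    authors = ("Austen", "Dickens")
    vocab_size = len(word_frequencies)
    # Total number of counted words per author.
    totals = [sum(counts[i] for counts in word_frequencies.values())
              for i in range(2)]

    # Work in log space so long passages do not underflow to zero.
    log_scores = [math.log(p) for p in priors]
    for word in re.findall(r"[a-z']+", passage.lower()):
        counts = word_frequencies.get(word)
        if counts is None:
            continue  # assumption: words outside the dictionary are ignored
        for i in range(2):
            # Add-one smoothing keeps a zero count from vetoing an author.
            likelihood = (counts[i] + 1) / (totals[i] + vocab_size)
            log_scores[i] += math.log(likelihood)

    # Normalize the log scores into posterior probabilities.
    peak = max(log_scores)
    weights = [math.exp(score - peak) for score in log_scores]
    posteriors = [w / sum(weights) for w in weights]
    best = max(range(2), key=lambda i: posteriors[i])
    return authors[best], posteriors


# "officer" uses the counts shown above; "carriage" is made-up
# illustrative data, not taken from the repository.
example_frequencies = {"officer": [220, 322], "carriage": [300, 50]}
author, posteriors = guess_author("The officer", example_frequencies)
```

Summing log-likelihoods instead of multiplying raw probabilities is a deliberate choice here: a passage of a few hundred words would otherwise produce products too small for floating-point arithmetic.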

🔮 Planned Enhancements

Incorporating More Authors: Broadening the scope to include various authors for a more comprehensive literary analysis.
Enhanced Algorithm Efficiency: Optimizing the processing capabilities for handling larger datasets.
User Interface Development: Crafting an intuitive interface for effortless user interaction and result visualization.

BayesianBookworm applies statistical methods to classical literature to surface patterns in authorial style.


jcgonzalez25/BayesianBookworm
