classgraphic

Interactive classification diagnostic plots for scikit-learn.

We classify things for the purpose of doing something to them. Any classification which does not assist manipulation is worse than useless. - Randolph S. Bourne, "Education and Living", The Century Co (April 1917)

Major features:

Plotly-based tables for:

class_imbalance_table
classification_table
confusion_matrix_table
describe (dataframe stats)
prediction_table
table

And the following charts:

class_imbalance
class_error
det
feature_importance
missing
precision_recall
roc
prediction_histogram
threshold

For clustering:

Delaunay triangulations
Voronoi tessellations

Try it

Try it on Binder to see all the details and interactivity. The quickstart below shows static images, but if you run these commands in a Jupyter notebook, IPython, or an IDE, you will be able to interact with the plots.

Quickstart

from classgraphic.essential import *

# loading the data
df = px.data.iris()

# let's see what kind of data we have
describe(df, transpose=True).show()

# any missing?
missing(df)

# features
X = df.drop(columns=["species", "species_id"])

# target
y = df["species"]

# let's check the classes we will be training on and predicting
class_imbalance_table(y, condition="all")

# train / test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=random_state
)

# we want the total count for each class; bars are stacked by default, which gives us that
# alternatively, pass barmode="overlay" to class_imbalance
class_imbalance(y_train, y_test, condition="train,test")

# model
model = LogisticRegression(max_iter=max_iter, random_state=random_state)
model.fit(X_train, y_train)

# predictions
y_score = model.predict_proba(X_test)
y_pred = model.predict(X_test)

confusion_matrix_table(model, y_test, y_pred).show()
classification_table(model, y_test, y_pred)

feature_importance(model, y, transpose=True)

This concludes the quickstart. There are many more visualizations and tables to explore.

See the notebooks and docs folders on GitHub and the documentation website for more information.
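As a rough illustration of what the class imbalance step in the quickstart summarizes, the sketch below counts samples per class and their share of the total using only the standard library. It is not classgraphic's implementation, and the assumption that the table reports counts and proportions is ours; `class_counts` and the sample labels are hypothetical.

```python
from collections import Counter


def class_counts(labels):
    """Return {class: (count, fraction of total)}, sorted by class name.

    Hypothetical helper for illustration only; not part of classgraphic.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: (n, n / total) for cls, n in sorted(counts.items())}


# Made-up labels (not the iris data), deliberately imbalanced
y = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

for cls, (n, frac) in class_counts(y).items():
    print(f"{cls:<12} {n:>3} {frac:6.1%}")
```

A class with a much smaller share than the others is the kind of imbalance worth checking before and after the train / test split.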

Requirements

Python 3.8 or later
numpy
pandas
plotly>=5.0
scikit-learn
nbformat
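To see which of these packages are already present in the active environment, something like the following sketch may help. It uses only the standard library (`importlib.metadata`, available from Python 3.8) and simply reports versions; it does not enforce the `plotly>=5.0` constraint. The `installed_versions` helper is hypothetical and not part of classgraphic.

```python
import sys
from importlib import metadata

# Distribution names taken from the requirements list above
REQUIRED = ["numpy", "pandas", "plotly", "scikit-learn", "nbformat"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python", sys.version.split()[0], "(3.8 or later required)")
for name, version in installed_versions(REQUIRED).items():
    print(f"{name:<14} {version or 'not installed'}")
```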

Install

If you use conda, create an environment named classgraphic, then activate it:

On Linux: source activate classgraphic
On Windows: conda activate classgraphic

If you use another environment manager, create and activate your environment using its normal steps.

Then execute:

python setup.py install

or for installing in development mode:

python -m pip install -e . --no-build-isolation

or alternatively

python setup.py develop

To install from github instead:

pip install git+https://github.com/dionresearch/classgraphic
