Merge pull request #50 from rasbt/refactor_base
v 0.4.1
rasbt committed May 2, 2016
2 parents 4e6ac4c + ac82911 commit cc54c52
Showing 106 changed files with 3,283 additions and 4,221 deletions.
2 changes: 1 addition & 1 deletion docs/mkdocs.yml
@@ -38,7 +38,7 @@ pages:
- user_guide/classifier/Adaline.md
- user_guide/classifier/LogisticRegression.md
- user_guide/classifier/SoftmaxRegression.md
- user_guide/classifier/NeuralNetMLP.md
- user_guide/classifier/MultiLayerPerceptron.md
- tf_classifier:
- user_guide/tf_classifier/TfMultiLayerPerceptron.md
- user_guide/tf_classifier/TfSoftmaxRegression.md
11 changes: 8 additions & 3 deletions docs/sources/CHANGELOG.md
@@ -2,7 +2,7 @@

---

### Version 0.4.1dev
### Version 0.4.1 (2016-05-01)

##### New Features

@@ -12,8 +12,13 @@

##### Changes

- Adding optional `dropout` to the [`tf_classifier.TfMultiLayerPerceptron`](./user_guide/tf_classifier/TfMultiLayerPerceptron.md) classifier for regularization
- Adding an optional `decay` parameter to the [`tf_classifier.TfMultiLayerPerceptron`](./user_guide/tf_classifier/TfMultiLayerPerceptron.md) classifier for adaptive learning via an exponential decay of the learning rate eta
- Due to refactoring of the estimator classes, the `init_weights` parameter of the `fit` methods was globally renamed to `init_params`
- Overall performance improvements of estimators due to code clean-up and refactoring
- Added several additional checks for correct array types and more meaningful exception messages
- Added optional `dropout` to the [`tf_classifier.TfMultiLayerPerceptron`](./user_guide/tf_classifier/TfMultiLayerPerceptron.md) classifier for regularization
- Added an optional `decay` parameter to the [`tf_classifier.TfMultiLayerPerceptron`](./user_guide/tf_classifier/TfMultiLayerPerceptron.md) classifier for adaptive learning via an exponential decay of the learning rate eta
- Replaced the old `NeuralNetMLP` with the more streamlined `MultiLayerPerceptron` ([`classifier.MultiLayerPerceptron`](./user_guide/classifier/MultiLayerPerceptron.md)), which now also uses softmax in the output layer and a categorical cross-entropy loss.
- Unified `init_params` parameter for fit functions to continue training where the algorithm left off (if supported)
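
The `init_params` behavior described in the last entry can be illustrated with a small stand-in. The sketch below is not mlxtend code: `ToyLinearRegressor` and its attributes `w_`, `b_`, and `cost_` are hypothetical names chosen for illustration, and only the `init_params` flag mirrors the parameter named in this changelog. It assumes that "continue training" means keeping the previously learned parameters when `init_params=False`.

```python
import numpy as np


class ToyLinearRegressor(object):
    """Hypothetical stand-in estimator; not part of mlxtend."""

    def __init__(self, eta=0.1, epochs=50, random_seed=0):
        self.eta = eta
        self.epochs = epochs
        self.random_seed = random_seed

    def fit(self, X, y, init_params=True):
        if init_params:
            # Start from scratch: small random weights, zero bias.
            rng = np.random.RandomState(self.random_seed)
            self.w_ = rng.normal(loc=0.0, scale=0.01, size=X.shape[1])
            self.b_ = 0.0
            self.cost_ = []
        # With init_params=False, w_, b_ and cost_ from the previous
        # call are kept and training simply continues.
        for _ in range(self.epochs):
            errors = y - (X.dot(self.w_) + self.b_)
            self.w_ += self.eta * X.T.dot(errors) / X.shape[0]
            self.b_ += self.eta * errors.mean()
            self.cost_.append((errors ** 2).mean() / 2.0)
        return self


X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = 3.0 * X[:, 0] + 1.0

model = ToyLinearRegressor(eta=0.1, epochs=50)
model.fit(X, y)                     # fresh parameters
cost_first = model.cost_[-1]
model.fit(X, y, init_params=False)  # resume from the learned parameters
cost_resumed = model.cost_[-1]
```

Calling `fit` a second time with `init_params=False` extends the same cost history instead of resetting it, which is the practical effect a user would look for when resuming training.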

### Version 0.4.0 (2016-04-09)

2 changes: 1 addition & 1 deletion docs/sources/USER_GUIDE_INDEX.md
@@ -6,7 +6,7 @@
- [`Perceptron`](user_guide/classifier/Perceptron.md)
- [`Adaline`](user_guide/classifier/Adaline.md)
- [`LogisticRegression`](user_guide/classifier/LogisticRegression.md)
- [`NeuralNetMLP`](user_guide/classifier/NeuralNetMLP.md)
- [`MultiLayerPerceptron`](user_guide/classifier/MultiLayerPerceptron.md)
- [`SoftmaxRegression`](user_guide/classifier/SoftmaxRegression.md)

## `tf_classifier` (TensorFlow Classifier)
144 changes: 83 additions & 61 deletions docs/sources/user_guide/classifier/Adaline.ipynb

Large diffs are not rendered by default.

Binary file modified docs/sources/user_guide/classifier/Adaline_files/Adaline_20_3.png
Binary file modified docs/sources/user_guide/classifier/Adaline_files/Adaline_24_1.png
161 changes: 93 additions & 68 deletions docs/sources/user_guide/classifier/LogisticRegression.ipynb

Large diffs are not rendered by default.

856 changes: 856 additions & 0 deletions docs/sources/user_guide/classifier/MultiLayerPerceptron.ipynb

Large diffs are not rendered by default.

987 changes: 0 additions & 987 deletions docs/sources/user_guide/classifier/NeuralNetMLP.ipynb

This file was deleted.

87 changes: 53 additions & 34 deletions docs/sources/user_guide/classifier/Perceptron.ipynb

Large diffs are not rendered by default.

136 changes: 83 additions & 53 deletions docs/sources/user_guide/classifier/SoftmaxRegression.ipynb

Large diffs are not rendered by default.

222 changes: 91 additions & 131 deletions docs/sources/user_guide/regressor/LinearRegression.ipynb

Large diffs are not rendered by default.

288 changes: 50 additions & 238 deletions docs/sources/user_guide/tf_classifier/TfMultiLayerPerceptron.ipynb

Large diffs are not rendered by default.

215 changes: 64 additions & 151 deletions docs/sources/user_guide/tf_classifier/TfSoftmaxRegression.ipynb

Large diffs are not rendered by default.

68 changes: 35 additions & 33 deletions docs/sources/user_guide/tf_regressor/TfLinearRegression.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion mlxtend/__init__.py
@@ -4,4 +4,4 @@
#
# License: BSD 3 clause

__version__ = '0.4.1dev'
__version__ = '0.4.1'
19 changes: 19 additions & 0 deletions mlxtend/_base/__init__.py
@@ -0,0 +1,19 @@
# Sebastian Raschka 2014-2016
# mlxtend Machine Learning Library Extensions
# Author: Sebastian Raschka <sebastianraschka.com>
#
# License: BSD 3 clause

from ._base_estimator import _BaseEstimator
from ._base_supervised_estimator import _BaseSupervisedEstimator
from ._base_unsupervised_estimator import _BaseUnsupervisedEstimator
from ._base_classifier import _BaseClassifier
from ._base_multiclass import _BaseMultiClass
from ._base_multilayer import _BaseMultiLayer
from ._base_regressor import _BaseRegressor
from ._base_cluster import _BaseCluster

__all__ = ["_BaseEstimator",
           "_BaseSupervisedEstimator", "_BaseUnsupervisedEstimator",
           "_BaseClassifier", "_BaseMultiClass", "_BaseMultiLayer",
           "_BaseRegressor", "_BaseCluster"]
60 changes: 60 additions & 0 deletions mlxtend/_base/_base_classifier.py
@@ -0,0 +1,60 @@
# Sebastian Raschka 2014-2016
# mlxtend Machine Learning Library Extensions
#
# Base Classifier (Classifier Parent Class)
# Author: Sebastian Raschka <sebastianraschka.com>
#
# License: BSD 3 clause

import numpy as np
from ._base_supervised_estimator import _BaseSupervisedEstimator


class _BaseClassifier(_BaseSupervisedEstimator):

    """Parent Class Classifier
    A base class that is implemented by classifiers
    """
    def __init__(self, print_progress=0, random_seed=0):
        super(_BaseClassifier, self).__init__(
            print_progress=print_progress,
            random_seed=random_seed)
        self._binary_classifier = False

    def _check_target_array(self, y, allowed=None):
        if not np.issubdtype(y[0], int):
            raise AttributeError('y must be an integer array.\nFound %s'
                                 % y.dtype)
        found_labels = np.unique(y)
        if (found_labels < 0).any():
            raise AttributeError('y array must not contain negative labels.'
                                 '\nFound %s' % found_labels)
        if allowed is not None:
            found_labels = tuple(found_labels)
            if found_labels not in allowed:
                raise AttributeError('Labels not in %s.\nFound %s'
                                     % (allowed, found_labels))

    def score(self, X, y):
        """ Compute the prediction accuracy
        Parameters
        ----------
        X : {array-like, sparse matrix}, shape = [n_samples, n_features]
            Training vectors, where n_samples is the number of samples and
            n_features is the number of features.
        y : array-like, shape = [n_samples]
            Target values (true class labels).
        Returns
        ---------
        acc : float
            The prediction accuracy as a float
            between 0.0 and 1.0 (perfect score).
        """
        y_pred = self.predict(X)
        acc = np.sum(y == y_pred, axis=0) / float(X.shape[0])
        return acc
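
Review note: the accuracy computed by `score` is the fraction of predictions that match the true labels. The standalone sketch below reproduces only that arithmetic outside the class; the free function `accuracy` and the sample arrays are illustrative assumptions, not part of this commit.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of matching labels (hypothetical helper, not in mlxtend)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Same expression as in score(): count matches, divide by sample count.
    return np.sum(y_true == y_pred, axis=0) / float(y_true.shape[0])


y_true = np.array([0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 1, 2, 1])
acc = accuracy(y_true, y_pred)  # 4 of the 5 labels match
```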
22 changes: 22 additions & 0 deletions mlxtend/_base/_base_cluster.py
@@ -0,0 +1,22 @@
# Sebastian Raschka 2014-2016
# mlxtend Machine Learning Library Extensions
#
# Base Cluster (Clustering Parent Class)
# Author: Sebastian Raschka <sebastianraschka.com>
#
# License: BSD 3 clause

from ._base_unsupervised_estimator import _BaseUnsupervisedEstimator


class _BaseCluster(_BaseUnsupervisedEstimator):

    """Parent Class Unsupervised Estimator
    A base class that is implemented by clustering estimators
    """
    def __init__(self, print_progress=0, random_seed=0):
        super(_BaseCluster, self).__init__(
            print_progress=print_progress,
            random_seed=random_seed)
100 changes: 68 additions & 32 deletions mlxtend/tf_regressor/tf_base.py → mlxtend/_base/_base_estimator.py
@@ -1,7 +1,7 @@
# Sebastian Raschka 2014-2016
# mlxtend Machine Learning Library Extensions
#
# Base Regressor (Regressor Parent Class)
# Base Estimator (Estimator Parent Class)
# Author: Sebastian Raschka <sebastianraschka.com>
#
# License: BSD 3 clause
@@ -11,21 +11,29 @@
from time import time


class _TfBaseRegressor(object):
class _BaseEstimator(object):

"""Parent Class Base Regressor
"""Parent Class Estimator
A base class that is implemented by
regressor child classes.
classifiers, regressors, and clustering estimators.
"""
def __init__(self, print_progress=0, random_seed=None):
def __init__(self, print_progress=0,
random_seed=None):
self.print_progress = print_progress
self.random_seed = random_seed
if self.random_seed is not None:
np.random.seed(self.random_seed)
self._is_fitted = False
self._allowed_labels = None

def _fit(self, X, y=None, init_params=True):
# Implemented in child class
pass

def fit(self, X, y, init_weights=True):
"""Learn weight coefficients from training data.
def fit(self, X, y=None, init_params=True):
"""Learn model from training data.
Parameters
----------
@@ -34,30 +42,28 @@ def fit(self, X, y, init_weights=True):
n_features is the number of features.
y : array-like, shape = [n_samples]
Target values.
init_weights : bool (default: True)
Reinitialize weights
init_params : bool (default: True)
Re-initializes model parameters prior to fitting.
Set False to continue training with weights from
a previous model fitting.
Returns
-------
self : object
"""
self._is_fitted = False
if not (init_weights is None or isinstance(init_weights, bool)):
raise AttributeError("init_weights must be True or False")
self._check_arrays(X=X, y=y)
if self.random_seed is not None:
np.random.seed(self.random_seed)
self._fit(X=X, y=y, init_weights=init_weights)
if init_params:
self._init_params
self._fit(X=X, y=y)
self._is_fitted = True
return self

def _fit(self, X, y, init_weights=True):
# Implemented in child class
pass

def predict(self, X):
"""Predict class labels of X.
"""Predict targets from X.
Parameters
----------
@@ -67,11 +73,11 @@ def predict(self, X):
Returns
----------
class_labels : array-like, shape = [n_samples]
Predicted class labels.
target_values : array-like, shape = [n_samples]
Predicted target values.
"""
self._check_arrays(X)
self._check_arrays(X=X)
if not self._is_fitted:
raise AttributeError('Model is not fitted, yet.')
return self._predict(X)
@@ -80,29 +86,32 @@ def _predict(self, X):
# Implemented in child class
pass

def _shuffle(self, arrays):
def _shuffle_arrays(self, arrays):
"""Shuffle arrays in unison."""
r = np.random.permutation(len(arrays[0]))
return [ary[r] for ary in arrays]

def _print_progress(self, epoch, cost=None, time_interval=10):
def _print_progress(self, iteration, n_iter,
cost=None, time_interval=10):
if self.print_progress > 0:
s = '\rEpoch: %d/%d' % (epoch, self.epochs)
s = '\rIteration: %d/%d' % (iteration, n_iter)
if cost:
s += ' | Cost %.2f' % cost
if self.print_progress > 1:
if not hasattr(self, 'ela_str_'):
self.ela_str_ = '00:00:00'
if not epoch % time_interval:
if not iteration % time_interval:
ela_sec = time() - self.init_time_
self.ela_str_ = self._to_hhmmss(ela_sec)
s += ' | Elapsed: %s' % self.ela_str_
if self.print_progress > 2:
if not hasattr(self, 'eta_str_'):
self.eta_str_ = '00:00:00'
if not epoch % time_interval:
eta_sec = ((ela_sec / float(epoch)) *
self.epochs - ela_sec)
if not iteration % time_interval:
eta_sec = ((ela_sec / float(iteration)) *
n_iter - ela_sec)
if eta_sec < 0.0:
eta_sec = 0.0
self.eta_str_ = self._to_hhmmss(eta_sec)
s += ' | ETA: %s' % self.eta_str_
stderr.write(s)
@@ -122,12 +131,39 @@ def _check_arrays(self, X, y=None):
if y is None:
return
except(AttributeError):
pass
else:
if not isinstance(y, np.ndarray):
raise ValueError('y must be a numpy array.')
if not len(y.shape) == 1:
raise ValueError('y must be a 1D numpy array.')
raise ValueError('y must be a 1D array.')

if not len(y) == X.shape[0]:
raise ValueError('X and y must contain the same number of samples')

    def _init_params(self, weights_shape, bias_shape=(1,), dtype='float64',
                     scale=0.01, random_seed=None):
        """Initialize weight coefficients."""
        if random_seed:
            np.random.seed(random_seed)
        w = np.random.normal(loc=0.0, scale=scale, size=weights_shape)
        b = np.zeros(shape=bias_shape)
        return b.astype(dtype), w.astype(dtype)

    def _yield_minibatches_idx(self, n_batches, data_ary, shuffle=True):
        indices = np.arange(data_ary.shape[0])

        if shuffle:
            indices = np.random.permutation(indices)
        if n_batches > 1:
            remainder = data_ary.shape[0] % n_batches

            if remainder:
                minis = np.array_split(indices[:-remainder], n_batches)
                minis[-1] = np.concatenate((minis[-1],
                                            indices[-remainder:]),
                                           axis=0)
            else:
                minis = np.array_split(indices, n_batches)

        else:
            minis = (indices,)

        for idx_batch in minis:
            yield idx_batch
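
Review note: `_yield_minibatches_idx` splits the sample indices into `n_batches` groups and appends any leftover indices to the last group. The sketch below restates that logic as a free function so the batch sizes can be inspected; the function name, the `n_samples` argument, and the local `RandomState` (used here instead of the global NumPy seed) are assumptions made for illustration.

```python
import numpy as np


def yield_minibatches_idx(n_batches, n_samples, shuffle=True, seed=None):
    """Yield index arrays for minibatches (illustrative free function)."""
    rng = np.random.RandomState(seed)
    indices = np.arange(n_samples)
    if shuffle:
        indices = rng.permutation(indices)
    if n_batches > 1:
        remainder = n_samples % n_batches
        if remainder:
            # Split the evenly divisible part, then attach the leftover
            # indices to the final batch.
            minis = np.array_split(indices[:-remainder], n_batches)
            minis[-1] = np.concatenate((minis[-1], indices[-remainder:]))
        else:
            minis = np.array_split(indices, n_batches)
    else:
        minis = (indices,)
    for idx_batch in minis:
        yield idx_batch


batches = list(yield_minibatches_idx(n_batches=3, n_samples=10,
                                     shuffle=False))
sizes = [len(b) for b in batches]  # last batch absorbs the remainder
```

With 10 samples and 3 batches, the first two batches hold 3 indices each and the last holds 4, so every sample is visited exactly once per pass.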
40 changes: 40 additions & 0 deletions mlxtend/_base/_base_multiclass.py
@@ -0,0 +1,40 @@
# Sebastian Raschka 2014-2016
# mlxtend Machine Learning Library Extensions
#
# Base MultiClass (Multi-class Classifier Add-on Parent Class)
# Author: Sebastian Raschka <sebastianraschka.com>
#
# License: BSD 3 clause

import numpy as np


class _BaseMultiClass(object):
    """Add-on Parent Class for Multi-class classifier"""

    def __init__(self):
        pass

    def _one_hot(self, y, n_labels, dtype):
        """Returns a matrix where each sample in y is represented
           as a row, and each column represents the class label in
           the one-hot encoding scheme.
        Example:
            y = np.array([0, 1, 2, 3, 4, 2])
            mc = _BaseMultiClass()
            mc._one_hot(y=y, n_labels=5, dtype='float')
            np.array([[1., 0., 0., 0., 0.],
                      [0., 1., 0., 0., 0.],
                      [0., 0., 1., 0., 0.],
                      [0., 0., 0., 1., 0.],
                      [0., 0., 0., 0., 1.],
                      [0., 0., 1., 0., 0.]])
        """
        mat = np.zeros((len(y), n_labels))
        for i, val in enumerate(y):
            mat[i, val] = 1
        return mat.astype(dtype)
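
Review note: the loop in `_one_hot` could also be written with NumPy integer-array indexing. The sketch below is one possible vectorized equivalent, not what this commit implements; the free function `one_hot` is a hypothetical name, and it assumes `y` holds non-negative integer labels smaller than `n_labels`.

```python
import numpy as np


def one_hot(y, n_labels, dtype='float'):
    """Vectorized one-hot encoding (illustrative alternative to the loop)."""
    y = np.asarray(y)
    mat = np.zeros((y.shape[0], n_labels), dtype=dtype)
    # Row i gets a 1 in column y[i]; all other entries stay 0.
    mat[np.arange(y.shape[0]), y] = 1
    return mat


encoded = one_hot(np.array([0, 1, 2, 3, 4, 2]), n_labels=5)
```

Taking `encoded.argmax(axis=1)` recovers the original labels, which is a quick way to check the encoding.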