Frequent Patterns Sparse Support (#667)
* sparse attempt

* remove SparseDataFrame support

* Revert "remove SparseDataFrame support"

This reverts commit b8a1cd9.

* remove SparseDataFrame support

* cleanup

* fixes issues with new sparse format

* upd docs

* add back bool comp

* fix bool check
rasbt authored Feb 24, 2020
1 parent cd54aed commit 213fd02
Showing 13 changed files with 168 additions and 165 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -20,7 +20,7 @@ matrix:
- os: linux
sudo: required
python: 3.8
env: LATEST="false" IMAGE="true" COVERAGE="false" NUMPY_VERSION="1.17.4" SCIPY_VERSION="1.3.1" SKLEARN_VERSION="0.22.0" JOBLIB_VERSION=0.13.1 PANDAS_VERSION="0.25.3" IMAGEIO_VERSION="2.5.0" SKIMAGE_VERSION="0.15.0" DLIB_VERSION="19.17.0" MINICONDA_PYTHON_VERSION=3.7
env: LATEST="false" IMAGE="true" COVERAGE="false" NUMPY_VERSION="1.18.1" SCIPY_VERSION="1.4.1" SKLEARN_VERSION="0.22.0" JOBLIB_VERSION=0.13.2 PANDAS_VERSION="1.0.1" IMAGEIO_VERSION="2.5.0" SKIMAGE_VERSION="0.15.0" DLIB_VERSION="19.17.0" MINICONDA_PYTHON_VERSION=3.7
- os: linux
python: 3.8
env: LATEST="true" IMAGE="true" COVERAGE="true" NOTEBOOKS="true" MINICONDA_PYTHON_VERSION=3.7
9 changes: 5 additions & 4 deletions docs/sources/CHANGELOG.md
@@ -7,13 +7,13 @@ The CHANGELOG for the current development version is available at

---

### Version 0.18.0 (TBD)
### Version 0.17.2 (TBD)

##### Downloads

- [Source code (zip)](https://github.com/rasbt/mlxtend/archive/v0.18.0.zip)
- [Source code (zip)](https://github.com/rasbt/mlxtend/archive/v0.17.2.zip)

- [Source code (tar.gz)](https://github.com/rasbt/mlxtend/archive/v0.18.0.tar.gz)
- [Source code (tar.gz)](https://github.com/rasbt/mlxtend/archive/v0.17.2.tar.gz)


##### New Features
@@ -22,7 +22,8 @@ The CHANGELOG for the current development version is available at

##### Changes

- -
- The previously deprecated `OnehotTransactions` has been removed in favor of the `TransactionEncoder.`
- Removed `SparseDataFrame` support in frequent pattern mining functions in favor of pandas >=1.0's new way for working sparse data. If you used `SparseDataFrame` formats, please see pandas' migration guide at https://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html#migrating ([#667](https://github.com/rasbt/mlxtend/pull/667))


##### Bug Fixes
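The CHANGELOG entry above points to pandas' migration guide for users of the removed `SparseDataFrame`. As a rough sketch of what that migration can look like, the snippet below converts a dense one-hot encoded frame into a regular `DataFrame` with sparse columns. The column names and values are made up for illustration and do not come from this commit; it assumes pandas >= 1.0.

```python
import pandas as pd

# Dense one-hot encoded transactions (illustrative data, not from the commit).
dense = pd.DataFrame(
    {
        "Apple": [True, True, False, False],
        "Beer": [True, False, True, True],
        "Milk": [False, False, True, False],
    }
)

# In pandas >= 1.0 there is no SparseDataFrame class; instead, an ordinary
# DataFrame holds columns with a sparse dtype. False is the fill value, so
# only the True entries are stored explicitly.
sparse = dense.astype(pd.SparseDtype(bool, fill_value=False))

# The .sparse accessor is available when every column is sparse.
density = sparse.sparse.density        # fraction of explicitly stored values
roundtrip = sparse.sparse.to_dense()   # back to a dense boolean frame
```

Using `False` as the fill value matches the one-hot case, where most entries are expected to be absent items.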
74 changes: 41 additions & 33 deletions docs/sources/user_guide/frequent_patterns/apriori.ipynb
@@ -418,7 +418,7 @@
" <tr>\n",
" <th>5</th>\n",
" <td>0.8</td>\n",
" <td>(Eggs, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Eggs)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
@@ -433,7 +433,7 @@
" <tr>\n",
" <th>8</th>\n",
" <td>0.6</td>\n",
" <td>(Onion, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Onion)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
@@ -443,7 +443,7 @@
" <tr>\n",
" <th>10</th>\n",
" <td>0.6</td>\n",
" <td>(Onion, Kidney Beans, Eggs)</td>\n",
" <td>(Kidney Beans, Onion, Eggs)</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
@@ -456,12 +456,12 @@
"2 0.6 (Milk)\n",
"3 0.6 (Onion)\n",
"4 0.6 (Yogurt)\n",
"5 0.8 (Eggs, Kidney Beans)\n",
"5 0.8 (Kidney Beans, Eggs)\n",
"6 0.6 (Onion, Eggs)\n",
"7 0.6 (Kidney Beans, Milk)\n",
"8 0.6 (Onion, Kidney Beans)\n",
"8 0.6 (Kidney Beans, Onion)\n",
"9 0.6 (Kidney Beans, Yogurt)\n",
"10 0.6 (Onion, Kidney Beans, Eggs)"
"10 0.6 (Kidney Beans, Onion, Eggs)"
]
},
"execution_count": 4,
@@ -552,7 +552,7 @@
" <tr>\n",
" <th>5</th>\n",
" <td>0.8</td>\n",
" <td>(Eggs, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Eggs)</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
@@ -570,7 +570,7 @@
" <tr>\n",
" <th>8</th>\n",
" <td>0.6</td>\n",
" <td>(Onion, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Onion)</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
@@ -582,7 +582,7 @@
" <tr>\n",
" <th>10</th>\n",
" <td>0.6</td>\n",
" <td>(Onion, Kidney Beans, Eggs)</td>\n",
" <td>(Kidney Beans, Onion, Eggs)</td>\n",
" <td>3</td>\n",
" </tr>\n",
" </tbody>\n",
@@ -596,12 +596,12 @@
"2 0.6 (Milk) 1\n",
"3 0.6 (Onion) 1\n",
"4 0.6 (Yogurt) 1\n",
"5 0.8 (Eggs, Kidney Beans) 2\n",
"5 0.8 (Kidney Beans, Eggs) 2\n",
"6 0.6 (Onion, Eggs) 2\n",
"7 0.6 (Kidney Beans, Milk) 2\n",
"8 0.6 (Onion, Kidney Beans) 2\n",
"8 0.6 (Kidney Beans, Onion) 2\n",
"9 0.6 (Kidney Beans, Yogurt) 2\n",
"10 0.6 (Onion, Kidney Beans, Eggs) 3"
"10 0.6 (Kidney Beans, Onion, Eggs) 3"
]
},
"execution_count": 5,
@@ -657,7 +657,7 @@
" <tr>\n",
" <th>5</th>\n",
" <td>0.8</td>\n",
" <td>(Eggs, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Eggs)</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
@@ -666,7 +666,7 @@
],
"text/plain": [
" support itemsets length\n",
"5 0.8 (Eggs, Kidney Beans) 2"
"5 0.8 (Kidney Beans, Eggs) 2"
]
},
"execution_count": 6,
@@ -983,7 +983,7 @@
" <tr>\n",
" <th>5</th>\n",
" <td>0.8</td>\n",
" <td>(Eggs, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Eggs)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
@@ -998,7 +998,7 @@
" <tr>\n",
" <th>8</th>\n",
" <td>0.6</td>\n",
" <td>(Onion, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Onion)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
@@ -1008,7 +1008,7 @@
" <tr>\n",
" <th>10</th>\n",
" <td>0.6</td>\n",
" <td>(Onion, Kidney Beans, Eggs)</td>\n",
" <td>(Kidney Beans, Onion, Eggs)</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
@@ -1021,12 +1021,12 @@
"2 0.6 (Milk)\n",
"3 0.6 (Onion)\n",
"4 0.6 (Yogurt)\n",
"5 0.8 (Eggs, Kidney Beans)\n",
"5 0.8 (Kidney Beans, Eggs)\n",
"6 0.6 (Onion, Eggs)\n",
"7 0.6 (Kidney Beans, Milk)\n",
"8 0.6 (Onion, Kidney Beans)\n",
"8 0.6 (Kidney Beans, Onion)\n",
"9 0.6 (Kidney Beans, Yogurt)\n",
"10 0.6 (Onion, Kidney Beans, Eggs)"
"10 0.6 (Kidney Beans, Onion, Eggs)"
]
},
"execution_count": 9,
@@ -1062,22 +1062,29 @@
"\n",
"**Parameters**\n",
"\n",
"- `df` : pandas DataFrame or pandas SparseDataFrame\n",
"- `df` : pandas DataFrame\n",
"\n",
" pandas DataFrame the encoded format. Also supports\n",
" DataFrames with sparse data; for more info, please\n",
" see (https://pandas.pydata.org/pandas-docs/stable/\n",
" user_guide/sparse.html#sparse-data-structures)\n",
"\n",
" Please note that the old pandas SparseDataFrame format\n",
" is no longer supported in mlxtend >= 0.17.2.\n",
"\n",
" pandas DataFrame the encoded format.\n",
" The allowed values are either 0/1 or True/False.\n",
" For example,\n",
"\n",
"```\n",
" Apple Bananas Beer Chicken Milk Rice\n",
" 0 1 0 1 1 0 1\n",
" 1 1 0 1 0 0 1\n",
" 2 1 0 1 0 0 0\n",
" 3 1 1 0 0 0 0\n",
" 4 0 0 1 1 1 1\n",
" 5 0 0 1 0 1 1\n",
" 6 0 0 1 0 1 0\n",
" 7 1 1 0 0 0 0\n",
" Apple Bananas Beer Chicken Milk Rice\n",
" 0 True False True True False True\n",
" 1 True False True False False True\n",
" 2 True False True False False False\n",
" 3 True True False False False False\n",
" 4 False False True True True True\n",
" 5 False False True False True True\n",
" 6 False False True False True False\n",
" 7 True True False False False False\n",
"```\n",
"\n",
"\n",
@@ -1108,7 +1115,8 @@
"\n",
"- `low_memory` : bool (default: False)\n",
"\n",
" If `True`, uses an iterator to search for combinations above `min_support`.\n",
" If `True`, uses an iterator to search for combinations above\n",
" `min_support`.\n",
" Note that while `low_memory=True` should only be used for large dataset\n",
" if memory resources are limited, because this implementation is approx.\n",
" 3-6x slower than the default.\n",
@@ -1173,5 +1181,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
59 changes: 33 additions & 26 deletions docs/sources/user_guide/frequent_patterns/fpgrowth.ipynb
@@ -72,7 +72,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -424,27 +424,27 @@
" <tr>\n",
" <th>5</th>\n",
" <td>0.8</td>\n",
" <td>(Eggs, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Eggs)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>0.6</td>\n",
" <td>(Yogurt, Kidney Beans)</td>\n",
" <td>(Kidney Beans, Yogurt)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>0.6</td>\n",
" <td>(Eggs, Onion)</td>\n",
" <td>(Onion, Eggs)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>0.6</td>\n",
" <td>(Kidney Beans, Onion)</td>\n",
" <td>(Onion, Kidney Beans)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>0.6</td>\n",
" <td>(Eggs, Kidney Beans, Onion)</td>\n",
" <td>(Onion, Kidney Beans, Eggs)</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
@@ -462,11 +462,11 @@
"2 0.6 (Yogurt)\n",
"3 0.6 (Onion)\n",
"4 0.6 (Milk)\n",
"5 0.8 (Eggs, Kidney Beans)\n",
"6 0.6 (Yogurt, Kidney Beans)\n",
"7 0.6 (Eggs, Onion)\n",
"8 0.6 (Kidney Beans, Onion)\n",
"9 0.6 (Eggs, Kidney Beans, Onion)\n",
"5 0.8 (Kidney Beans, Eggs)\n",
"6 0.6 (Kidney Beans, Yogurt)\n",
"7 0.6 (Onion, Eggs)\n",
"8 0.6 (Onion, Kidney Beans)\n",
"9 0.6 (Onion, Kidney Beans, Eggs)\n",
"10 0.6 (Kidney Beans, Milk)"
]
},
@@ -516,7 +516,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.52 ms ± 362 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)\n"
"3.53 ms ± 124 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)\n"
]
}
],
@@ -535,7 +535,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.44 ms ± 119 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)\n"
"3.7 ms ± 70.8 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)\n"
]
}
],
@@ -552,7 +552,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"549 µs ± 17.7 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)\n"
"1.57 ms ± 36.7 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)\n"
]
}
],
@@ -600,22 +600,29 @@
"\n",
"**Parameters**\n",
"\n",
"- `df` : pandas DataFrame or pandas SparseDataFrame\n",
"- `df` : pandas DataFrame\n",
"\n",
" pandas DataFrame the encoded format. Also supports\n",
" DataFrames with sparse data; for more info, please\n",
" see (https://pandas.pydata.org/pandas-docs/stable/\n",
" user_guide/sparse.html#sparse-data-structures)\n",
"\n",
" Please note that the old pandas SparseDataFrame format\n",
" is no longer supported in mlxtend >= 0.17.2.\n",
"\n",
" pandas DataFrame the encoded format.\n",
" The allowed values are either 0/1 or True/False.\n",
" For example,\n",
"\n",
"```\n",
" Apple Bananas Beer Chicken Milk Rice\n",
" 0 1 0 1 1 0 1\n",
" 1 1 0 1 0 0 1\n",
" 2 1 0 1 0 0 0\n",
" 3 1 1 0 0 0 0\n",
" 4 0 0 1 1 1 1\n",
" 5 0 0 1 0 1 1\n",
" 6 0 0 1 0 1 0\n",
" 7 1 1 0 0 0 0\n",
" Apple Bananas Beer Chicken Milk Rice\n",
" 0 True False True True False True\n",
" 1 True False True False False True\n",
" 2 True False True False False False\n",
" 3 True True False False False False\n",
" 4 False False True True True True\n",
" 5 False False True False True True\n",
" 6 False False True False True False\n",
" 7 True True False False False False\n",
"```\n",
"\n",
"\n",
@@ -696,5 +703,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}