[FEATURE]: Expand DLP to Support Additional Default Datasets for Enhanced Testing and Training #1058

codingwithsurya · 2023-11-20T05:14:20Z

Feature Name

Support Additional Default Datasets for Enhanced Testing and Training

Your Name

Surya Subramanian

Description

To make the Deep Learning Playground (DLP) more versatile, we propose adding support for more default datasets. These datasets will give users a wider range of options for training and testing their machine learning models. Each dataset has its own characteristics and challenges, which makes the set useful for a variety of research and application purposes.

Here are some proposed datasets you can try to integrate:
CIFAR100: Offers 100 classes, each with 600 images (500 for training and 100 for testing). A more complex version of CIFAR10.

SVHN (Street View House Numbers): A real-world image dataset for developing machine learning and object recognition algorithms, requiring minimal data preprocessing.

ImageNet: A large and complex visual database designed for visual object recognition software research.

CelebA (Celebrity Faces Attributes): A large-scale face attributes dataset with over 200,000 celebrity images, each with 40 attribute annotations.

COIL100 (Columbia Object Image Library 100): Consists of 7200 images of 100 objects, each photographed from various angles.

Omniglot: A dataset designed for one-shot learning, containing 1623 different handwritten characters from 50 different alphabets.

STL10: Inspired by CIFAR-10, this dataset is meant for developing unsupervised feature learning, deep learning, and self-taught learning algorithms.

EMNIST (Extended MNIST): Expands the original MNIST dataset to include handwritten letters.
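As a rough starting point for the integration, here is a minimal sketch of how the proposed names could map onto `torchvision.datasets` classes. The registry name `TORCHVISION_LOADERS`, the helper `load_default_dataset`, the upper-case keys, and the `./data` root are all hypothetical and not part of DLP. The sketch assumes torchvision is the loading library, and that COIL100 has no built-in torchvision loader (to my knowledge it does not), so it would need a custom loader.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical registry: dataset name from this issue -> (torchvision.datasets
# class name, keyword arguments selecting the training portion).
# None marks a dataset assumed to have no built-in torchvision loader.
TORCHVISION_LOADERS: Dict[str, Optional[Tuple[str, Dict[str, Any]]]] = {
    "CIFAR100": ("CIFAR100", {"train": True, "download": True}),
    "SVHN": ("SVHN", {"split": "train", "download": True}),
    # ImageNet cannot be auto-downloaded; the archives must already be in root.
    "IMAGENET": ("ImageNet", {"split": "train"}),
    "CELEBA": ("CelebA", {"split": "train", "target_type": "attr", "download": True}),
    "COIL100": None,
    "OMNIGLOT": ("Omniglot", {"background": True, "download": True}),
    "STL10": ("STL10", {"split": "train", "download": True}),
    # EMNIST needs an explicit split; "letters" matches the issue description.
    "EMNIST": ("EMNIST", {"split": "letters", "train": True, "download": True}),
}


def load_default_dataset(name: str, root: str = "./data"):
    """Return the training portion of a proposed default dataset."""
    key = name.upper()
    if key not in TORCHVISION_LOADERS:
        raise ValueError(f"Unknown default dataset: {name!r}")
    entry = TORCHVISION_LOADERS[key]
    if entry is None:
        raise NotImplementedError(f"{key} needs a custom loader")
    class_name, kwargs = entry
    # Imported lazily so the registry can be inspected without torchvision.
    import torchvision.datasets as tv_datasets

    return getattr(tv_datasets, class_name)(root=root, **kwargs)
```

Note that the constructors are not uniform: some take `train=`, others `split=`, and Omniglot uses `background=`, which is why the keyword arguments are stored per dataset rather than hard-coded in the helper.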

Task Breakdown
Integration of Datasets: Implement the integration of these datasets into the DLP system, ensuring they are easily accessible and usable for users.

Architecture Optimization: For each dataset, research and determine the most effective neural network architectures that are suitable for testing. This involves understanding the specific characteristics and challenges posed by each dataset.

Documentation and Examples: Provide detailed documentation and example use cases for each dataset, guiding users on how to leverage these datasets effectively.

Testing and Validation: Conduct thorough testing (through Postman plus the default dataset) to ensure these datasets integrate seamlessly into DLP. Validate the performance of the suggested architectures for each dataset. More information on how to do this is in Notion.
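Alongside the Postman tests, a quick label sanity check could catch integration mistakes early. The sketch below is only an illustration: `check_labels` is a hypothetical helper, and it assumes each dataset yields `(sample, integer_label)` pairs with labels numbered from 0.

```python
from typing import Any, Iterable, Tuple


def check_labels(samples: Iterable[Tuple[Any, int]], expected_num_classes: int) -> int:
    """Verify every label is in range and every class appears at least once.

    Returns the number of samples inspected.
    """
    seen = set()
    count = 0
    for _, label in samples:
        if not 0 <= label < expected_num_classes:
            raise ValueError(f"Label {label} outside [0, {expected_num_classes})")
        seen.add(label)
        count += 1
    if len(seen) != expected_num_classes:
        missing = sorted(set(range(expected_num_classes)) - seen)
        raise ValueError(f"No samples found for classes: {missing}")
    return count


# Tiny stand-in dataset with 3 classes and 9 samples.
fake_samples = [(None, i % 3) for i in range(9)]
checked = check_labels(fake_samples, expected_num_classes=3)
```

For CIFAR100, for example, `expected_num_classes` would be 100, matching the class count given in the dataset list.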

github-actions · 2023-11-20T05:14:30Z

Hello @codingwithsurya! Thank you for submitting the Feature Request Form. We appreciate your contribution. 👋

We will look into it and provide a response as soon as possible.

To work on this feature request, you can follow these branch setup instructions:

Check out the nextjs branch:

```
git checkout nextjs
```

Pull the latest changes from the remote nextjs branch:

```
git pull origin nextjs
```

Create a new branch specific to this feature request using the issue number:

```
git checkout -b feature-1058
```

Feel free to make the necessary changes in this branch and submit a pull request when you're ready.

Best regards,
Deep Learning Playground (DLP) Team

LuHG18 · 2024-02-07T00:41:41Z

Found a small bug in tabularConstants.ts: a typo in the California housing dataset name caused it to be referenced incorrectly. Adding an underscore between "california" and "housing" fixed the problem.
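To illustrate this kind of mismatch, here is a small sketch of a check that compares the names the frontend sends against the names the backend knows. The exact strings and the names `BACKEND_DATASETS` and `find_unknown_names` are hypothetical; the real values live in tabularConstants.ts and the backend.

```python
from typing import Iterable, List

# Hypothetical set of dataset keys the backend can load.
BACKEND_DATASETS = {"IRIS", "CALIFORNIA_HOUSING", "DIGITS"}


def find_unknown_names(frontend_names: Iterable[str], backend_names: Iterable[str]) -> List[str]:
    """Return frontend dataset names that the backend would fail to resolve."""
    return sorted(set(frontend_names) - set(backend_names))


# The name without the underscore is flagged; the corrected list is clean.
before_fix = find_unknown_names(["IRIS", "CALIFORNIAHOUSING"], BACKEND_DATASETS)
after_fix = find_unknown_names(["IRIS", "CALIFORNIA_HOUSING"], BACKEND_DATASETS)
print(before_fix)  # ['CALIFORNIAHOUSING']
print(after_fix)   # []
```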

LuHG18 · 2024-02-07T00:48:28Z

Another small bug: the DIGITS dataset did not work when selected because it was never loaded from scikit-learn. I loaded it and added it to dataset.py, which seems to have fixed the problem.
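For reference, a minimal sketch of what registering the dataset might look like. This is not the actual code in dataset.py: `TABULAR_LOADERS`, `_load_digits`, and `get_default_dataset` are hypothetical names. Only `sklearn.datasets.load_digits` is the real scikit-learn call.

```python
from typing import Callable, Dict


def _load_digits():
    """Load the scikit-learn digits dataset as a DataFrame (features + target)."""
    # Imported lazily so the registry can be built without scikit-learn installed.
    from sklearn.datasets import load_digits

    return load_digits(as_frame=True).frame


# Hypothetical registry of default tabular datasets keyed by selection name.
TABULAR_LOADERS: Dict[str, Callable[[], object]] = {
    "DIGITS": _load_digits,
}


def get_default_dataset(name: str):
    """Look up and invoke the loader; fail loudly for unregistered names."""
    try:
        loader = TABULAR_LOADERS[name]
    except KeyError:
        raise ValueError(f"Dataset {name!r} is not registered") from None
    return loader()
```

Raising on an unregistered name makes a missing loader show up as an explicit error rather than a dataset that silently does nothing when selected.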

karkir0003 · 2024-02-07T01:40:13Z

Hey @LuHG18 ETA on the PR?

codingwithsurya added the enhancement New feature or request label Nov 20, 2023

github-project-automation bot added this to DLP Project Board Nov 20, 2023

github-project-automation bot moved this to Backlog in DLP Project Board Nov 20, 2023

karkir0003 moved this from Backlog to Todo in DLP Project Board Nov 20, 2023

karkir0003 added the good first issue Good for newcomers label Nov 20, 2023

karkir0003 assigned LuHG18 Dec 29, 2023

karkir0003 moved this from Todo to In Progress in DLP Project Board Dec 29, 2023
