Adding custom Model for one of the newly added dataset in MLDatasets.jl #337

Open
arcAman07 opened this issue Feb 19, 2022 · 8 comments

Comments

@arcAman07

arcAman07 commented Feb 19, 2022

Adding a customized (low-level) but easy-to-use and easy-to-understand model for the beginner-friendly "Titanic Dataset", which can help machine learning beginners get started with this package. If it needs to be added, I would love to work on the PR.

@DhairyaLGandhi
Member

A PR would be good. It would be awesome to have a very straightforward implementation focusing only on getting a dataset from that package into a form suitable for use with Flux.

@arcAman07
Author

Cool, I have already made the model. I will make a couple of changes so it is less complex and very easy for beginners to comprehend (it can serve as a great starting point for them). I will work on the PR and try to do it ASAP.

@arcAman07
Author

One problem I just encountered is that I am unable to load the Titanic data from the MLDatasets library (I have raised an issue). Should I just read the data from a CSV file (as I did in that library) and then create the complete model from that?

@ToucheSir
Member

I don't think there's any time crunch on this, so fixing the Titanic dataset loading should be done first.

@arcAman07
Author

Great, I will look into it. I had tested it locally and it was working then. After the release was created, I wasn't able to load it.

@arcAman07
Author

The issue has been solved; it was a mistake on my end while loading the data. I will create the model using it and send the PR ASAP.
Thanks @ToucheSir

@arcAman07
Author

I'm having some trouble training the model for this dataset. After thorough EDA, feature-importance analysis, and data manipulation, the test accuracy is stuck at 0.6 using simple logistic regression built from a Flux Dense layer. I have not been able to improve accuracy on the test set despite trying out various combinations of models. (The mix of data types among the input features makes a neural network a tougher choice than other algorithms.) If accuracy is not the most important thing, and the goal is to help users understand how to get the data, train on it with Flux.jl, and test the result, then I can do the PR.
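
For readers following along, a minimal sketch of what "logistic regression using a Flux Dense layer" can look like is given here. It is not the code from this issue: the data is synthetic, the feature names `age` and `fare` are hypothetical stand-ins for numeric columns, the hyperparameters are arbitrary, and it assumes a recent Flux release that provides `Flux.setup` and the `Dense(in => out)` syntax. It also standardizes the features, since inputs on very different scales are one common reason a model like this trains poorly.

```julia
using Flux, Statistics, Random

Random.seed!(0)

# Hypothetical stand-in data (NOT the Titanic dataset): two numeric
# features on very different scales, 200 samples.
n = 200
age  = 80f0  .* rand(Float32, n)
fare = 500f0 .* rand(Float32, n)

# Flux layers expect a features × samples matrix.
X = permutedims(hcat(age, fare))             # 2×n
y = reshape(Float32.(fare .> 250f0), 1, n)   # 1×n labels in {0, 1}

# Standardize each feature (row) to zero mean and unit variance.
mu = mean(X; dims = 2)
sd = std(X; dims = 2)
Xn = (X .- mu) ./ sd

# Logistic regression: a single Dense layer that outputs logits.
# The sigmoid is folded into the loss for numerical stability.
model = Dense(2 => 1)
loss(m, x, y) = Flux.logitbinarycrossentropy(m(x), y)

# Full-batch training with Adam (learning rate chosen arbitrarily).
opt_state = Flux.setup(Adam(0.1), model)
for epoch in 1:200
    grads = Flux.gradient(m -> loss(m, Xn, y), model)
    Flux.update!(opt_state, model, grads[1])
end

# A logit above 0 corresponds to a predicted probability above 0.5.
accuracy = mean((model(Xn) .> 0) .== (y .> 0.5f0))
println("training accuracy: ", accuracy)
```

With real data, the accuracy should be computed on a held-out test split, and categorical columns would first need a numeric encoding such as `Flux.onehotbatch`.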

@CarloLucibello
Member

File a PR; maybe there is something wrong with your code.
