Midterm Peer Review #71

Open
ngjulia opened this issue Nov 17, 2020 · 0 comments

ngjulia commented Nov 17, 2020

Summary:
The project aimed to predict the quality of a wine from 13 characteristics, a few of them nominal but most real-valued. The group ran some preliminary models and found that most wines were rated 5 or 6, with sparse ratings at the extremes (3 and 9). As a result, much of the report detailed the ways they tried to mitigate the bias this imbalance introduces, including bootstrapping.
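
For anyone reading along, here is a minimal sketch of one way bootstrapping can be used against this kind of imbalance: resampling the rare ratings with replacement until every rating is equally represented. I don't know that this is the procedure the report used; the function name `oversample_rare_ratings`, the `quality` key, and the toy rows are all made up for illustration.

```python
import random
from collections import Counter, defaultdict


def oversample_rare_ratings(rows, label_key="quality", seed=0):
    """Pad every rating group up to the size of the largest one
    by drawing extra rows from that group with replacement."""
    rng = random.Random(seed)

    # Group the rows by their rating.
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)

    target = max(len(group) for group in groups.values())

    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Bootstrap draw: sample with replacement to fill the gap.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy data (hypothetical): many 5s and 6s, a single 3 and a single 9.
rows = (
    [{"quality": 5, "pH": 3.2}] * 6
    + [{"quality": 6, "pH": 3.3}] * 5
    + [{"quality": 3, "pH": 3.5}]
    + [{"quality": 9, "pH": 3.1}]
)
balanced = oversample_rare_ratings(rows)
counts = Counter(row["quality"] for row in balanced)
```

One caveat with this approach: resampling should happen only on the training split, otherwise copies of the same rare wine can land in both train and test and make the error look better than it is.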

What I liked:

  • I really liked how you explained the problem with your data and then walked through the ways you tried to solve it
  • It also seems like you guys already have a good grasp on how to plot and model things, which I think is good to have this early on in the game
  • I also think it's great that you found out a way to even lower your MSE by using shuffling! Noticing oddities in your outputs (like with the red wine thing) can actually lead you to make a better model, which is awesome
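
To illustrate why shuffling can matter, here is a small sketch of a shuffled train/test split. It assumes the red wine oddity came from the rows being ordered by wine type, which I'm guessing at rather than quoting from the report; the helper `train_test_split_shuffled` and the toy rows are hypothetical.

```python
import random


def train_test_split_shuffled(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then cut off the first part as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data (hypothetical): all red wines first, then all white wines.
rows = [("red", i) for i in range(20)] + [("white", i) for i in range(80)]

# Without shuffling, taking the first 20% as a test set yields only reds.
unshuffled_test = rows[:20]

# With shuffling, the test set is a random draw from both types.
train, test = train_test_split_shuffled(rows)
```

If the data really is sorted like this, an unshuffled split trains on one kind of wine and tests on another, which would explain a higher MSE before shuffling.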

Things to improve on:

  • The graphs themselves were a little hard to read - I get that the violin plots were used to show variance, but separate plots for the variance and for the regression might make a more convincing visual
  • Trying other types of regression might be useful! I think with these projects, it's probably best to try and use everything in our toolbox that we've learned
  • I agree that classification would be a good place to go next, so I would encourage looking into that to perhaps get better results
  • Also, I'm curious whether you could use more qualitative data, like region, date, etc. These might already be partly captured by features like the pH levels, but maybe they would still help your analysis?