Final Report Peer Review #104

Open
skysfan0620 opened this issue Dec 12, 2021 · 0 comments

Comments on presentation:
I would like to list a few areas for improvement:

  1. There was no clear definition of the problem and motivation at the start of the presentation.
  2. It is not clear how each data series is incorporated. It would help to describe what the data look like and what feature engineering was used.
  3. There is no discussion of methodology. It would help to state what kind of linear regression and logistic regression were used and how the hyperparameters were tuned.
  4. The presentation can incorporate more plots rather than just the regression coefficient table.
  5. There could be more explanation of the coefficients on the other variables and what they indicate.
  6. The conclusion is not well explained, and future scope is not discussed.

Comments on the report:
Motivation:
The discussion is broad and comprehensive, but the study itself is only about gender. This is not consistent with what the project aims to examine.

Data:
The choice of independent variables is sound. However, the report does not clearly distinguish the dependent variable from the independent variables. Also, none of the feature engineering techniques discussed in class are applied here. You have the opportunity to apply, for example, one-hot encoding to patent class and application year, or to use matrix completion to impute missing data.
Also, I do not really understand why the most recent data, from 2014 to 2020, would be missing.
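
To make the one-hot encoding suggestion concrete, here is a minimal sketch in plain Python. The record layout, the column names (`patent_class`, `application_year`, `granted`), and the values are hypothetical and only for illustration; I do not know how your data is actually stored. In practice a library routine such as `pandas.get_dummies` would do the same job.

```python
def one_hot(rows, column):
    """Replace a categorical column with 0/1 indicator columns, one per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Hypothetical toy records; the real column names and values will differ.
applications = [
    {"patent_class": "A61", "application_year": 2010, "granted": 1},
    {"patent_class": "G06", "application_year": 2011, "granted": 0},
    {"patent_class": "A61", "application_year": 2011, "granted": 1},
]

# Encode both categorical columns, one after the other.
encoded = one_hot(one_hot(applications, "patent_class"), "application_year")
```

If the regression includes an intercept, one indicator per encoded variable should be dropped to avoid perfect collinearity.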

Methodology:
I understand that you do not wish to estimate anything, but it seems like a wasted opportunity. I would like to see how your model predicts out-of-sample, to examine whether it generalizes at all outside of the designated years.
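
As a rough illustration of what I mean by an out-of-sample check, the sketch below holds out the later years, fits a one-variable least-squares line on the earlier years, and reports the error on the held-out years. The field names (`application_year`, `x`, `y`), the cutoff year, and the toy numbers are all assumptions for the example, not your actual variables or model.

```python
def split_by_year(rows, cutoff_year):
    """In-sample: years up to and including the cutoff. Out-of-sample: later years."""
    train = [r for r in rows if r["application_year"] <= cutoff_year]
    test = [r for r in rows if r["application_year"] > cutoff_year]
    return train, test


def fit_line(rows):
    """Ordinary least squares for y = intercept + slope * x with one regressor."""
    n = len(rows)
    mean_x = sum(r["x"] for r in rows) / n
    mean_y = sum(r["y"] for r in rows) / n
    sxx = sum((r["x"] - mean_x) ** 2 for r in rows)
    sxy = sum((r["x"] - mean_x) * (r["y"] - mean_y) for r in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(rows, intercept, slope):
    """Average squared prediction error on the given rows."""
    return sum((r["y"] - (intercept + slope * r["x"])) ** 2 for r in rows) / len(rows)


# Hypothetical toy data: one row per year, with y = 2x + 1 exactly.
rows = [
    {"application_year": 2008 + i, "x": i + 1, "y": 2 * (i + 1) + 1}
    for i in range(6)
]

train, test = split_by_year(rows, cutoff_year=2011)
intercept, slope = fit_line(train)
test_mse = mean_squared_error(test, intercept, slope)
```

Comparing the in-sample error with `test_mse` would show whether the fitted relationship holds up beyond the years used for fitting.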

Results and conclusion:
I think the results are sound and the conclusion is convincing. However, few data science techniques are incorporated, and the project seems skeletal. Again, splitting your available data into in-sample and out-of-sample sets would be a good way to test out-of-sample performance. There is also a general lack of informative plots of your results, which makes the report very hard to read.

The font and line spacing seem to have been purposely enlarged to fit 8 pages.
