Comments on presentation:
I would like to list a few areas of improvement:
There was no clear definition of the problem and motivation at the start of the presentation.
It is not clear how each data series is incorporated. It would help to discuss what the data look like and what feature engineering was used.
There is no discussion of methodology. It would help to state what kind of linear regression and logistic regression were used and how you tuned your hyperparameters.
The presentation can incorporate more plots rather than just the regression coefficient table.
There could be more explanation of the coefficients on the other variables and what they indicate.
The conclusion is not well explained, and future scope is not discussed.
Comments on the report:
Motivation:
The discussion seems comprehensive and large in scale, but the study is only about gender. This is not consistent with what the project aims to examine.
Data:
The choice of independent variables is sound. However, there does not seem to be any distinction between the dependent and independent variables. Also, none of the feature engineering techniques discussed in class are applied here. You have the opportunity to apply, for example, one-hot encoding to patent class and application year, or to use matrix completion to impute missing data.
Also I do not really understand why the most recent data from 2014 to 2020 would be missing.
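To make the one-hot encoding suggestion concrete, here is a minimal sketch using pandas. The column names (`patent_class`, `application_year`, `female_inventor`) and the toy rows are my own assumptions for illustration, not the schema or data from your report.

```python
import pandas as pd

# Hypothetical toy data: column names and values are assumptions,
# not taken from the report's actual dataset.
patents = pd.DataFrame({
    "patent_class": ["A01", "B60", "A01", "H04"],
    "application_year": [2009, 2011, 2011, 2013],
    "female_inventor": [1, 0, 0, 1],
})

# One-hot encode the two categorical features. drop_first=True drops one
# level per feature so the dummies are not collinear with an intercept.
encoded = pd.get_dummies(
    patents,
    columns=["patent_class", "application_year"],
    drop_first=True,
    dtype=int,
)

print(encoded.columns.tolist())
```

Treating application year as categorical lets each year have its own effect instead of forcing a linear time trend, at the cost of one coefficient per year.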
Methodology:
I understand that you do not wish to estimate anything, but it seems like a wasted opportunity. I would like to see how your model predicts out-of-sample to examine whether it can be generalized at all outside of the designated years.
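As a rough illustration of the out-of-sample check I have in mind, the sketch below fits an ordinary least squares model on earlier application years and evaluates it on later ones. The data are synthetic, and the column names, the cutoff year, and the single-regressor model are all assumptions of mine, not your actual specification.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: every column and value here is an assumption.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "application_year": rng.integers(2000, 2014, size=n),  # years 2000..2013
    "female_inventor": rng.integers(0, 2, size=n),
})
df["outcome"] = 1.0 + 0.5 * df["female_inventor"] + rng.normal(0, 1, size=n)

# Split by time: fit on years before the cutoff, hold out the rest.
cutoff = 2010  # hypothetical cutoff year
train = df[df["application_year"] < cutoff]
test = df[df["application_year"] >= cutoff]

def design(frame):
    """Design matrix with an intercept and the gender indicator."""
    return np.column_stack([np.ones(len(frame)), frame["female_inventor"].to_numpy()])

# Ordinary least squares fit on the in-sample years only.
coef, *_ = np.linalg.lstsq(design(train), train["outcome"].to_numpy(), rcond=None)

def rmse(frame):
    """Root mean squared error of the fitted model on the given rows."""
    resid = frame["outcome"].to_numpy() - design(frame) @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

in_sample_rmse = rmse(train)
out_of_sample_rmse = rmse(test)
print(in_sample_rmse, out_of_sample_rmse)
```

If the out-of-sample error is much larger than the in-sample error, that would suggest the estimated relationship does not carry over to years outside the fitting window.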
Results and conclusion:
I think the results are sound and the conclusion is convincing. However, few data science techniques are incorporated and the project seems skeletal. Again, splitting your available data into in-sample and out-of-sample sets would be a good way to test out-of-sample performance. There is also a general lack of informative plots of your results, which makes the report very hard to read.
The font and line spacing seem purposely enlarged to fit 8 pages.