#GSOC PR : Add Preprocess Function for Data Cleaning and Validation #3321
update
Addressed in commit 35f0b3e.
Not sure what "In quotes" means here. Also, update this for the change in the preprocess function name.
Update this to be a Date.
Per our verbal discussion, it would be nice to support both models rather than replace one with the other. Could we add an argument that lets you specify which one you're running?
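As a rough illustration of the "one argument selects the model" idea, here is a minimal Python sketch. The PR's own code is not shown in this thread, so everything here is hypothetical: `run_model`, the `model_type` argument, and the two stand-in models (`fit_mean_model`, `fit_median_model`) are placeholders, not the models under review.

```python
from typing import Callable, Dict, List


def fit_mean_model(y: List[float]) -> float:
    """Hypothetical stand-in for the first model: predicts the mean."""
    return sum(y) / len(y)


def fit_median_model(y: List[float]) -> float:
    """Hypothetical stand-in for the second model: predicts the median."""
    s = sorted(y)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2


# Registry of supported models; adding a model means adding one entry.
MODELS: Dict[str, Callable[[List[float]], float]] = {
    "mean": fit_mean_model,
    "median": fit_median_model,
}


def run_model(y: List[float], model_type: str = "mean") -> float:
    """Run the model named by model_type instead of hard-wiring one."""
    if model_type not in MODELS:
        raise ValueError(
            f"Unknown model_type {model_type!r}; choose from {sorted(MODELS)}"
        )
    return MODELS[model_type](y)
```

Keeping the existing model as the default value of the argument means current callers would not need to change.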
Addressed in commit ce4a597.
Rather than hard-coding the covariates in the model, can you detect them from the covariates object?
Also, since the x data is identical for all ensemble members, is there really a need to make replicate copies for each ensemble member?
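To make both points concrete, a minimal Python sketch under stated assumptions: the covariates are represented as a plain dict of column name to values, `site_id` is a hypothetical identifier column, and `detect_covariates` and `build_full_data` are illustrative names rather than functions from this PR. The predictor names are read from the object, and the shared x data is stored once while only the responses vary by ensemble member.

```python
from typing import Dict, List, Sequence


def detect_covariates(
    covariates: Dict[str, List[float]],
    id_columns: Sequence[str] = ("site_id",),
) -> List[str]:
    """Return covariate names found in the data instead of a hard-coded list."""
    return [name for name in covariates if name not in id_columns]


def build_full_data(
    covariates: Dict[str, List[float]],
    ensemble_responses: Dict[str, List[float]],
) -> dict:
    """Combine one shared copy of x with per-member responses."""
    predictors = detect_covariates(covariates)
    # x is identical for every ensemble member, so keep a single reference
    # to each column rather than one copy per member.
    x = {name: covariates[name] for name in predictors}
    return {"predictors": predictors, "x": x, "y": ensemble_responses}


covariates = {
    "site_id": [1, 2, 3],
    "temp": [10.5, 12.1, 9.8],
    "precip": [100.0, 85.0, 120.0],
}
responses = {"ens_1": [0.2, 0.4, 0.1], "ens_2": [0.3, 0.5, 0.2]}
full_data = build_full_data(covariates, responses)
```

Adding a new covariate column then requires no change to the model code, and memory use no longer grows with the number of ensemble members.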
Addressed the automatic detection of covariates from the covariates object in commit 6acfd74. The commit also addresses the issue of replicate copies for each ensemble member by creating a `full_data` frame for all the data.
Train, test, and prediction data can't be rescaled separately; otherwise the scaled values end up having different meanings. Whenever you standardize data, you have to use the same constants (mean and sd) everywhere.
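A small sketch of what "same constants everywhere" means in practice, written in Python with made-up numbers. The helper names `fit_scaler` and `apply_scaler` are hypothetical and not from this PR; the assumption here is that the constants are estimated from the training data and then reused unchanged.

```python
import statistics
from typing import List, Tuple


def fit_scaler(values: List[float]) -> Tuple[float, float]:
    """Estimate the scaling constants (mean and sample sd) once."""
    return statistics.mean(values), statistics.stdev(values)


def apply_scaler(values: List[float], mean: float, sd: float) -> List[float]:
    """Standardize values with constants that were fitted elsewhere."""
    return [(v - mean) / sd for v in values]


train = [10.0, 12.0, 14.0]
test = [12.0, 16.0]

# Fit on the training data only, then reuse the same mean and sd.
mean, sd = fit_scaler(train)
train_scaled = apply_scaler(train, mean, sd)
test_scaled = apply_scaler(test, mean, sd)

# Rescaling the test set with its own constants would map the raw value
# 12.0 to a different number than it gets in the training set.
test_mean, test_sd = fit_scaler(test)
test_scaled_separately = apply_scaler(test, test_mean, test_sd)
```

With shared constants, a raw value of 12.0 maps to the same scaled value in both sets; with separate constants it does not, which is the "different meanings" problem described above.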
Addressed rescaling in commit ecd5aa1.
Non-base functions need an explicit namespace.
Addressed in commit f2bab83.
Somehow that commit isn't part of this code; namespaces are still an issue.