To clarify: datatagr is still in the early development stages but I wanted to create this issue already because this new role of the datatagr package may inform its development (which @chartgerink is leading).
We've had a couple of interesting discussions recently on S3 classes with:
In both cases, the S3 classes defined inherit from data.frames. This is convenient and desirable because it gives users a sense of familiarity, given how common data.frames are in the R ecosystem.
However, one major drawback of S3 is that there is no way to formally declare a class. This means users could end up with an invalid S3 object (as defined by the original packages) because they accidentally dropped required columns. It is important to have a mechanism that alerts users that the object is no longer valid as soon as it happens. Delaying the warning or error until a specific operation on the object is required can only frustrate users: "when did my object stop being valid exactly?"
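To make the problem concrete, here is a minimal base R sketch. The class name `my_class`, the constructor `new_my_class()`, and the required columns `id` and `date` are hypothetical and do not come from any of the packages mentioned here.

```r
# Hypothetical constructor: an S3 class on top of a data.frame that is
# expected to carry `id` and `date` columns.
new_my_class <- function(x) {
  stopifnot(is.data.frame(x), all(c("id", "date") %in% names(x)))
  class(x) <- c("my_class", class(x))
  x
}

obj <- new_my_class(
  data.frame(id = 1:3, date = Sys.Date() + 0:2, value = c(2.5, 3.1, 4.8))
)

# Dropping a required column raises no warning or error ...
obj$date <- NULL

# ... and the object still claims to be a `my_class`,
# even though the `date` column is gone.
inherits(obj, "my_class")
"date" %in% names(obj)
```

Nothing in S3 ties the class attribute to the columns, so the invalid state only surfaces later, when some function tries to use `date`.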
The tagging system introduced by the linelist R package provides a good solution to this issue. Tagged columns can be made required, and users are warned as soon as such a column is dropped. This is robust to all data wrangling operations and column renaming.
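As a rough illustration of the idea, and not the actual linelist or datatagr implementation, tags can be stored as an attribute and checked in a subsetting method. All names below (`make_tagged()`, the `tagged_df` class, the `tags` attribute) are assumptions made up for this sketch.

```r
# Hypothetical sketch, NOT the linelist / datatagr implementation.
# Tags are stored as an attribute mapping tag name -> column name.
make_tagged <- function(x, tags) {
  stopifnot(is.data.frame(x), all(unlist(tags) %in% names(x)))
  attr(x, "tags") <- tags
  class(x) <- c("tagged_df", class(x))
  x
}

# Subsetting method: warn as soon as a tagged column is dropped
`[.tagged_df` <- function(x, ...) {
  old_tags <- attr(x, "tags")
  out <- NextMethod()
  if (!is.data.frame(out)) {
    return(out)
  }
  lost <- !(unlist(old_tags) %in% names(out))
  if (any(lost)) {
    warning(
      "Lost tagged column(s): ",
      paste(unlist(old_tags)[lost], collapse = ", ")
    )
  }
  # Keep only the tags whose columns survived the subsetting
  attr(out, "tags") <- old_tags[!lost]
  out
}

df <- make_tagged(
  data.frame(id = 1:3, date = Sys.Date() + 0:2, value = c(2.5, 3.1, 4.8)),
  tags = list(date_onset = "date")
)

# Keeping the tagged column is silent; dropping it warns immediately
kept <- df[c("id", "date")]
slim <- df[c("id", "value")]
```

This sketch only covers `[`. Being robust to renaming, as described above, would additionally need something like a `names<-` method that keeps the tag mapping in sync, and a real implementation would likely let the user choose between a warning and an error.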
linelist itself is not the ideal solution because it focuses on a specific type of data (line list data), which may not match perfectly with the data needed in vaccineff / scoringutils / downstream packages.
The present datatagr R package, as a generalisation of linelist to generic data.frames, therefore provides an ideal foundation layer for packages that want to build S3 classes inheriting from data.frames with a safe validation system.