datatagr as the foundation for S3 classes inheriting from data.frames #3

Open
Bisaloo opened this issue Apr 23, 2024 · 2 comments
Labels
question Further information is requested

Comments

@Bisaloo
Member

Bisaloo commented Apr 23, 2024

We've had a couple of interesting discussions recently on S3 classes with:

In both cases, the S3 classes defined inherit from data.frames. This is convenient and desirable because it gives users a sense of familiarity, given how common data.frames are in the R ecosystem.

However, one major drawback of S3 is that there is no way to formally declare a class. This means users could end up with an invalid S3 object (as defined by the original package) because they accidentally dropped required columns. It is important to have a mechanism that alerts users that the object is no longer valid as soon as it happens. Delaying the warning or error until a specific operation on the object requires the missing columns can only frustrate users: "when did my object stop being valid exactly?"
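To make the problem concrete, here is a minimal base R sketch. The `epi_data` class, its constructor and its validator are hypothetical names made up for illustration; they do not come from vaccineff, scoringutils or any other package.

```r
# Hypothetical class "epi_data" that requires the columns "id" and "date".
new_epi_data <- function(x) {
  stopifnot(is.data.frame(x))
  class(x) <- c("epi_data", class(x))
  x
}

# Hypothetical validator: errors if a required column is missing.
validate_epi_data <- function(x) {
  missing_cols <- setdiff(c("id", "date"), names(x))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(x)
}

x <- new_epi_data(data.frame(
  id = 1:3,
  date = as.Date("2024-04-23") + 0:2,
  value = c(10, 20, 30)
))

# Dropping "date" succeeds silently and the result still claims the class.
y <- x[, c("id", "value")]
still_claims_class <- inherits(y, "epi_data")

# The problem only surfaces later, when something finally validates the object.
late_error <- tryCatch(
  validate_epi_data(y),
  error = function(e) conditionMessage(e)
)
```

Nothing in the subsetting step tells the user that `y` is no longer a valid `epi_data` object, which is exactly the delayed failure described above.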

The tagging system introduced by the linelist R package provides a good solution to this issue. Tagged columns can be made required, and users are warned as soon as such a column is dropped. This is robust to all data wrangling operations and to column renaming.
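As a rough illustration of why tags can survive renaming, here is a base R sketch of the general idea. It is not the linelist or datatagr implementation: `tag_df()`, the `tagged_df` class and the `"tags"` attribute are hypothetical names chosen for this example.

```r
# Hypothetical constructor: `tags` maps a tag name to the column holding it.
tag_df <- function(x, tags) {
  stopifnot(is.data.frame(x), all(unlist(tags) %in% names(x)))
  attr(x, "tags") <- tags
  class(x) <- c("tagged_df", class(x))
  x
}

# Renaming method: rename the columns as usual, then point each tag
# at the new name of the column it referred to.
`names<-.tagged_df` <- function(x, value) {
  old_names <- names(x)
  out <- NextMethod()
  attr(out, "tags") <- lapply(
    attr(x, "tags"),
    function(col) value[match(col, old_names)]
  )
  out
}

cases <- tag_df(
  data.frame(id = 1:3, date_onset = as.Date("2024-04-23") + 0:2),
  tags = list(onset = "date_onset")
)

# Rename the tagged column; the "onset" tag follows it.
names(cases)[names(cases) == "date_onset"] <- "symptom_start"
onset_column <- attr(cases, "tags")$onset
```

Because downstream code looks columns up by tag rather than by name, the renamed column is still found. Warning on dropped columns would need a similar method for `[`, which is left out of this sketch.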

linelist itself is not the ideal solution because it focuses on a specific type of data (line list data), which may not match perfectly with the data needed in vaccineff / scoringutils / downstream packages.

The present datatagr R package, as a generalisation of linelist to generic data.frames, therefore provides the ideal foundation layer for packages that want to build S3 classes inheriting from data.frames with a safe validation system.

@nikosbosse

This is really cool! Will look into it.

@Bisaloo
Member Author

Bisaloo commented Apr 23, 2024

To clarify: datatagr is still in the early stages of development, but I wanted to open this issue now because this new role for the package may inform its development (which @chartgerink is leading).

You can see some preliminary discussion at https://github.com/orgs/epiverse-trace/discussions/221, and look into linelist to see a more specific version of what datatagr may be.

@chartgerink chartgerink added the question Further information is requested label Apr 24, 2024