Some considerations for superseding R packages #227

jamesmbaazam · 2023-05-03T14:30:26Z

jamesmbaazam
May 3, 2023
Collaborator

Most of our work so far in Epiverse has involved either writing a package from scratch or adopting an existing package that has its own maintenance capacity. In these cases, it is relatively easy to handle development decisions using internal policies in the blueprint and so forth. A third case, not yet common in our work, is taking over the maintenance of an existing package.

In this post, I would like to share some considerations and lessons learned from maintaining {bpmodels}, originally developed by @sbfnk, and from the decision to supersede it with {epichains}. The goal of the post is not to be dogmatic but to walk you through my considerations and start a conversation about what people found useful and what alternatives could have worked better.

Some of the decisions taken so far are:

Scope changes:
- examples:
  - plyr was split into dplyr for data.frames and purrr for lists
  - reshape was changed into reshape2 then into tidyr. Hadley mentions that each iteration of the package did less and less.
  - From ggmissing as a geom to {naniar}
Name change; a reimagining of the same tool; new API with improved features from old package:
- examples:
  - ggplot to ggplot2
  - reshape to reshape2 to tidyr
  - plyr being split into dplyr for data.frames and purrr for lists
  - Epiforecasts EpiNow vs Epiforecasts EpiNow2
  - RECON linelist vs Epiverse linelist
Deprecate or co-exist?
- Workflows to achieve either
  - I decided to fork the original repo to Epiverse and keep the old package to prevent breaking old scripts that use bpmodels
Commit histories: to keep or not to keep? (pros and cons). If to keep, how?
- squash earlier history from previous package?
- tag the HEAD commit and work up from there?
Semantic versioning: from scratch (0.0.0.9999)?
Lifecycle badge: which one to use (conceivably will change between the two packages over time until the former is retired like reshape2 -> tidyr)? (could refer to the imminent post on lifecycle badges #):
- experimental according to tidyverse?
- experimental according to reconverse?
- work in progress according to repo status?
Notes
Hadley Wickham's reasons for the need for reshape2 as a reboot of reshape.
Stack Overflow discussion about the "true" definition of legacy code.
Recent talk at UseR 2024 on retiring a package with many reverse dependencies
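
To make the "tag the HEAD commit and work up from there" option above concrete, here is a minimal sketch in a throwaway repository. It is only an illustration, not the workflow actually used for {epichains}: the tag name `bpmodels-final`, the commit messages, and the file contents are all hypothetical placeholders.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for a fork of the original package.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.org"

# Two commits standing in for the history inherited from the old package.
echo "Package: bpmodels" > DESCRIPTION
git add DESCRIPTION
git commit -q -m "Initial work on the original package"
echo "# bpmodels news" > NEWS.md
git add NEWS.md
git commit -q -m "Last commit under the old name"

# Keep the full history and mark where the old package ends.
# (Hypothetical tag name.)
git tag -a bpmodels-final -m "Last commit before superseding"

# New work continues on top of the tagged commit, with the version reset.
printf 'Package: epichains\nVersion: 0.0.0.9999\n' > DESCRIPTION
git commit -q -am "Start epichains from the bpmodels codebase"

# Commits inherited from the old package vs. commits added since the tag.
inherited=$(git rev-list --count bpmodels-final)
new=$(git rev-list --count bpmodels-final..HEAD)
echo "inherited=$inherited new=$new"
```

With this approach, earlier authorship stays visible in `git log`, and the tag gives a stable point for anyone who needs the old package's last state. Squashing would instead replace the inherited commits with a single one, at the cost of per-commit authorship.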

Bisaloo · 2023-06-07T09:05:56Z

Bisaloo
Jun 7, 2023
Maintainer

In terms of examples, the most famous one is probably ggplot vs ggplot2 (tidyverse/ggplot2@6198457). I'm assuming this example must have been discussed somewhere but can't find any resources right now.

Similarly, in the nearby ecosystem, we have incidence vs incidence2 (might be worth connecting with @TimTaylor to see if he'd be interested in collaborating on this blog post).

sbfnk · 2023-06-07T09:11:52Z

sbfnk
Jun 7, 2023
Maintainer

> In terms of examples, the most famous one is probably ggplot vs ggplot2 (tidyverse/ggplot2@6198457). I'm assuming this example must have been discussed somewhere but can't find any resources right now.

This is the squash-commit no-fork option, right? I can kind of see that making sense for a single author (which it seems to be in that case) but with multiple authors developing the original package I'd worry about hiding contributions.

Bisaloo · 2023-06-07T09:20:01Z

Bisaloo
Jun 7, 2023
Maintainer

I think it's actually a copy/paste of the original codebase, which yes, is equivalent to a squash. I'm not saying this is the right approach though. I mentioned it in case there is existing documentation on what precisely prompted them to switch to a new package. I know the general gist is "we made some bad design choices we couldn't get out of" but the specifics might be interesting.

jamesmbaazam · 2023-06-12T11:24:26Z

jamesmbaazam
Jun 12, 2023
Collaborator Author

> I'm assuming this example must have been discussed somewhere but can't find any resources right now.

Hadley explains the reason for {reshape2} as a reboot of {reshape} in the README here.
