Clarify stance on legacy support #328

Open
thomashoneyman opened this issue Feb 4, 2022 · 13 comments
Comments

@thomashoneyman
Member

thomashoneyman commented Feb 4, 2022

The PureScript registry is built to support packages using the new manifest format in a .purs.json file. However, we also support importing legacy packages from bower (#215), and the API supports adding new versions of legacy packages via the legacy flag:

https://github.com/purescript/registry/blob/d69c9baf804fb7040881a5c388c2926643393bb4/v1/Operation.dhall#L10-L11

However, the spec doesn't include any mention of what support we do and do not have for "legacy" (ie. Bower or Spago pre-registry) packages. We need to clarify this.

The registry will provide some support for legacy packages for at least the first year of operation. Registry support only means that we guarantee legacy packages can be added to the registry; we do not guarantee that registry packages remain compatible with legacy package managers like Bower. We do, however, guarantee support for older versions of Spago. In practice this means that we will preserve the existing package sets repository and continue publishing package sets usable with older versions of Spago there.

A legacy package is defined as a package that is published without a .purs.json manifest file, and it can be one of two things: a project containing a bower.json file, or a project containing a spago.dhall file and accompanying packages.dhall (ie. a Spago file using a configuration format pre-registry -- not the new proposed Spago format).

A legacy package can be uploaded to the registry via an Addition or Update operation by setting legacy: true. Setting the legacy flag means that the registry will:

  1. Attempt to parse the package's legacy package files, such as bower.json, spago.dhall, and packages.dhall files, in order to create a .purs.json manifest for the package
  2. Include the generated .purs.json file in the resulting package tarball
  3. Attempt to use the legacy package files to publish the package version to Pursuit (this will fail for Spago)
  4. Attempt to use the generated .purs.json file to publish the package version to the package sets

Legacy packages cannot use monorepo setups because their version is parsed from the ref provided in the addition or update operation. Attempting to upload legacy packages via a monorepo setup will fail.
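A minimal sketch of steps 1 and 2 for the Bower case, written in Python for brevity. The manifest field names follow the ones mentioned elsewhere in this thread (name, version, license, location, dependencies); the `githubOwner` / `githubRepo` location shape, the " AND " license join, and all function names are assumptions for illustration, not the registry's actual importer. Dependency ranges are passed through unchanged.

```python
import re

PREFIX = "purescript-"


def strip_prefix(name: str) -> str:
    """Drop the 'purescript-' prefix that Bower package names carry."""
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


def version_from_ref(ref: str) -> str:
    """Parse the version from a Git ref such as 'v4.0.0' or '4.0.0'.

    A ref that is not a plain version tag (for example a per-package tag in a
    monorepo) is rejected, which mirrors why legacy monorepos cannot work.
    """
    match = re.fullmatch(r"v?(\d+\.\d+\.\d+)", ref)
    if match is None:
        raise ValueError(f"ref {ref!r} is not a plain version tag")
    return match.group(1)


def manifest_from_bower(bower: dict, ref: str, owner: str, repo: str) -> dict:
    """Build a manifest-like dict from a parsed bower.json (hypothetical shape)."""
    licenses = bower.get("license", [])
    if isinstance(licenses, str):
        licenses = [licenses]
    return {
        "name": strip_prefix(bower["name"]),
        "version": version_from_ref(ref),
        # Assumption: multiple Bower licenses are joined into one expression.
        "license": " AND ".join(licenses),
        # Assumption: location is a GitHub owner/repo pair.
        "location": {"githubOwner": owner, "githubRepo": repo},
        "dependencies": {
            strip_prefix(dep): rng
            for dep, rng in bower.get("dependencies", {}).items()
        },
    }
```

Calling `manifest_from_bower` with the `purescript-effect` Bowerfile and the ref `v4.0.0` would yield a manifest named `effect` at version `4.0.0`, while a ref like `effect-v4.0.0` raises an error.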


@f-f I do need your help understanding exactly what has to happen to guarantee the packages.dhall file in the package-sets repo remains compatible with old Spago clients for the next year, even as we're publishing package sets using the new package sets format in the registry. This may need its own issue to track backwards-compatible mirroring from the registry to the package sets repo.

@thomashoneyman
Member Author

With the benefit of time and sleep I've rethought through why and how we will support legacy projects. In short, the registry has little ability to support legacy projects. But as PureScript maintainers we can take action elsewhere to keep the experience smooth. Below, I've proposed everything I think we can do to support these projects.

Before I go on, I should clarify what I mean by a "legacy" project. From the registry's perspective, a legacy project is any package without a purs.json file. More generally, a legacy project is any project that doesn't use the registry at all -- it uses the Bower registry via Pulp, or it uses the package sets repo via Spago.

What does it mean to "support" a legacy project? We want these projects to be able to publish to the new registry without dropping their current package manager or registry. We also want a smooth migration path for their current package manager to switch to the new registry.

There are only three places where we can provide this support, and the registry can provide only one:

  1. We can publish legacy packages to the new registry
  2. We can maintain existing package managers and registries
  3. We can update package managers to use the new registry

The registry is only relevant to publishing packages. Maintaining package managers and other registries is not its responsibility. But as PureScript maintainers, it's our responsibility.

So -- what support can we realistically provide?

Publishing

The publishing process is the same for all packages: open an issue on GitHub with an addition or update. We will fetch and upload your package. From the registry's perspective, a 'legacy' and a 'new' project differ only in the existence of a purs.json file. So the only thing we can do to make things easier is drop the requirement for a purs.json file -- we can still accept these projects via the legacy importer.

Outside the registry, we can go a step further. Pulp and Spago users won't be able to use the registry, but we could still provide a command that opens the GitHub issue for them. It would be a 'legacy publish' workflow. It could also generate a purs.json file.

Package Managers & Existing Registries

PureScript projects are either Pulp + Bower or Spago + package sets. None of these projects will be able to use the new registry, so there's nothing to update. Still, we need to take some care to ensure these users aren't hurt by the new registry.

With a new registry available, package authors may drop Bower support for their package. There's also no obligation to release any more package sets in the legacy format. Both of these would result in Pulp or Spago users losing access to packages over time.

As far as Bower goes -- we have no control over whether library authors choose to keep or drop Bower support, so this one is out of our hands. (We could force packages to have Bowerfiles when publishing to the registry, but I am against it. If others disagree then I can explain my reasoning.) Yet we are maintainers of two hundred or so libraries across the PureScript organizations. We can and should maintain Bowerfiles in those libraries to minimize the inconvenience to Bower users.

As far as the package sets, we have already committed to mirroring the new package sets back to the package-sets repository in the legacy format. Legacy Spago users will receive up-to-date package sets so long as we do this.

Those actions are about all we can do to ensure the new registry doesn't reduce packages available in other registries. We also need to update package managers to help users publish their legacy packages, as described in the "Publishing" section above.

Package Managers & The New Registry

Of course, package managers won't always be in legacy mode. Spago and Pulp can both be updated to use the new registry proper -- generating purs.json files, using the registry index to locate packages, and downloading and extracting the tarballs. In both cases we want to provide a smooth migration path, but that discussion is better done in the respective repositories. It's really the package manager's responsibility to handle the migration well.


With all this in mind, I think our path forward should be this:

  1. Allow Bower-based packages to be published to the registry with only a bower.json file for at least a year
  2. Allow Spago-based packages to be published to the registry with only a legacy spago.dhall and packages.dhall pair for at least a year. These packages will not automatically be published to Pursuit, because that requires a bower.json or purs.json file to be present.

This covers the publishing side. What about existing registries / package sets?

  1. Guarantee that we generate package sets that are compatible with legacy Spago for a year.
  2. Commit to maintaining Bowerfiles in all core PureScript libraries for a year

And what about current package managers?

  1. Add a command to Spago and Pulp that generates a .purs.json file for them and a command that publishes the user's legacy package to the new registry.

Package managers should use the new registry in their new versions, but this isn't necessary right away, so I'm not listing it.

@f-f
Member

f-f commented Feb 5, 2022

@thomashoneyman this is all great! 👏

A couple of notes:

  • I opened Package sets naming convention and mirroring #329 to track the mirroring of package sets from here to the legacy location
  • the main thing that we need to do to make sure that packages can be used by old versions of Spago is that we don't allow anyone to use the subdir key at all. This is because Spago assumes that "1 repo = 1 package", and will look for sources in the src folder at the root of the repo.
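The "1 repo = 1 package" constraint can be sketched as a simple filter over package locations. This is only an illustration in Python: the `subdir` key is the one named above, while the location shape and the function names are assumptions.

```python
def usable_by_legacy_spago(location: dict) -> bool:
    """True when the package sources live at the repository root.

    Legacy Spago looks for sources in src/ at the root of the repo, so any
    non-empty subdir value rules the package out.
    """
    return not location.get("subdir")


def mirrorable(packages: dict) -> list:
    """Names of the packages that could be mirrored to the legacy package sets."""
    return sorted(
        name for name, location in packages.items()
        if usable_by_legacy_spago(location)
    )
```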

@thomashoneyman
Member Author

thomashoneyman commented Nov 6, 2023

It's been quite a while since we've talked about this issue, but from what I can tell this is a little out of date. Some things have happened:

  1. The registry went into alpha about 1 year ago
  2. We have decided to permanently support alternate manifest formats from purs.json (Accept multiple package manifest formats #435, with spago support in Read spago.yaml files and use them to create Manifests #593)
  3. Spago has been rewritten in PureScript, so while spago.dhall is "legacy", spago.yaml is not. There are no plans to publish legacy spago packages to the registry (instead, switch to the new spago).
  4. Package sets have been continuously mirrored for more than a year

We still have a blocker: as soon as we allow monorepos, the subdir key, or non-GitHub providers, support for bower, old spago, pulp, and package set mirroring will all end. However, we're well past the window of intended legacy support for bower and I think we should feel free to implement those changes as soon as we want to.

In other words, there is no more guaranteed legacy support, though we haven't decided when it will actually end. Given that there is no active work to build the needed support for monorepos / non-GitHub into the compiler / Pursuit, this will probably be the situation for a long time. Things are pretty much ready to go on the registry side.

@JordanMartinez
Contributor

Given that there is no active work to build the needed support for monorepos / non-GitHub into the compiler / Pursuit, this will probably be the situation for a long time. Things are pretty much ready to go on the registry side.

I think there is active work here. The work on spago docs implements a majority of Pursuit. It's just a matter of making the rest into an actual web server we deploy. Also, if we exposed a purs api subcommand, we might be able to defer some compiler-specific checks to there.

@thomashoneyman
Member Author

I see. If that's intended to become the next iteration of Pursuit then indeed there is work being done! In that case, I'm happy to hear it, and I'd like to join any conversations around building monorepo / non-GitHub support into the next iteration of Pursuit. We'll still need changes on the compiler side / purs publish too — largely by relaxing unnecessary restrictions the compiler imposes for historical reasons.

Still, I think that we're well within our rights at this point to terminate legacy support once we have implemented those changes and are ready to deploy a new Pursuit.

@f-f
Member

f-f commented Nov 7, 2023

Yes to all of this, and I think purs publish is the basis for any further work - purs publish currently produces the docs for Pursuit, and it requires a tag to be specified, and sources in the root.

So the dependency goes: purs publish work -> new Pursuit -> then we can allow monorepo and non-GitHub things

By now I forgot what purs publish does, but I wonder if entirely bypassing it is an option, i.e. produce the data that Pursuit needs in some other way

@thomashoneyman
Member Author

Just took a quick look and I think we probably could produce the format outside of purs publish. The process is a bunch of checks culminating in this type:
https://github.com/purescript/purescript/blob/6b49918b9646863e73bbedd1d47f474ba3783408/src/Language/PureScript/Docs/Types.hs#L51-L68

And the publish command just prints it out on the command line:
https://github.com/purescript/purescript/blob/6b49918b9646863e73bbedd1d47f474ba3783408/app/Command/Publish.hs#L74-L80

You can see that tools like pulp just push that off to Pursuit directly, first by gzipping the json they got:
https://github.com/purescript-contrib/pulp/blob/78f7aa6ae76337ea6f8362cac03fb1c8ee1858cd/src/Pulp/Publish.purs#L68

...and then uploading it.
https://github.com/purescript-contrib/pulp/blob/78f7aa6ae76337ea6f8362cac03fb1c8ee1858cd/src/Pulp/Publish.purs#L85
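The gzip step is simple enough to sketch. The snippet below (Python, standard library only) serializes a payload to JSON and compresses it, plus the inverse for checking the round trip. The function names are hypothetical, and the upload request itself is left out because its endpoint and authentication are not described in this thread.

```python
import gzip
import json


def encode_for_upload(package: dict) -> bytes:
    """Serialize the package payload to JSON and gzip it."""
    body = json.dumps(package).encode("utf-8")
    return gzip.compress(body)


def decode_upload(blob: bytes) -> dict:
    """Inverse of encode_for_upload, handy for checking the round trip."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```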

So I do think that so long as we can produce this same 'Package' type with the same JSON encoding then we could bypass purs publish altogether. The main function we'll probably have to worry about is the one that gathers up all the modules to include in the payload after you've generated the docs:

https://github.com/purescript/purescript/blob/6b49918b9646863e73bbedd1d47f474ba3783408/src/Language/PureScript/Docs/Collect.hs#L43-L50

If we can replicate this function then we can likely replace purs publish altogether. That's overall tracked in #525.


These are the checks done — for example, this GitHub-only restriction on the package:
https://github.com/purescript/purescript/blob/6b49918b9646863e73bbedd1d47f474ba3783408/src/Language/PureScript/Publish/Registry/Compat.hs#L78-L86

This module describes various errors purs publish checks for:
https://github.com/purescript/purescript/blob/6b49918b9646863e73bbedd1d47f474ba3783408/src/Language/PureScript/Publish/ErrorsWarnings.hs

The actual publishing process runs these checks to eventually produce a 'package':
https://github.com/purescript/purescript/blob/6b49918b9646863e73bbedd1d47f474ba3783408/src/Language/PureScript/Publish.hs#L132-L133

@thomashoneyman
Member Author

@JordanMartinez does the work on spago docs include parsing and rendering documentation pages based on the format output by purs docs?

@JordanMartinez
Contributor

@thomashoneyman I'm not entirely sure. While I started working on Pursuit stuff, I turned my focus towards Spago because of the issues I ran into when using it for the new Pursuit. Plus, I figured getting the new Spago out sooner would be better overall before using it for the new Pursuit.

The Pursuit work was being done privately, but I've made what I have done public here: https://github.com/JordanMartinez/purescript-pursuit

As for what the compiler and pulp do, I've summarized them both here: https://github.com/JordanMartinez/purescript-pursuit/blob/main/docs/publish-docs.md

@thomashoneyman
Member Author

The registry already does everything that pulp / bower did (produce a resolutions file, etc.), and it also has all the checks we want from purs publish integrated into the publish API. The only part that it's missing is the step between compiling the package and producing a Package from the output directory, for which it defers to purs publish.

For the Pursuit part specifically, see:

publishToPursuit
  :: forall r
   . PublishToPursuit
  -> Run (PURSUIT + COMMENT + LOG + EXCEPT String + AFF + EFFECT + r) Unit
publishToPursuit { packageSourceDir, dependenciesDir, compiler, resolutions } = do

With this in mind, I think we could focus on a) producing the package format Pursuit requires here in the registry and removing the call to Purs.Publish and b) ensuring a Pursuit rewrite can still use the existing Package format to display hyperlinked docs. If both of those are possible then we're off to the races.

@thomashoneyman
Member Author

A little more detail — here's an example of the specific output purs publish produces for a package:
https://github.com/purescript/pursuit-backups/blob/master/purescript-effect/4.0.0.json

By far the bulk of the information is at the modules key, which is an array of objects where each object is the documentation for a specific module. Turns out that these objects are literally just the docs.json files for each module in the output directory, so that key is easy to produce: slurp up the docs file the compiler produced for each module.

The rest of the JSON object for effect-4.0.0 looks like this, with comments indicating how we get each piece of information from the existing package manifest, metadata entry, or another source:

{
  // The registry would always set the uploader to 'pacchettibotti'
  // (this existing upload shows its original uploader instead)
  "uploader": "JordanMartinez",

  // This is a Bower file format; we can easily produce one from a Manifest.
  "packageMeta": {
    // Manifest.location
    "homepage": "https://github.com/purescript/purescript-effect",
    // Manifest.location
    "repository": {
      "url": "https://github.com/purescript/purescript-effect.git",
      "type": "git"
    },
    // Manifest.excludeFiles potentially, or omit this altogether since Pursuit doesn't use it
    "ignore": [ "bower_components" ],
    // Manifest.dependencies, with 'purescript-' prepended to each one
    "dependencies": { "purescript-prelude": "^6.0.0" },
    // Manifest.name, with 'purescript-' prepended
    "name": "purescript-effect",
    // Manifest.license in an array singleton
    "license": ["BSD-3-Clause"]
  },

  // Metadata.publishedTime 
  "tagTime": "2022-04-27T14:04:24+0000",

  // Every object in this array is the same as the contents of a docs.json in output, ie. output/Effect/docs.json;
  // we require all packages compile so we can always get the docs.
  "modules": [],

  // When we publish a package we produce exact resolutions so we can reuse those
  // or set this to an empty object if Pursuit doesn't use it
  // https://github.com/purescript/registry-dev/blob/f68685892062f68f3d697e59a95c4d836c1d5170/app/src/App/API.purs#L683
  "resolvedDependencies": { "purescript-prelude": "6.0.0" },

  // Manifest.version
  "version": "4.0.0",

  // Manifest.location, filtered to only accept the GitHub location type, until Pursuit can accept more
  "github": ["purescript", "purescript-effect"],

  // Metadata.ref, until Pursuit stops requiring Git tags
  "versionTag": "v4.0.0",

  // We can easily do this with the existing associateModules function
  // https://github.com/purescript/registry-dev/blob/f68685892062f68f3d697e59a95c4d836c1d5170/lib/src/PursGraph.purs#L64
  //
  // and indeed we already do this in the publishing process for legacy packages
  // https://github.com/purescript/registry-dev/blob/f68685892062f68f3d697e59a95c4d836c1d5170/app/src/App/API.purs#L613
  "moduleMap": {
    "Data.BooleanAlgebra": "purescript-prelude",
    "Data.Ring": "purescript-prelude",
    "Data.Semigroup.Generic": "purescript-prelude",
    "Data.Monoid.Generic": "purescript-prelude"
  },

  // The compiler version is required in the publishing process so we can reuse it
  "compilerVersion": "0.14.5"
}

In other words, once we have solved and compiled the package we're publishing, we have all the information we need to also produce the JSON payload Pursuit expects for publishing. We can bypass purs publish altogether.
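To make the mapping concrete, here is a rough Python sketch that assembles the payload from the pieces listed in the comments above. It is only an illustration: the dict shapes for the manifest and metadata (including the `githubOwner` / `githubRepo` location keys), and the function names are assumptions, and `modules` is taken as an input rather than read from the output directory. The optional `ignore` key is omitted.

```python
def bower_name(name: str) -> str:
    """Pursuit expects Bower-style names, i.e. with 'purescript-' prepended."""
    return "purescript-" + name


def pursuit_payload(manifest: dict, metadata: dict, modules: list,
                    module_map: dict, resolutions: dict, compiler: str) -> dict:
    """Assemble the JSON object Pursuit expects (hypothetical input shapes)."""
    # Assumption: only GitHub locations are accepted, as noted above.
    owner = manifest["location"]["githubOwner"]
    repo = manifest["location"]["githubRepo"]
    url = f"https://github.com/{owner}/{repo}"
    return {
        "uploader": "pacchettibotti",
        "packageMeta": {
            "homepage": url,
            "repository": {"url": url + ".git", "type": "git"},
            "dependencies": {
                bower_name(dep): rng
                for dep, rng in manifest["dependencies"].items()
            },
            "name": bower_name(manifest["name"]),
            "license": [manifest["license"]],
        },
        # Assumption: the time is passed through without reformatting.
        "tagTime": metadata["publishedTime"],
        "modules": modules,
        "resolvedDependencies": {
            bower_name(dep): version for dep, version in resolutions.items()
        },
        "version": manifest["version"],
        "github": [owner, repo],
        "versionTag": metadata["ref"],
        "moduleMap": module_map,
        "compilerVersion": compiler,
    }
```

Feeding in the effect 4.0.0 manifest, a metadata entry with ref `v4.0.0`, and the resolution `prelude` at `6.0.0` gives back the same top-level keys as the example above, apart from `ignore`.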

@thomashoneyman
Member Author

thomashoneyman commented Nov 12, 2023

@JordanMartinez perhaps a Pursuit rewrite in PureScript could have a core module that exposes this 'publish' type, which the registry could import and use in the application. Similar to how spago-core imports the registry library (registry-lib), but then the registry imports spago-core into the main application.

@thomashoneyman
Member Author

(Turns out I wasn't quite right, because the docs.json format is incomplete — see #525 for details)
