Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require all packages to solve / compile and include all valid compilers in their metadata #669

Open
wants to merge 56 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 42 commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
0b6dd4e
Add 'compilers' field to metadata
thomashoneyman Nov 8, 2023
e15e4a8
Add utilities for building with many compilers
thomashoneyman Nov 11, 2023
d8e7e41
Remove PackageSource and require all packages to solve/compile
thomashoneyman Nov 11, 2023
8e069b6
Determine all compilers for package in publish pipeline
thomashoneyman Nov 11, 2023
5348ee2
Initial cut at discovering compiler in legacy import
thomashoneyman Nov 12, 2023
630c0bf
Always look up metadata / manifests in each publishing step
thomashoneyman Nov 12, 2023
77d6e68
Testing the pipeline...
thomashoneyman Nov 13, 2023
8749bea
Better reporting of failures
thomashoneyman Nov 13, 2023
be93d18
Update union of package set / spago / bower deps, consider ranges in …
thomashoneyman Nov 14, 2023
5a15433
Include spago.yaml files in legacy import
thomashoneyman Nov 15, 2023
559275c
Retain compilation in cache
thomashoneyman Nov 15, 2023
09d515a
Consider compilers when solving
thomashoneyman Nov 16, 2023
98ef892
Rely on solver per-compiler instead of looking at metadata for compat…
thomashoneyman Nov 16, 2023
ae621da
Adjust unused dependency pruning to replace used transitive deps
thomashoneyman Nov 17, 2023
5c54103
Remove unused functions
thomashoneyman Nov 17, 2023
441b960
wip
thomashoneyman Nov 17, 2023
3495edb
Use cache when finding first suitable compiler
thomashoneyman Nov 19, 2023
7ceab4c
WIP: Include missing direct imports
thomashoneyman Nov 19, 2023
3b85cd5
No longer try to insert missing dependencies
thomashoneyman Nov 19, 2023
3fa90b5
Address internal comments
thomashoneyman Nov 20, 2023
628fdf0
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Nov 20, 2023
0d3cef9
Re-enable comment
thomashoneyman Nov 20, 2023
4e8cb87
Remove unnecessary
thomashoneyman Nov 20, 2023
d7c4180
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 1, 2023
81c85a4
Fix 'removed packages' stats
thomashoneyman Dec 1, 2023
10bccee
Feedback
thomashoneyman Dec 1, 2023
26c5aa0
Always print publish stats
thomashoneyman Dec 1, 2023
b11917e
tweaks
thomashoneyman Dec 4, 2023
3ddde82
Better publish stats formatting and write removals
thomashoneyman Dec 4, 2023
ec388d1
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 4, 2023
5b17cb3
Update flake
thomashoneyman Dec 5, 2023
f924b31
Integrate inserting missing dependencies
thomashoneyman Dec 7, 2023
3cdb9b9
Tweaks for efficiency
thomashoneyman Dec 7, 2023
d0181e5
(hopefully) final run of the importer
thomashoneyman Dec 8, 2023
6f9f0cd
Update spec to note transitive dependencies requirement.
thomashoneyman Dec 8, 2023
2721c6a
attempt to discover publish compiler with both legacy and current ind…
thomashoneyman Dec 8, 2023
f8d0f80
Tweaks
thomashoneyman Dec 10, 2023
e2d6e87
Patch some legacy manifests
thomashoneyman Dec 10, 2023
b8a21a8
Range tweaks for bolson/deku/rito
thomashoneyman Dec 11, 2023
3d7ab49
Update to fix darwin support for spago builds
thomashoneyman Dec 18, 2023
6bc8d09
Clean up publish stats
thomashoneyman Dec 18, 2023
9acbc94
Enforce an explicit 0.13 date cutoff / core org cutoff
thomashoneyman Dec 19, 2023
d2c3b9a
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Jan 5, 2024
bea2013
Move location check above manifest parse
thomashoneyman Jan 17, 2024
c942722
Merge branch 'master'
thomashoneyman Jul 29, 2024
637a757
format
thomashoneyman Jul 29, 2024
ab184f2
Fix octokit codec merge error
thomashoneyman Jul 29, 2024
9cc56e7
Revert "Fix octokit codec merge error"
thomashoneyman Jul 29, 2024
c05fcb9
Set compiler explicitly to 0.15.5
thomashoneyman Jul 29, 2024
637488d
Tweaks
thomashoneyman Jul 29, 2024
662dd00
Set all purs test compilers to 0.15.4 range
thomashoneyman Jul 29, 2024
8156aa2
Update retry logic to fix integration test
thomashoneyman Jul 30, 2024
ed7913c
Complete run of legacy importer
thomashoneyman Aug 26, 2024
ec8e3ff
Format
thomashoneyman Aug 26, 2024
d7d5e49
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Aug 29, 2024
7d74da3
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Oct 25, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions SPEC.md
Original file line number Diff line number Diff line change
Expand Up @@ -207,7 +207,7 @@ Note:

- Globs you provide at the `includeFiles` and `excludeFiles` keys must contain only `*`, `**`, `/`, `.`, `..`, and characters for Linux file paths. It is not possible to negate a glob (ie. the `!` character), and globs cannot represent a path out of the package source directory.
- When packaging your project source, the registry will first "include" your `src` directory and always-included files such as your `purs.json` file. Then it will include files which match globs indicated by the `includeFiles` key ([always-ignored files](#always-ignored-files) cannot be included). Finally, it will apply the excluding globs indicated by the `excludeFiles` key to the included files ([always-included files](#always-included-files) cannot be excluded).
- Dependencies you provide at the `dependencies` key must exist in the registry, and the dependency ranges must be solvable (ie. it must be possible to produce a single version of each dependency that satisfies the provided version bounds, including any transitive dependencies).
- Dependencies you provide at the `dependencies` key must exist in the registry, the dependency ranges must be solvable (ie. it must be possible to produce a single version of each dependency that satisfies the provided version bounds, including any transitive dependencies), and transitive dependencies are not allowed (ie. any modules you import in your code must come from packages listed in your dependencies).

For example:

Expand All @@ -234,11 +234,12 @@ For example:

All packages in the registry have an associated metadata file, which is located in the `metadata` directory of the `registry` repository under the package name. For example, the metadata for the `aff` package is located at: https://github.com/purescript/registry/blob/main/metadata/aff.json. Metadata files are the source of truth on all published and unpublished versions for a particular package for what there content is and where the package is located. Metadata files are produced by the registry, not by package authors, though they take some information from package manifests.

Each published version of a package records three fields:
Each published version of a package records four fields:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely we can make this more future proof 😄

Suggested change
Each published version of a package records four fields:
Each published version of a package records the following fields:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think you're right 😆


- `hash`: a [`Sha256`](#Sha256) of the compressed archive fetched by the registry for the given version
- `bytes`: the size of the tarball in bytes
- `publishedTime`: the time the package was published as an `ISO8601` string
- `compilers`: compiler versions this package is known to work with. This field can be in one of two states: a single version indicates that the package worked with a specific compiler on upload but has not yet been tested with all compilers, whereas a non-empty array of versions indicates the package has been tested with all compilers the registry supports.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be tidier to only allow a non-empty array instead of several possible types? After all, the state with multiple compilers listed is going to be a superset of the first state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue with the non-empty array is that it isn't clear whether an array of a single element represents one of:

  • a package that has been published with the given compiler, but which hasn't been tested against the full set of compilers
  • a package that has been tested against the full set of compilers and only works with one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When are we going to end up in a situation where we don't test the package against the whole set of compilers? My reading of the PR is that we always do?

In any case, we'll always have packages that are not "tested against the full set of compilers": when a new compiler version comes out, then all packages will need a retest, and if a package doesn't have the new compiler in the array then we don't know if it's not compatible or if it hasn't been tested yet.

Maybe we need another piece of state somewhere else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When are we going to end up in a situation where we don't test the package against the whole set of compilers? My reading of the PR is that we always do?

Yes, as implemented here we just go ahead and test everything as soon as we've published. However, I split out the state because in our initial discussions we worried about how long it takes for the compiler builds to run (it takes publishing from N seconds to N minutes in some cases — large libraries or ones that leverage a lot of type machinery). We'd originally talked about the compiler matrix being a cron job that runs later in the day. I just made it part of the publishing pipeline directly because it was simpler to implement.

If we decide that it's OK for publishing to take a long time then we can eliminate this state and just test the compilers immediately. In that case we'd just have a non-empty array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, we'll always have packages that are not "tested against the full set of compilers": when a new compiler version comes out, then all packages will need a retest, and if a package doesn't have the new compiler in the array then we don't know if it's not compatible or if it hasn't been tested yet.

Yea, that's a good point. You don't know if the metadata you're reading just hasn't been reached yet by an ongoing mass compiler build to check a new compiler.

Maybe we need another piece of state somewhere else?

Off the top of my head I don't know a good place to put some state about possible compiler support; the metadata files are not helpful if a new compiler comes out and we're redoing the build since they're only aware of the one package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide that it's OK for publishing to take a long time then we can eliminate this state and just test the compilers immediately. In that case we'd just have a non-empty array.

I'm cool with this if you are.

We'll always have packages that are not "tested against the full set of compilers" [...] maybe we need another piece of state somewhere else?

We could either a) say that the supported list of compilers for a package can potentially be missing the current compiler if the matrix is currently running and not bother with state or b) put a JSON file or something in the metadata directory that indicates whether the compiler matrix is running. Then consumers can look at that.

Personally the matrix runs infrequently enough (just new compiler releases!) that I would rather opt for (a).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pondered this for a few days and I think it's complicated?

Since we're going towards a model where we'd only run one registry job at a time and queue the rest (to prevent concurrent pushes to the repo), I'm afraid that running the whole matrix at once would make publishing very slow.
Something that we could do to counteract this could be to split the "publish" and the "matrix runs": on publishing we'd just add the package metadata with one compiler, and at the end of the publishing job we'd queue a series of "compiler matrix" jobs, each testing one compiler. These jobs would be of low priority, so new publishes would get in front of the queue, and things can stay snappy.

Personally the matrix runs infrequently enough (just new compiler releases!) that I would rather opt for (a).

The approach detailed above implies that we're in a world where we do (a), i.e. the list of compilers is always potentially out of date, and that's fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional note about the above: since the above would be introducing an "asynchronous matrix builder", we need to consider the dependency tree in our rebuilding: if a package A is published with compiler X, and then a package B depending on it is immediately published after it (a very common usecase since folks seem to publish their packages in batches), then we'd need to either make sure that matrix-build jobs for B are always run after matrix-build jobs for A, or retry them somehow.


Each unpublished version of a package records three fields:

Expand Down
12 changes: 12 additions & 0 deletions app/fixtures/github-packages/transitive-1.0.0/bower.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"name": "purescript-transitive",
"homepage": "https://github.com/purescript/purescript-transitive",
"license": "BSD-3-Clause",
"repository": {
"type": "git",
"url": "https://github.com/purescript/purescript-transitive.git"
},
"dependencies": {
"purescript-effect": "^4.0.0"
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
module Transitive where

import Prelude

uno :: Int
uno = one
1 change: 1 addition & 0 deletions app/fixtures/registry/metadata/prelude.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
"published": {
"6.0.1": {
"bytes": 31142,
"compilers": ["0.15.10", "0.15.11", "0.15.12"],
"hash": "sha256-o8p6SLYmVPqzXZhQFd2hGAWEwBoXl1swxLG/scpJ0V0=",
"publishedTime": "2022-08-18T20:04:00.000Z",
"ref": "v6.0.1"
Expand Down
1 change: 1 addition & 0 deletions app/fixtures/registry/metadata/type-equality.json
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
"published": {
"4.0.1": {
"bytes": 2184,
"compilers": ["0.15.9", "0.15.10", "0.15.11"],
"hash": "sha256-Hs9D6Y71zFi/b+qu5NSbuadUQXe5iv5iWx0226vOHUw=",
"publishedTime": "2022-04-27T18:00:18.000Z",
"ref": "v4.0.1"
Expand Down
965 changes: 576 additions & 389 deletions app/src/App/API.purs

Large diffs are not rendered by default.

11 changes: 6 additions & 5 deletions app/src/App/CLI/Git.purs
Original file line number Diff line number Diff line change
Expand Up @@ -111,11 +111,12 @@ gitPull { address: { owner, repo }, pullMode } cwd = Except.runExcept do
, " has no untracked or dirty files, it is safe to pull the latest."
]
pure true
Just files -> do
Log.debug $ Array.fold
[ "Some files are untracked or dirty in local checkout of " <> cwd <> ": "
, NonEmptyArray.foldMap1 (append "\n - ") files
]
Just _files -> do
-- This is a bit noisy, so commenting it out for now.
-- Log.debug $ Array.fold
-- [ "Some files are untracked or dirty in local checkout of " <> cwd <> ": "
-- , NonEmptyArray.foldMap1 (append "\n - ") files
-- ]
Log.warn $ Array.fold
[ "Local checkout of " <> formatted
, " has untracked or dirty files, it may not be safe to pull the latest."
Expand Down
26 changes: 26 additions & 0 deletions app/src/App/CLI/Purs.purs
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,16 @@ import Node.ChildProcess.Types (Exit(..))
import Node.Library.Execa as Execa
import Registry.Version as Version

-- | The minimum compiler version that supports 'purs graph'
minPursGraph :: Version
minPursGraph = unsafeFromRight (Version.parse "0.14.0")

minPursuitPublish :: Version
minPursuitPublish = unsafeFromRight (Version.parse "0.14.7")

minLanguageCSTParser :: Version
minLanguageCSTParser = unsafeFromRight (Version.parse "0.15.0")

-- | Call a specific version of the PureScript compiler
callCompiler_ :: { version :: Maybe Version, command :: PursCommand, cwd :: Maybe FilePath } -> Aff Unit
callCompiler_ = void <<< callCompiler
Expand All @@ -22,6 +32,22 @@ data CompilerFailure
| MissingCompiler

derive instance Eq CompilerFailure
derive instance Ord CompilerFailure

compilerFailureCodec :: JsonCodec CompilerFailure
compilerFailureCodec = CA.codec' decode encode
where
decode :: Json -> Either JsonDecodeError CompilerFailure
decode json =
map CompilationError (CA.decode (CA.array compilerErrorCodec) json)
<|> map UnknownError (CA.decode CA.string json)
<|> map (const MissingCompiler) (CA.decode CA.null json)

encode :: CompilerFailure -> Json
encode = case _ of
CompilationError errors -> CA.encode (CA.array compilerErrorCodec) errors
UnknownError message -> CA.encode CA.string message
MissingCompiler -> CA.encode CA.null unit

type CompilerError =
{ position :: SourcePosition
Expand Down
9 changes: 1 addition & 8 deletions app/src/App/Effect/Cache.purs
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,6 @@ handleMemoryFs env = case _ of
case inFs of
Nothing -> pure $ reply Nothing
Just entry -> do
Log.debug $ "Fell back to on-disk entry for " <> memory
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just so noisy. Maybe we can introduce a Log.superDebug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this log useful at all? I think it's ok to just remove it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think they're not really useful now that we're confident the cache works correctly. I had them in there from when I first developed it and would either sometimes see things I thought should be cached not get cached, or I wanted to make sure something I removed from the cache really was.

putMemoryImpl env.ref unit (Key memory (Const entry))
pure $ reply $ Just $ unCache entry
Just cached ->
Expand Down Expand Up @@ -226,8 +225,7 @@ getMemoryImpl ref (Key id (Reply reply)) = do
let (unCache :: CacheValue -> b) = unsafeCoerce
cache <- Run.liftEffect $ Ref.read ref
case Map.lookup id cache of
Nothing -> do
Log.debug $ "No cache entry found for " <> id <> " in memory."
Nothing ->
pure $ reply Nothing
Just cached -> do
pure $ reply $ Just $ unCache cached
Expand All @@ -236,7 +234,6 @@ putMemoryImpl :: forall x r a. CacheRef -> a -> MemoryEncoding Const a x -> Run
putMemoryImpl ref next (Key id (Const value)) = do
let (toCache :: x -> CacheValue) = unsafeCoerce
Run.liftEffect $ Ref.modify_ (Map.insert id (toCache value)) ref
Log.debug $ "Wrote cache entry for " <> id <> " in memory."
pure next

deleteMemoryImpl :: forall x r a. CacheRef -> MemoryEncoding Ignore a x -> Run (LOG + EFFECT + r) a
Expand Down Expand Up @@ -275,7 +272,6 @@ getFsImpl cacheDir = case _ of
let path = Path.concat [ cacheDir, safePath id ]
Run.liftAff (Aff.attempt (FS.Aff.readFile path)) >>= case _ of
Left _ -> do
Log.debug $ "No cache found for " <> id <> " at path " <> path
pure $ reply Nothing
Right buf -> do
pure $ reply $ Just buf
Expand All @@ -284,7 +280,6 @@ getFsImpl cacheDir = case _ of
let path = Path.concat [ cacheDir, safePath id ]
Run.liftAff (Aff.attempt (FS.Aff.readTextFile UTF8 path)) >>= case _ of
Left _ -> do
Log.debug $ "No cache file found for " <> id <> " at path " <> path
pure $ reply Nothing
Right content -> case Argonaut.Parser.jsonParser content of
Left parseError -> do
Expand All @@ -307,7 +302,6 @@ putFsImpl cacheDir next = case _ of
Log.warn $ "Failed to write cache entry for " <> id <> " at path " <> path <> " as a buffer: " <> Aff.message fsError
pure next
Right _ -> do
Log.debug $ "Wrote cache entry for " <> id <> " as a buffer at path " <> path
pure next

AsJson id codec (Const value) -> do
Expand All @@ -317,7 +311,6 @@ putFsImpl cacheDir next = case _ of
Log.warn $ "Failed to write cache entry for " <> id <> " at path " <> path <> " as JSON: " <> Aff.message fsError
pure next
Right _ -> do
Log.debug $ "Wrote cache entry for " <> id <> " at path " <> path <> " as JSON."
pure next

deleteFsImpl :: forall a b r. FilePath -> FsEncoding Ignore a b -> Run (LOG + AFF + r) a
Expand Down
4 changes: 2 additions & 2 deletions app/src/App/Effect/GitHub.purs
Original file line number Diff line number Diff line change
Expand Up @@ -241,8 +241,8 @@ request octokit githubRequest@{ route: route@(GitHubRoute method _ _), codec } =
-- auto-expire cache entries. We will be behind GitHub at most this amount per repo.
--
-- TODO: This 'diff' check should be removed once we have conditional requests.
Right _ | DateTime.diff now prevResponse.modified >= Duration.Hours 4.0 -> do
Log.debug $ "Found cache entry but it was modified more than 4 hours ago, refetching " <> printedRoute
Right _ | DateTime.diff now prevResponse.modified >= Duration.Hours 23.0 -> do
Log.debug $ "Found cache entry but it was modified more than 23 hours ago, refetching " <> printedRoute
result <- requestWithBackoff octokit githubRequest
Cache.put _githubCache (Request route) (result <#> \resp -> { response: CA.encode codec resp, modified: now, etag: Nothing })
pure result
Expand Down
2 changes: 1 addition & 1 deletion app/src/App/Effect/PackageSets.purs
Original file line number Diff line number Diff line change
Expand Up @@ -428,7 +428,7 @@ validatePackageSet (PackageSet set) = do
-- We can now attempt to produce a self-contained manifest index from the
-- collected manifests. If this fails then the package set is not
-- self-contained.
Tuple unsatisfied _ = ManifestIndex.maximalIndex (Set.fromFoldable success)
Tuple unsatisfied _ = ManifestIndex.maximalIndex ManifestIndex.IgnoreRanges (Set.fromFoldable success)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always ignore ranges in package sets, but we should rely on them otherwise, especially now that we're actually solving packages as part of publishing and can be more trusting that they aren't bogus.


-- Otherwise, we can check if we were able to produce an index from the
-- package set alone, without errors.
Expand Down
13 changes: 7 additions & 6 deletions app/src/App/Effect/Registry.purs
Original file line number Diff line number Diff line change
Expand Up @@ -251,7 +251,7 @@ handle env = Cache.interpret _registryCache (Cache.handleMemory env.cacheRef) <<
let formatted = formatPackageVersion name version
Log.info $ "Writing manifest for " <> formatted <> ":\n" <> printJson Manifest.codec manifest
index <- Except.rethrow =<< handle env (ReadAllManifests identity)
case ManifestIndex.insert manifest index of
case ManifestIndex.insert ManifestIndex.ConsiderRanges manifest index of
Left error ->
Except.throw $ Array.fold
[ "Can't insert " <> formatted <> " into manifest index because it has unsatisfied dependencies:"
Expand All @@ -274,7 +274,7 @@ handle env = Cache.interpret _registryCache (Cache.handleMemory env.cacheRef) <<
let formatted = formatPackageVersion name version
Log.info $ "Deleting manifest for " <> formatted
index <- Except.rethrow =<< handle env (ReadAllManifests identity)
case ManifestIndex.delete name version index of
case ManifestIndex.delete ManifestIndex.ConsiderRanges name version index of
Left error ->
Except.throw $ Array.fold
[ "Can't delete " <> formatted <> " from manifest index because it would produce unsatisfied dependencies:"
Expand Down Expand Up @@ -358,7 +358,7 @@ handle env = Cache.interpret _registryCache (Cache.handleMemory env.cacheRef) <<

Just metadata -> do
Log.debug $ "Successfully read metadata for " <> printedName <> " from path " <> path
Log.debug $ "Setting metadata cache to singleton entry (as cache was previosuly empty)."
Log.debug $ "Setting metadata cache to singleton entry (as cache was previously empty)."
Cache.put _registryCache AllMetadata (Map.singleton name metadata)
pure $ Just metadata

Expand Down Expand Up @@ -835,8 +835,9 @@ readManifestIndexFromDisk root = do

entries <- map partitionEithers $ for packages.success (ManifestIndex.readEntryFile root)
case entries.fail of
[] -> case ManifestIndex.fromSet $ Set.fromFoldable $ Array.foldMap NonEmptyArray.toArray entries.success of
[] -> case ManifestIndex.fromSet ManifestIndex.ConsiderRanges $ Set.fromFoldable $ Array.foldMap NonEmptyArray.toArray entries.success of
Left errors -> do
Log.debug $ "Could not read a valid manifest index from entry files: " <> Array.foldMap (Array.foldMap (\(Manifest { name, version }) -> "\n - " <> formatPackageVersion name version) <<< NonEmptyArray.toArray) entries.success
Except.throw $ append "Unable to read manifest index (some packages are not satisfiable): " $ Array.foldMap (append "\n - ") do
Tuple name versions <- Map.toUnfoldable errors
Tuple version dependency <- Map.toUnfoldable versions
Expand Down Expand Up @@ -877,10 +878,10 @@ readAllMetadataFromDisk metadataDir = do

entries <- Run.liftAff $ map partitionEithers $ for packages.success \name -> do
result <- readJsonFile Metadata.codec (Path.concat [ metadataDir, PackageName.print name <> ".json" ])
pure $ map (Tuple name) result
pure $ bimap (Tuple name) (Tuple name) result

unless (Array.null entries.fail) do
Except.throw $ append "Could not read metadata for all packages because the metadata directory is invalid (some package metadata cannot be decoded):" $ Array.foldMap (append "\n - ") entries.fail
Except.throw $ append "Could not read metadata for all packages because the metadata directory is invalid (some package metadata cannot be decoded):" $ Array.foldMap (\(Tuple name err) -> "\n - " <> PackageName.print name <> ": " <> err) entries.fail

Log.debug "Successfully read metadata entries."
pure $ Map.fromFoldable entries.success
Expand Down
49 changes: 36 additions & 13 deletions app/src/App/Effect/Source.purs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import Registry.App.Prelude
import Data.Array as Array
import Data.DateTime (DateTime)
import Data.JSDate as JSDate
import Data.String as String
import Effect.Aff as Aff
import Effect.Exception as Exception
import Effect.Now as Now
Expand All @@ -20,6 +21,7 @@ import Registry.App.Effect.GitHub as GitHub
import Registry.App.Effect.Log (LOG)
import Registry.App.Effect.Log as Log
import Registry.App.Legacy.Types (RawVersion(..))
import Registry.Foreign.FSExtra as FS.Extra
import Registry.Foreign.Octokit as Octokit
import Registry.Foreign.Tar as Foreign.Tar
import Registry.Location as Location
Expand All @@ -28,8 +30,15 @@ import Run as Run
import Run.Except (EXCEPT)
import Run.Except as Except

-- | Packages can be published via the legacy importer or a user via the API. We
-- | determine some information differently in these cases, such as the time the
-- | package was published.
data ImportType = Old | Recent

derive instance Eq ImportType

-- | An effect for fetching package sources
data Source a = Fetch PackageSource FilePath Location String (Either String FetchedSource -> a)
data Source a = Fetch FilePath Location String (Either String FetchedSource -> a)

derive instance Functor Source

Expand All @@ -41,17 +50,17 @@ _source = Proxy
type FetchedSource = { path :: FilePath, published :: DateTime }

-- | Fetch the provided location to the provided destination path.
fetch :: forall r. PackageSource -> FilePath -> Location -> String -> Run (SOURCE + EXCEPT String + r) FetchedSource
fetch source destination location ref = Except.rethrow =<< Run.lift _source (Fetch source destination location ref identity)
fetch :: forall r. FilePath -> Location -> String -> Run (SOURCE + EXCEPT String + r) FetchedSource
fetch destination location ref = Except.rethrow =<< Run.lift _source (Fetch destination location ref identity)

-- | Run the SOURCE effect given a handler.
interpret :: forall r a. (Source ~> Run r) -> Run (SOURCE + r) a -> Run r a
interpret handler = Run.interpret (Run.on _source handler Run.send)

-- | Handle the SOURCE effect by downloading package source to the file system.
handle :: forall r a. Source a -> Run (GITHUB + LOG + AFF + EFFECT + r) a
handle = case _ of
Fetch source destination location ref reply -> map (map reply) Except.runExcept do
handle :: forall r a. ImportType -> Source a -> Run (GITHUB + LOG + AFF + EFFECT + r) a
handle importType = case _ of
Fetch destination location ref reply -> map (map reply) Except.runExcept do
Log.info $ "Fetching " <> printJson Location.codec location
case location of
Git _ -> do
Expand All @@ -73,7 +82,7 @@ handle = case _ of
Log.debug $ "Using legacy Git clone to fetch package source at tag: " <> show { owner, repo, ref }

let
repoDir = Path.concat [ destination, repo ]
repoDir = Path.concat [ destination, repo <> "-" <> ref ]

clonePackageAtTag = do
let url = Array.fold [ "https://github.com/", owner, "/", repo ]
Expand All @@ -83,24 +92,38 @@ handle = case _ of
Failed err -> Aff.throwError $ Aff.error err
Succeeded _ -> pure unit

alreadyExists = String.contains (String.Pattern "already exists and is not an empty directory")

Run.liftAff (Aff.attempt clonePackageAtTag) >>= case _ of
Left error -> do
Log.error $ "Failed to clone git tag: " <> Aff.message error
Except.throw $ "Failed to clone repository " <> owner <> "/" <> repo <> " at ref " <> ref
Right _ -> Log.debug $ "Cloned package source to " <> repoDir
Left error -> do
Log.error $ "Failed to clone git tag: " <> Aff.message error <> ", retrying..."
when (alreadyExists (Aff.message error)) $ FS.Extra.remove repoDir
Run.liftAff (Aff.delay (Aff.Milliseconds 1000.0))
Run.liftAff (Aff.attempt clonePackageAtTag) >>= case _ of
Right _ -> Log.debug $ "Cloned package source to " <> repoDir
Left error2 -> do
Log.error $ "Failed to clone git tag (attempt 2): " <> Aff.message error2 <> ", retrying..."
when (alreadyExists (Aff.message error)) $ FS.Extra.remove repoDir
Run.liftAff (Aff.delay (Aff.Milliseconds 1000.0))
Run.liftAff (Aff.attempt clonePackageAtTag) >>= case _ of
Right _ -> Log.debug $ "Cloned package source to " <> repoDir
Left error3 -> do
Log.error $ "Failed to clone git tag (attempt 3): " <> Aff.message error3
Except.throw $ "Failed to clone repository " <> owner <> "/" <> repo <> " at ref " <> ref

Log.debug $ "Getting published time..."

let
getRefTime = case source of
LegacyPackage -> do
getRefTime = case importType of
Old -> do
timestamp <- Except.rethrow =<< Run.liftAff (Git.gitCLI [ "log", "-1", "--date=iso8601-strict", "--format=%cd", ref ] (Just repoDir))
jsDate <- Run.liftEffect $ JSDate.parse timestamp
dateTime <- case JSDate.toDateTime jsDate of
Nothing -> Except.throw $ "Could not parse timestamp of git ref to a datetime given timestamp " <> timestamp <> " and parsed js date " <> JSDate.toUTCString jsDate
Just parsed -> pure parsed
pure dateTime
CurrentPackage ->
Recent ->
Run.liftEffect Now.nowDateTime

-- Cloning will result in the `repo` name as the directory name
Expand Down
Loading
Loading