A survey on Golang's dependency management modes (GOPATH and Go Modules): status quo, problems and challenges #1

Open
KateGo520 opened this issue Jan 7, 2021 · 0 comments

This report has been collected into [golang/go/wiki/ExperienceReports](https://github.com/golang/go/wiki/ExperienceReports#modules).

The following empirical findings are summarized based on a research paper:
[ICSE’21] Ying Wang, Liang Qiao, Chang Xu*, Yepang Liu, Na Meng, Shing-Chi Cheung, Hai Yu and Zhiliang Zhu. Hero: On the Chaos When PATH Meets Modules, In 43rd International Conference on Software Engineering (ICSE 2021).

Background

Golang has two dependency management modes, GOPATH and Go Modules.

GOPATH Mode:

Prior to Golang 1.11, Golang used GOPATH mode to manage libraries. Libraries referenced by a project are fetched with the go get command. This mode does not require developers to provide any configuration file. It works by matching the URLs of the sites hosting the referenced libraries with the import paths passed to go get. However, it fetches only a library's latest version. To overcome this restriction, developers use third-party tools such as Dep and Glide to manage different library versions under the same vendor directory.

https://divan.dev/posts/gopath/

GOPATH offered a simple and clean structure for your directory - bin/, pkg/ and src/ triplet. 
Down the directory depth, the structure was mirroring Go import names, which were mirroring URL of the version control system for the package.
GOPATH didn’t try to solve the versioning problem. It was postponed and offloaded to the third-party tools, and ultimately resulted in Go Modules.
With GOPATH, there is only one ‘master’ version, and that’s it.
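
The mirroring between import paths, directories and repository URLs described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the helper names gopathSourceDir and guessCloneURL, the GOPATH value and the import path are hypothetical, and the URL guess covers only host/user/repo style paths, not everything the real go get understands.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// gopathSourceDir returns the directory where GOPATH mode expects the
// source of importPath to live: <GOPATH>/src/<import path>.
// Slash-separated paths keep the output identical on every OS.
func gopathSourceDir(gopath, importPath string) string {
	return path.Join(gopath, "src", importPath)
}

// guessCloneURL derives a repository URL from the first three elements
// (host/user/repo) of an import path. Simplifying assumption: real
// go get also handles vanity import paths and other hosting layouts.
func guessCloneURL(importPath string) (string, bool) {
	parts := strings.Split(importPath, "/")
	if len(parts) < 3 {
		return "", false
	}
	return "https://" + strings.Join(parts[:3], "/"), true
}

func main() {
	const gopath = "/home/gopher/go"                  // hypothetical workspace
	const importPath = "github.com/user/projectA/pkg" // hypothetical package

	fmt.Println(gopathSourceDir(gopath, importPath))
	if url, ok := guessCloneURL(importPath); ok {
		fmt.Println(url)
	}
}
```

Note that nothing in this mapping carries a version, which is why GOPATH mode ends up with a single copy of each library.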

https://www.ardanlabs.com/blog/2019/10/modules-01-why-and-what.html

When operating in GOPATH mode, the solution was to use go get to identify and clone all the repos for all the dependencies into your GOPATH workspace. However, this wasn’t a perfect solution since go get only knows how to clone and update the latest code from the master branch for each dependency. Pulling code from the master branch for each dependency might be fine when you write your initial code. Eventually after a few months (or years) of dependencies evolving independently, the dependencies’ latest master code is likely to no longer be compatible with your project. This is because your project is not respecting the version tags so any upgrade might contain a breaking change.
When operating in the new module mode, the option for go get to clone the repos for all the dependencies into a single well defined workspace is no longer preferred. Plus, you need to find a way of referencing a compatible version of each dependency that would work for the entirety of the project. Then there is supporting the use of different major semantic versions of the same dependency within your project in case your dependencies are importing different major versions of the same package.
Although some solutions to these problems already existed in the form of community-developed tooling (dep, godep, glide, …), Go needed an integrated solution. The solution was to reuse the module file to maintain a list of direct and sometimes indirect dependencies by version. Then treat any given version of a repo as a single immutable bundle of code. This versioned immutable bundle is called a module.

Go Modules Mode:

Golang 1.11 introduced the Go Modules mode, which allows multiple versions of a library to be referenced by a module under different paths. A module comprises a tree of Golang source files with a go.mod configuration file in the tree's root directory. The configuration file explicitly specifies the module's dependencies on specific library versions, as well as a module path by which the module itself can be uniquely referenced by other projects. The file must follow the semantic import versioning (SIV) rules. For instance, projects whose major versions are v2 or above should include a version suffix like "/v2" at the end of their module paths.

https://github.com/golang/go/wiki/Modules#semantic-import-versioning

As a result of Semantic Import Versioning, code opting in to Go modules must comply with these rules:
1.	Follow semver. (An example VCS tag is v1.2.3).
2.	If the module is version v2 or higher, the major version of the module must be included as a /vN at the end of the module paths used in go.mod files (e.g., module github.com/my/mod/v2, require github.com/my/mod/v2 v2.0.1) and in the package import path (e.g., import "github.com/my/mod/v2/mypkg"). This includes the paths used in go get commands (e.g., go get github.com/my/mod/v2@v2.0.1. Note there is both a /v2 and a @v2.0.1 in that example. One way to think about it is that the module name now includes the /v2, so include /v2 whenever you are using the module name).
3.	If the module is version v0 or v1, do not include the major version in either the module path or the import path.
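
The three rules above can be checked mechanically. The following is a minimal sketch, not the go command's own implementation: the helper names majorOf, requiredSuffix, hasMajorSuffix and followsSIV are hypothetical, and the check ignores special cases such as gopkg.in paths or repositories whose name happens to look like a major version.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major number from a tag such as "v2.0.1".
func majorOf(tag string) (int, error) {
	if !strings.HasPrefix(tag, "v") {
		return 0, fmt.Errorf("tag %q does not start with v", tag)
	}
	return strconv.Atoi(strings.SplitN(tag[1:], ".", 2)[0])
}

// requiredSuffix returns the suffix a module path must carry for the
// given major version: nothing for v0/v1, "/vN" for v2 and above.
func requiredSuffix(major int) string {
	if major < 2 {
		return ""
	}
	return "/v" + strconv.Itoa(major)
}

// hasMajorSuffix reports whether the last path element looks like vN.
func hasMajorSuffix(modulePath string) bool {
	last := modulePath[strings.LastIndex(modulePath, "/")+1:]
	if len(last) < 2 || last[0] != 'v' {
		return false
	}
	for _, c := range last[1:] {
		if c < '0' || c > '9' {
			return false
		}
	}
	return true
}

// followsSIV reports whether modulePath matches the suffix rule for tag.
func followsSIV(modulePath, tag string) (bool, error) {
	major, err := majorOf(tag)
	if err != nil {
		return false, err
	}
	if want := requiredSuffix(major); want != "" {
		return strings.HasSuffix(modulePath, want), nil
	}
	// v0 and v1 must not carry any major version suffix.
	return !hasMajorSuffix(modulePath), nil
}

func main() {
	for _, c := range [][2]string{
		{"github.com/my/mod/v2", "v2.0.1"},
		{"github.com/my/mod", "v2.0.1"},
		{"github.com/my/mod", "v1.2.3"},
	} {
		ok, err := followsSIV(c[0], c[1])
		fmt.Println(c[0], c[1], ok, err)
	}
}
```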

https://www.ardanlabs.com/blog/2019/10/modules-01-why-and-what.html

Modules provide an integrated solution for three key problems that have been a pain point for developers since Go’s initial release:
Ability to work with Go code outside of the GOPATH workspace.
Ability to version a dependency and identify the most compatible version to use.
Ability to manage dependencies natively using the Go tooling.

Module-aware vs. Module-unaware:

Module-awareness: The capability of recognizing a virtual path ending with a version suffix like "/v2" from projects in Go Modules.


Module-aware project: A project is module-aware if and only if it uses a compatible or new Golang version and does not use any third-party tools.
Module-unaware project: A project is module-unaware if and only if it uses a legacy Golang version, or it uses a compatible or new Golang version with a third-party tool.

[Figure: import path parsing in module-aware and module-unaware projects]

This figure shows how module-aware and module-unaware projects differ in parsing an import path with or without a v2+ version suffix.
For an import path like "github.com/user/projectA", a module-aware project can reference a specific v0.∗.∗ or v1.∗.∗ version of projectA (by default, the latest version below v2), while a module-unaware project references the version on projectA's main branch (typically the latest version).
For an import path like "github.com/user/projectA/v2", a module-aware project can reference a specific v2.∗.∗ version of projectA (by default, the latest version below v3), while a module-unaware project fails to recognize the path.
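
The two interpretations described above can be written down as a small decision function. This is a simplified model, not how the Go toolchain is implemented: the names resolution, splitMajor and resolve are hypothetical, and only import paths that point at a module root (no package subdirectory after the suffix) are handled.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// resolution describes what a project ends up referencing.
type resolution struct {
	Repo     string // repository path that is fetched
	Versions string // which versions are eligible by default
}

var errUnrecognized = errors.New("module-unaware project cannot recognize version suffix")

// splitMajor splits "host/user/repo/v2" into ("host/user/repo", 2).
// Paths without a /vN suffix (N >= 2) are returned unchanged with major 0.
func splitMajor(importPath string) (string, int) {
	i := strings.LastIndex(importPath, "/v")
	if i < 0 {
		return importPath, 0
	}
	n, err := strconv.Atoi(importPath[i+2:])
	if err != nil || n < 2 {
		return importPath, 0
	}
	return importPath[:i], n
}

// resolve models how a project interprets an import path, depending on
// whether the project is module-aware.
func resolve(importPath string, moduleAware bool) (resolution, error) {
	repo, major := splitMajor(importPath)
	switch {
	case major >= 2 && !moduleAware:
		return resolution{}, errUnrecognized
	case major >= 2:
		return resolution{Repo: repo, Versions: fmt.Sprintf("latest v%d.*.*", major)}, nil
	case moduleAware:
		return resolution{Repo: importPath, Versions: "latest v0.*.* or v1.*.*"}, nil
	default:
		return resolution{Repo: importPath, Versions: "tip of the main branch"}, nil
	}
}

func main() {
	for _, p := range []string{"github.com/user/projectA", "github.com/user/projectA/v2"} {
		for _, aware := range []bool{true, false} {
			r, err := resolve(p, aware)
			fmt.Printf("%s aware=%v -> %+v err=%v\n", p, aware, r, err)
		}
	}
}
```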

Some concrete issues

Many projects have suffered from issues caused by mixing the two dependency management modes. Go Modules is not backward compatible with GOPATH. SIV rules can be violated even if a Golang project and its referenced upstream projects both use Go Modules. Resolving these issues for a Golang project requires up-to-date knowledge of its upstream and downstream projects, as well as their possibly heterogeneous uses of the two dependency management modes. The concrete issues fall into three types.

Issue A:

Build errors can occur when projects in GOPATH with no module-awareness directly or transitively depend on projects in Go Modules which have virtual paths with version suffixes.


E.g., issues: pierrec/lz4#33, golang/dep#1962, gin-gonic/gin#2427, micro/go-micro#1839, urfave/cli#866, golang/go#37995, Masterminds/glide#1017, redis/go-redis#1143, Masterminds/glide#968, libp2p/go-libp2p-kad-dht#258, gofrs/uuid#67.

golang/dep#1962

If a library has a major version 2, then its module line in go.mod will be module github.com/foo/bar/v2 even if it is being fetched from github.com/foo/bar. Go, since 1.10.3, will build just fine when using imports like import bar "github.com/foo/bar/v2", but dep complains that the repo doesn't have a submodule v2.
As a result, we can't use dep at all if we depend on packages using go.mod.

Issue B:

A project that has migrated to Go Modules may not find its referenced libraries in downstream GOPATH mode projects, or may fetch unintended library versions, because the two modes interpret import paths differently.

hybridgroup/gobot#689

This is due to the usage of the dependency "github.com/codegangsta/cli", which has been renamed to "github.com/urfave/cli"

Issue C:

Errors will occur when projects in Go Modules depend on projects also in Go Modules but not following SIV rules:
(1) lacking version suffixes like "/v2" in module paths or import paths, although the versions of concerned projects are v2+ (e.g., issue kataras/iris#1355, pierrec/lz4#39, v2ray/v2ray-core#2438, etcd-io/etcd#11154, prometheus/prometheus#6048, vitessio/vitess#5019, golang/go#32695, dgrijalva/jwt-go#301, shirou/gopsutil#663);
(2) version tags not following the semver (e.g., issue osrg/gobgp#1848, gohugoio/hugo#5639, gin-gonic/gin#1388, rclone/rclone#2960, robfig/cron#196);
(3) module paths in go.mod files inconsistent with URLs associated with concerned projects on their hosting sites (e.g., issue jwplayer/jwplatformgo#9, micro/micro#272, etcd-io/etcd#11808).
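
The three kinds of SIV violation listed above can be told apart with a short check. This is a sketch under stated assumptions: the function violations and its messages are hypothetical, the regular expression is a simplified reading of semver tags, and "requested path" stands for the path a consumer uses to reference the module.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverTag is a simplified pattern for tags such as v1.2.3 or
// v2.0.0-rc.1. The full semver grammar is stricter about pre-release
// and build identifiers.
var semverTag = regexp.MustCompile(
	`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// violations lists which of the three problems apply to a release.
// requestedPath is the path consumers use, declaredPath is the module
// line in go.mod, and tag is the version tag of the release.
func violations(requestedPath, declaredPath, tag string) []string {
	var found []string
	m := semverTag.FindStringSubmatch(tag)
	switch {
	case m == nil:
		found = append(found, "tag is not a semver tag")
	case m[1] != "0" && m[1] != "1" && !strings.HasSuffix(declaredPath, "/v"+m[1]):
		found = append(found, "module path lacks /v"+m[1]+" suffix")
	}
	if requestedPath != declaredPath {
		found = append(found, "declared module path differs from requested path")
	}
	return found
}

func main() {
	// All module paths and tags below are made up for illustration.
	fmt.Println(violations("example.com/foo", "example.com/foo", "v2.0.0"))
	fmt.Println(violations("example.com/foo", "example.com/foo", "2.0"))
	fmt.Println(violations("example.com/old/foo", "example.com/new/foo", "v1.0.0"))
	fmt.Println(violations("example.com/foo/v2", "example.com/foo/v2", "v2.0.1"))
}
```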

golang/go#31543

Of the various go.mod files I have looked at in repos with v2+ semver tags over the last several months, I estimate more than 50% of those go.mod files are incorrect due to missing the required /vN at the end of the module path.

golang/go#32695

Observe people accidentally creating and using modules that have v2+ semver tags but that have not adopted Semantic Import Versioning.
For example, if there is a module example.com/foo that:
>has a go.mod file (that is, it has adopted modules)
>has a v2.0.0 semver tag as its latest tag
>and its module line reads module example.com/foo (without the in theory required /v2)
then this still works for a module-based consumer, even though the module is in a "bad" state:
go get example.com/foo@v2.0.0+incompatible
This means people can and do create usable v2+ modules that did not adopt Semantic Import Versioning, and consumers can and do consume those "bad" modules.
However, other things such as upgrades or go get example.com/foo@latest do not work as expected, which leads to confusion.

Fixing Solutions

To resolve the above issues, developers have come up with various solutions. However, resolving them for a Golang project requires up-to-date knowledge of its upstream and downstream projects, as well as their possibly heterogeneous uses of the two dependency management modes. Resolving these issues locally in a project, without considering the ecosystem holistically, can easily cause new issues for its downstream projects.

Cases

The eight common fixing solutions, with their different trade-offs, are summarized below.


Solution 1: Projects in GOPATH migrate to Go Modules.

This solution encourages migration from GOPATH to Go Modules to fix Issue A, since Issue A arises when projects still in GOPATH cannot recognize import paths with version suffixes.
For example, in issue redis/go-redis#1154, project go-redis/redis migrated to Go Modules, but its downstream projects were still in GOPATH. The downstream projects were then advised to migrate to Go Modules as well to avoid build errors. This solved Issue A for those downstream projects, but caused Issue A for their own module-unaware downstream projects (e.g., issues redpanda-data/connect#232, redpanda-data/connect#270).
Examples: golang/go#37995, filebrowser/filebrowser#530, gotestyourself/gotest.tools#203, kataras/iris#1385, labstack/echo#1321, micro/go-micro#1839, urfave/cli#866, DataDog/dd-trace-go#606, oauth2-proxy/oauth2-proxy#642, prometheus/client_golang#673.
[This solution will affect downstream projects in GOPATH]

Solution 2: Projects in Go Modules roll back to GOPATH.

This solution reverts a previous Go Modules migration to fix Issue A and Issue C by removing the incompatibility the migration introduced.
For example, in issue gofrs/uuid#61 (Issue A), project gofrs/uuid's migration to Go Modules broke the builds of many downstream projects in GOPATH. As a compromise, gofrs/uuid rolled back to GOPATH, waiting for downstream projects to migrate first. In issue shirou/gopsutil#663 (Issue C), shirou/gopsutil and its downstream projects were all in Go Modules, but shirou/gopsutil violated SIV rules (its v2+ release lacked a version suffix in its module path), causing build errors in downstream projects. shirou/gopsutil therefore chose to roll back to GOPATH, which temporarily made downstream projects work again. This solves the problem, but slows the ecosystem's migration to Go Modules.
Examples: dgraph-io/badger#4662, golang-migrate/migrate#103, go-mail/mail#39, patrickmn/go-cache#89, stripe/stripe-go#712, cenkalti/backoff#76, redis/go-redis#1149, go-chi/chi#327, pierrec/lz4#39, go-chi/jwtauth#42, sercand/kuberesolver#11.

Solution 3: Changing the strategy of releasing v2+ projects in Go Modules from major branch to subdirectory.

This solution targets Issue A, where module-unaware projects cannot recognize virtual import paths for v2+ libraries in Go Modules. The subdirectory strategy creates physical paths by copying the code into a major-version subdirectory, so that the libraries can be referenced by module-unaware projects. However, this is a workaround and needs extra maintenance in subsequent releases.
Examples: mediocregopher/radix#128, nicksnyder/go-i18n#184, olivere/elastic#1145, gomodule/redigo#366, golang/go#37995, twitchtv/twirp#169.
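
The difference between a virtual and a physical path can be illustrated with an in-memory file tree. The sketch below assumes that a module-unaware tool needs the import path to exist as a real directory; the layouts, file contents and helper names (majorBranchLayout, majorSubdirLayout, hasPackageDir) are hypothetical, and fstest.MapFS merely stands in for the contents of $GOPATH/src.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// majorBranchLayout models a v2 release kept on its own branch: the
// code stays at the repository root and "/v2" exists only in go.mod.
func majorBranchLayout() fstest.MapFS {
	return fstest.MapFS{
		"github.com/user/projectA/go.mod": {Data: []byte("module github.com/user/projectA/v2\n")},
		"github.com/user/projectA/a.go":   {Data: []byte("package projecta\n")},
	}
}

// majorSubdirLayout models the subdirectory strategy: the v2 code is
// copied into a physical v2/ directory next to the v1 code.
func majorSubdirLayout() fstest.MapFS {
	return fstest.MapFS{
		"github.com/user/projectA/go.mod":    {Data: []byte("module github.com/user/projectA\n")},
		"github.com/user/projectA/a.go":      {Data: []byte("package projecta\n")},
		"github.com/user/projectA/v2/go.mod": {Data: []byte("module github.com/user/projectA/v2\n")},
		"github.com/user/projectA/v2/a.go":   {Data: []byte("package projecta\n")},
	}
}

// hasPackageDir reports whether importPath exists as a directory in
// src, which is what a module-unaware lookup is assumed to require.
func hasPackageDir(src fs.FS, importPath string) bool {
	info, err := fs.Stat(src, importPath)
	return err == nil && info.IsDir()
}

func main() {
	const importPath = "github.com/user/projectA/v2"
	fmt.Println("major branch:      ", hasPackageDir(majorBranchLayout(), importPath))
	fmt.Println("major subdirectory:", hasPackageDir(majorSubdirLayout(), importPath))
}
```

The duplicated files in the second layout are the extra maintenance cost mentioned above: every later v2 release has to keep the copy in the subdirectory up to date.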

Solution 4: Maintaining v2+ libraries in Go Modules in downstream projects' Vendor directories rather than referencing them by virtual import paths.

This solution targets Issue A. By keeping a copy of the libraries in their own Vendor directories, downstream projects avoid fetching them by virtual import paths.
For example, in issue mediocregopher/radix#141, mediocregopher/radix refused to use the major subdirectory strategy for its v2+ release in Go Modules. Its downstream projects had to copy mediocregopher/radix’s code into their Vendor directories, causing extra maintenance and a potential Issue B in the future.
Examples: brianvoe/gofakeit#88, gopherjs/gopherjs#881, redis/go-redis#1143, mholt/archiver#192, moby/moby#40371, vmihailenco/msgpack#237.
[This solution may affect downstream projects in Go Modules]

Solution 5: Using a replace directive with version information to avoid using import paths in referencing libraries.

It addresses Issue B.1 (problematic import path interpretations) and Issue C (import path violating SIV rules).
For example, in issue andrewstuart/goq#12, a client project used a directive to replace the original import path, "replace github.com/andrewstuart/goq => astuart.co/goq v1.0.0", to reference the version of andrewstuart/goq it expected. However, developers can then no longer use the go get command to fetch upgraded versions of the library automatically.
Examples: maistra/istio#78, scylladb/gocql#3, etcd-io/etcd#10773, cockroachdb/cockroach#47246, micro/micro#1149, moby/moby#39302, grpc/grpc-go#3500.
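
The effect of such a replace directive can be modeled in a few lines. This is an illustrative sketch only: the type replacement and the functions parseReplace and applyReplace are hypothetical, the parser accepts just the single-line "replace old => new version" form quoted above, and the required version v0.9.0 is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// replacement is one parsed replace directive.
type replacement struct {
	OldPath    string
	NewPath    string
	NewVersion string
}

// parseReplace parses "replace old/path => new/path v1.2.3".
// Other forms allowed in go.mod (a version on the left-hand side,
// local directory targets, replace blocks) are not handled here.
func parseReplace(line string) (replacement, error) {
	f := strings.Fields(line)
	if len(f) != 5 || f[0] != "replace" || f[2] != "=>" {
		return replacement{}, fmt.Errorf("unsupported replace directive: %q", line)
	}
	return replacement{OldPath: f[1], NewPath: f[3], NewVersion: f[4]}, nil
}

// applyReplace returns the module path and version that are actually
// used for a requirement once the replacements are taken into account.
func applyReplace(path, version string, rs []replacement) (string, string) {
	for _, r := range rs {
		if r.OldPath == path {
			return r.NewPath, r.NewVersion
		}
	}
	return path, version
}

func main() {
	r, err := parseReplace("replace github.com/andrewstuart/goq => astuart.co/goq v1.0.0")
	if err != nil {
		panic(err)
	}
	// v0.9.0 is a made-up required version, used only for illustration.
	p, v := applyReplace("github.com/andrewstuart/goq", "v0.9.0", []replacement{r})
	fmt.Println(p, v)
}
```

Because the replacement pins its own version, changing the required version of the original path has no effect on what is built, which matches the drawback noted above.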

Solution 6: Updating import paths for libraries that have changed their repositories.

This solution fixes Issue B.2, where libraries in a project's Vendor directory may be inconsistent with the ones referenced by their import paths. It updates import paths so that the project's downstream projects in Go Modules fetch consistent library versions.
For example, in issue google/go-cloud#429, google/go-cloud managed library coreos/etcd in its Vendor directory, and coreos/etcd later changed its hosting repository from github.com/coreos/etcd to **go.etcd.io/etcd**. To fix build errors for its downstream projects in Go Modules, google/go-cloud updated coreos/etcd's import path to the new one for consistency. This fixes the issue and benefits all affected downstream projects without impacting others in the ecosystem.
Examples: hybridgroup/gobot#689, kythe/kythe#3344, pion/webrtc#1082, census-instrumentation/opencensus-go#1052, micro/go-plugins#372, go-kit/kit#940, golang/lint#436, golang/oauth2#395, sirupsen/logrus#1041, nats-io/nats.go#478, unknwon/com#26.

Solution 7: Projects in Go Modules fix configuration items to strictly follow SIV rules.

It urges projects that have migrated to Go Modules to follow Golang's official guidelines on SIV rules, for fixing Issue C.
For example, in issue redis/go-redis#1149, project go-redis/redis added the version suffix "/v7" at the end of its module path to follow SIV rules. This fixed Issue C, but the project's downstream projects in GOPATH may encounter Issue A because they cannot recognize such version suffixes (e.g., issue redis/go-redis#1151).
Examples: etcd-io/etcd#11154, godbus/dbus#125, gotestyourself/gotest.tools#140, redpanda-data/connect#232, kataras/iris#1355, labstack/echo#1244, mholt/archiver#187, golang/go#27009, golang/go#33879, gin-gonic/gin#1388, golangci/golangci-lint#371, osrg/gobgp#1848, golang/go#37529, gohugoio/hugo#5639, istio/api#1201, golang/go#29731, vitessio/vitess#5019, googleforgames/open-match#675, micro/micro#272, microsoft/go-winio#156, golang/go#31437.
[This solution may affect downstream projects in GOPATH]

Solution 8: Using a hash commit ID for a specific version to replace a problematic version number in library referencing.

Downstream projects use this approach to avoid Issue C, where some projects in Go Modules violate SIV rules in their version numbers and cause build errors in downstream projects that are also in Go Modules. It avoids referencing the problematic version numbers by using require directives with a specific commit hash in the downstream projects' go.mod files.
For example, in issue prometheus/prometheus#6048, one of prometheus/prometheus's downstream projects in Go Modules used the directive "require github.com/prometheus/prometheus 43acd0e" to reference its expected version v2.12.0. As with Solution 5, developers can then no longer fetch upgraded versions of the library automatically.
Examples: argoproj/argo-workflows#2602, concourse/concourse#3952, ibm-messaging/mq-golang#121, dexidp/dex#1710, zouyx/agollo#78, rwynn/monstache#316, pingcap/parser#812.
[This solution will affect downstream projects in Go Modules]
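
When a commit hash is written in a require directive, the go command records it in go.mod as a pseudo-version. The sketch below builds only the simplest form, the one used when no earlier tag serves as a base version; when such a tag exists, the go command derives the prefix from it instead. The function name pseudoVersion, the commit time and the commit hash are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// pseudoVersion builds "v0.0.0-yyyymmddhhmmss-abcdefabcdef" from a
// commit time in RFC 3339 format and a commit hash. Only the first 12
// characters of the hash are kept and the time is expressed in UTC.
func pseudoVersion(commitTimeRFC3339, commitHash string) (string, error) {
	t, err := time.Parse(time.RFC3339, commitTimeRFC3339)
	if err != nil {
		return "", err
	}
	if len(commitHash) < 12 {
		return "", fmt.Errorf("commit hash %q is shorter than 12 characters", commitHash)
	}
	return "v0.0.0-" + t.UTC().Format("20060102150405") + "-" + commitHash[:12], nil
}

func main() {
	// Made-up commit metadata, used only for illustration.
	v, err := pseudoVersion("2019-08-18T12:30:50Z", "0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```

A pseudo-version names exactly one commit, so a project pinned this way stays on that commit until a maintainer edits the requirement by hand.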

As it stands, GOPATH mode and Go Modules mode will co-exist for a while in the ecosystem. Developers are likely to encounter these issues again.

The purpose of this report is to help developers understand these issues better and choose the most suitable fix. We hope it helps the ecosystem make a smooth transition from GOPATH to Go Modules.
