CHANGELOG-3.2.md

Previous change logs can be found at CHANGELOG-3.1.

v3.2.19 (2018-04-24)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Metrics, Monitoring

Security, Authentication

  • Fix TLS reload when certificate SAN field only includes IP addresses but no domain names.
    • In Go, the server calls (*tls.Config).GetCertificate for TLS reload if and only if the server's (*tls.Config).Certificates field is empty, or (*tls.ClientHelloInfo).ServerName is not empty with a valid SNI from the client. Previously, etcd always populated (*tls.Config).Certificates as non-empty on the initial client TLS handshake. Thus, the client was always expected to supply a matching SNI in order to pass TLS verification and to trigger (*tls.Config).GetCertificate to reload TLS assets.
    • However, a certificate whose SAN field includes only IP addresses and no domain names results in a *tls.ClientHelloInfo with an empty ServerName field, and thus fails to trigger the TLS reload on the initial TLS handshake; this becomes a problem when expired certificates need to be replaced online.
    • Now, (*tls.Config).Certificates is created empty on the initial client TLS handshake, first to trigger (*tls.Config).GetCertificate, and then to populate the rest of the certificates on every new TLS connection, even when the client SNI is empty (e.g. the cert only includes IPs).
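
The reload behavior described above can be sketched with the Go standard library alone. This is a minimal illustration, not etcd's actual code: the helper name newReloadingConfig and the file paths are hypothetical. It leaves Certificates empty so that crypto/tls consults GetCertificate on every handshake, including handshakes where the client sends no SNI.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// newReloadingConfig is a hypothetical helper. It returns a server-side TLS
// config whose Certificates field stays empty, so crypto/tls calls
// GetCertificate on every handshake, with or without a client SNI.
func newReloadingConfig(certFile, keyFile string) *tls.Config {
	return &tls.Config{
		// Certificates is intentionally left nil.
		GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
			// Re-read the key pair from disk so that files replaced
			// while the server is running are picked up.
			cert, err := tls.LoadX509KeyPair(certFile, keyFile)
			if err != nil {
				return nil, err
			}
			return &cert, nil
		},
	}
}

func main() {
	// Placeholder paths; point them at a real key pair to see a successful load.
	cfg := newReloadingConfig("server.crt", "server.key")
	fmt.Println("static certificates:", len(cfg.Certificates))

	// An empty ServerName models a client that dialed an IP address.
	_, err := cfg.GetCertificate(&tls.ClientHelloInfo{ServerName: ""})
	fmt.Println("reload result:", err)
}
```

Loading from disk on each handshake keeps the sketch short; a production server would likely cache the parsed key pair.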

etcd

  • Add --initial-election-tick-advance flag to configure initial election tick fast-forward.
    • By default, --initial-election-tick-advance=true, and the local member fast-forwards election ticks to speed up the "initial" leader election trigger.
    • This benefits deployments with larger election ticks. For instance, a cross-datacenter deployment may require a longer election timeout of 10 seconds. If true, the local node does not need to wait up to 10 seconds. Instead, it fast-forwards its election ticks to 8 seconds, leaving only 2 seconds before leader election.
    • The major assumptions are that either the cluster has no active leader, so advancing ticks enables faster leader election, or the cluster already has an established leader, and the rejoining follower is likely to receive heartbeats from the leader after the tick advance and before the election timeout.
    • However, when the network from the leader to the rejoining follower is congested, and the follower does not receive a leader heartbeat within the remaining election ticks, a disruptive election has to happen, thus affecting cluster availability.
    • Now, this can be disabled by setting --initial-election-tick-advance=false.
    • Disabling this would slow down the initial bootstrap process for cross-datacenter deployments. Make the tradeoff by configuring --initial-election-tick-advance at the cost of a slower initial bootstrap.
    • If single-node, it advances ticks regardless.
    • Address disruptive rejoining follower node.
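
The arithmetic in the 10-second example above can be written out as a short sketch. It is illustrative only and does not reproduce etcd's internal tick handling: the function remainingTicks and the 1-second tick interval are assumptions chosen so that 10 ticks equal the 10-second election timeout from the example.

```go
package main

import (
	"fmt"
	"time"
)

// remainingTicks is a hypothetical helper. It returns how many election
// ticks are left before a member may start an election, after advancedTicks
// of the electionTicks have been fast-forwarded.
func remainingTicks(electionTicks, advancedTicks int) int {
	if advancedTicks < 0 {
		advancedTicks = 0
	}
	if advancedTicks > electionTicks {
		advancedTicks = electionTicks
	}
	return electionTicks - advancedTicks
}

func main() {
	const tick = time.Second // assumed tick interval
	const electionTicks = 10 // 10-second election timeout, as in the example

	// 8 models the fast-forward from the example; 0 models the advance disabled.
	for _, advanced := range []int{8, 0} {
		left := remainingTicks(electionTicks, advanced)
		fmt.Printf("advanced=%d ticks, wait before election=%v\n",
			advanced, time.Duration(left)*tick)
	}
}
```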

Go

v3.2.18 (2018-03-29)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Improved

  • Adjust election timeout on server restart to reduce disruptive rejoining servers.
    • Previously, etcd fast-forwarded election ticks on server start, with only one tick left for leader election. This was to speed up the start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross-datacenter deployments with larger election timeouts. However, it affected cluster availability if the last tick elapsed before the leader contacted the restarted node.
    • Now, when etcd restarts, it adjusts election ticks with more than one tick left, giving the leader more time to prevent a disruptive restart.

Metrics, Monitoring

Go

v3.2.17 (2018-03-08)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v2

Fixed: v3

Go

v3.2.16 (2018-02-12)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v3

Go

v3.2.15 (2018-01-22)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v3

Go

v3.2.14 (2018-01-11)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Improved

Fixed: v3

Go

v3.2.13 (2018-01-02)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v3

Go

v3.2.12 (2017-12-20)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Dependency

Fixed: v3

clientv3

Go

v3.2.11 (2017-12-05)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Dependency

Security, Authentication

See security doc for more details.

Fixed: clientv3

Documentation

  • Remove --listen-metrics-urls flag from the monitoring document (not released in v3.2.x, planned for v3.3.x).

Go

v3.2.10 (2017-11-16)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Dependency

Security, Authentication

See security doc for more details.

  • Revert discovery SRV auth ServerName with *.{ROOT_DOMAIN} to support non-wildcard subject alternative names in the certs (see issue #8445 for more context).
    • For instance, etcd --discovery-srv=etcd.local will only authenticate peers/clients when the provided certs have root domain etcd.local (not *.etcd.local) as an entry in Subject Alternative Name (SAN) field.

Fixed: v3

Fixed: clientv3

Go

v3.2.9 (2017-10-06)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Security, Authentication

See security doc for more details.

  • Update golang.org/x/crypto/bcrypt (see golang/crypto@6c586e1).
  • Fix discovery SRV bootstrapping to authenticate ServerName with *.{ROOT_DOMAIN}, in order to support sub-domain wildcard matching (see issue #8445 for more context).
    • For instance, etcd --discovery-srv=etcd.local will only authenticate peers/clients when the provided certs have root domain *.etcd.local as an entry in Subject Alternative Name (SAN) field.

Go

v3.2.8 (2017-09-29)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v2 client

  • Fix v2 client failover to next endpoint on mutable operation.

Fixed: grpc-proxy

Go

v3.2.7 (2017-09-01)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Security, Authentication

Fixed: clientv3

Go

v3.2.6 (2017-08-21)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v3

  • Fix watch restore from snapshot.
  • Fix multiple URLs for --listen-peer-urls flag.
  • Add --enable-pprof flag to etcd configuration file format.

Metrics, Monitoring

  • Fix etcd_debugging_mvcc_keys_total inconsistency.

Go

v3.2.5 (2017-08-04)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

v3 etcdctl

  • Return non-zero exit code on unhealthy endpoint health.

Security, Authentication

See security doc for more details.

  • Server supports reverse-lookup on wildcard DNS SAN. For instance, if the peer cert contains only DNS names (no IP addresses) in the Subject Alternative Name (SAN) field, the server first reverse-looks up the remote IP address to get a list of names mapping to that address (e.g. nslookup IPADDR). It then accepts the connection if any of those names matches one of the peer cert's DNS names (either by exact or wildcard match). If none matches, the server forward-looks up each DNS entry in the peer cert (e.g. looks up example.default.svc when the entry is *.example.default.svc), and accepts the connection only when the host's resolved addresses include the peer's remote IP address. For example, peer B's CSR (with cfssl) SAN field is ["*.example.default.svc", "*.example.default.svc.cluster.local"] when peer B's remote IP address is 10.138.0.2. When peer B tries to join the cluster, peer A reverse-looks up the IP 10.138.0.2 to get the list of host names, and matches those host names, exactly or by wildcard, against peer B's cert DNS names in the Subject Alternative Name (SAN) field. If none of the reverse/forward lookups works, it returns the error tls: "10.138.0.2" does not match any of DNSNames ["*.example.default.svc","*.example.default.svc.cluster.local"]. See issue#8268 for more detail.
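
The name-matching step of the reverse lookup above can be sketched as follows. This is a simplified illustration and not the server's implementation: matchName and anyMatch are hypothetical helpers, the host name in main is made up, and the sketch assumes that a leading "*." wildcard covers exactly one DNS label. The DNS lookups themselves are left out so the example runs without network access.

```go
package main

import (
	"fmt"
	"strings"
)

// matchName reports whether host matches a certificate DNS name, either
// exactly or through a leading "*." wildcard that covers one label.
// Trailing dots (as returned by reverse lookups) are ignored.
func matchName(certName, host string) bool {
	certName = strings.ToLower(strings.TrimSuffix(certName, "."))
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	if certName == host {
		return true
	}
	if !strings.HasPrefix(certName, "*.") {
		return false
	}
	suffix := certName[1:] // e.g. ".example.default.svc"
	if !strings.HasSuffix(host, suffix) {
		return false
	}
	label := strings.TrimSuffix(host, suffix)
	return label != "" && !strings.Contains(label, ".")
}

// anyMatch reports whether any reverse-resolved host name matches any of
// the certificate's DNS names.
func anyMatch(certNames, hosts []string) bool {
	for _, h := range hosts {
		for _, c := range certNames {
			if matchName(c, h) {
				return true
			}
		}
	}
	return false
}

func main() {
	certNames := []string{"*.example.default.svc", "*.example.default.svc.cluster.local"}
	// Hypothetical result of reverse-resolving the peer's remote IP.
	hosts := []string{"etcd-1.example.default.svc.cluster.local."}
	fmt.Println("accept connection:", anyMatch(certNames, hosts))
}
```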

Metrics, Monitoring

  • Fix unreachable /metrics endpoint when --enable-v2=false.

Fixed: grpc-proxy

Other

  • Add container registry gcr.io/etcd-development/etcd.

Go

v3.2.4 (2017-07-19)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed

  • Do not block on active client stream when stopping server
  • Fix gRPC proxy Snapshot RPC error handling

Go

v3.2.3 (2017-07-14)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed

  • Let clients establish unlimited streams

Added

  • Tag docker images with minor versions
    • e.g. docker pull quay.io/coreos/etcd:v3.2 to fetch latest v3.2 versions

Go

v3.2.2 (2017-07-07)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Improved

  • Rate-limit lease revoke on expiration.
  • Extend leases on promote to avoid queueing effect on lease expiration.

Security, Authentication

See security doc for more details.

  • Server accepts connections if IP matches, without checking DNS entries. For instance, if the peer cert contains IP addresses and DNS names in the Subject Alternative Name (SAN) field, and the remote IP address matches one of those IP addresses, the server just accepts the connection without further checking the DNS names. For example, peer B's CSR (with cfssl) SAN field is ["invalid.domain", "10.138.0.2"] when peer B's remote IP address is 10.138.0.2 and invalid.domain is an invalid host. When peer B tries to join the cluster, peer A successfully authenticates B, since the Subject Alternative Name (SAN) field has a valid matching IP address. See issue#8206 for more detail.

Fixed: v3

  • Accept connection with matched IP SAN but no DNS match.
    • Don't check DNS entries in certs if there's a matching IP.

Fixed: gRPC gateway

  • Use user-provided listen address to connect to gRPC gateway.
    • net.Listener rewrites IPv4 0.0.0.0 to IPv6 [::], breaking IPv6 disabled hosts.
    • Only v3.2.0, v3.2.1 are affected.

Go

v3.2.1 (2017-06-23)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Fixed: v3

  • Fix backend database in-memory index corruption issue on restore (only 3.2.0 is affected).

Fixed: gRPC gateway

  • Fix Txn marshaling.

Metrics, Monitoring

  • Fix backend database size debugging metrics.

Go

v3.2.0 (2017-06-09)

See code changes and v3.2 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.2 upgrade guide.

Improved

  • Improve backend read concurrency.

Breaking Changes

  • Increased --snapshot-count default value from 10,000 to 100,000.
    • Higher snapshot count means it holds Raft entries in memory for longer before discarding old entries.
    • It is a trade-off between less frequent snapshotting and higher memory usage.
    • Use a lower --snapshot-count value for lower memory usage.
    • Use a higher --snapshot-count value for better availability of slow followers (less frequent snapshots from leader).
  • clientv3.Lease.TimeToLive returns LeaseTimeToLiveResponse.TTL == -1 on lease not found.
  • clientv3.NewFromConfigFile is moved to clientv3/yaml.NewConfig.
  • embed.Etcd.Peers field is now []*peerListener.
  • Reject domain names for --listen-peer-urls and --listen-client-urls (3.1 only prints out warnings), since a domain name is invalid for network interface binding.
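
The listen-URL restriction in the list above amounts to requiring an IP address as the URL host. A minimal sketch of such a check, assuming a hypothetical helper named checkListenURL (this is not the server's validation code, and it ignores any special cases the real server may allow):

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// checkListenURL is a hypothetical helper. It returns an error when the host
// part of a listen URL is not an IP address, since only an IP address can be
// bound to a network interface.
func checkListenURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	// Hostname strips the port and the brackets around IPv6 literals.
	if net.ParseIP(u.Hostname()) == nil {
		return fmt.Errorf("expected an IP address in listen URL %q", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"http://10.0.0.1:2380",
		"http://[::1]:2379",
		"http://etcd.example.com:2380", // domain name: rejected
	} {
		fmt.Printf("%s -> %v\n", raw, checkListenURL(raw))
	}
}
```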

Dependency

Metrics, Monitoring

  • Add etcd_debugging_server_lease_expired_total metrics.

Security, Authentication

See security doc for more details.

  • TLS certificates get reloaded on every client connection. This is useful when replacing expiring certs without stopping etcd servers; it can be done by overwriting old certs with new ones. Refreshing certs for every connection should not have too much overhead, but can be improved in the future with a caching layer. Example tests can be found here.
  • Server denies incoming peer certs with wrong IP SAN. For instance, if peer cert contains any IP addresses in Subject Alternative Name (SAN) field, server authenticates a peer only when the remote IP address matches one of those IP addresses. This is to prevent unauthorized endpoints from joining the cluster. For example, peer B's CSR (with cfssl) SAN field is ["*.example.default.svc", "*.example.default.svc.cluster.local", "10.138.0.27"] when peer B's actual IP address is 10.138.0.2, not 10.138.0.27. When peer B tries to join the cluster, peer A will reject B with the error x509: certificate is valid for 10.138.0.27, not 10.138.0.2, because B's remote IP address does not match the one in Subject Alternative Name (SAN) field.
  • Server resolves TLS DNSNames when checking SAN. For instance, if the peer cert contains only DNS names (no IP addresses) in the Subject Alternative Name (SAN) field, the server authenticates a peer only when forward-lookups (dig b.com) on those DNS names return an IP matching the remote IP address. For example, peer B's CSR (with cfssl) SAN field is ["b.com"] when peer B's remote IP address is 10.138.0.2. When peer B tries to join the cluster, peer A looks up the incoming host b.com to get the list of IP addresses (e.g. dig b.com). It rejects B if the list does not contain the IP 10.138.0.2, with the error tls: 10.138.0.2 does not match any of DNSNames ["b.com"].
  • Auth supports JWT tokens.
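
The IP SAN rule from the list above (deny a peer whose remote IP is not among the cert's IP SANs) can be tried out with the standard library's x509.Certificate.VerifyHostname. This is a sketch under stated assumptions, not the server's code path: checkRemoteIP is a hypothetical helper, and the certificate is built in memory from the example addresses rather than parsed from a real peer cert.

```go
package main

import (
	"crypto/x509"
	"fmt"
	"net"
)

// checkRemoteIP is a hypothetical helper. It builds an in-memory certificate
// carrying the given IP SANs and verifies the remote IP against them.
func checkRemoteIP(ipSANs []string, remoteIP string) error {
	cert := &x509.Certificate{}
	for _, s := range ipSANs {
		ip := net.ParseIP(s)
		if ip == nil {
			return fmt.Errorf("invalid IP SAN %q", s)
		}
		cert.IPAddresses = append(cert.IPAddresses, ip)
	}
	// For an IP literal, VerifyHostname compares against IPAddresses only.
	return cert.VerifyHostname(remoteIP)
}

func main() {
	// Peer B's cert lists 10.138.0.27, but B connects from 10.138.0.2.
	fmt.Println("mismatch:", checkRemoteIP([]string{"10.138.0.27"}, "10.138.0.2"))
	// A cert that lists the actual remote IP passes the check.
	fmt.Println("match:", checkRemoteIP([]string{"10.138.0.2"}, "10.138.0.2"))
}
```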

Added

  • RPCs
    • Add Election, Lock service.
  • Native client etcdserver/api/v3client
    • client "embedded" in the server.
  • Logging, monitoring
    • Server warns large snapshot operations.

etcd

  • Add --enable-v2 flag to enable v2 API server.
    • --enable-v2=true by default.
  • Add --auth-token flag.
  • v3.2 compactor runs every hour.
    • Compactor only supports periodic compaction.
    • Compactor continues to record latest revisions every 5 minutes.
    • Every hour, it uses the last revision that was fetched before the compaction period, from the revision records that were collected every 5 minutes.
    • That is, every hour, the compactor discards historical data created before the compaction period.
    • The retention window of the compaction period moves to the next hour.
    • For instance, when hourly writes are 100 and --auto-compaction-retention=10, v3.1 compacts revisions 1000, 2000, and 3000 every 10 hours, while v3.2 compacts revisions 1000, 1100, and 1200 every hour.
    • If compaction succeeds or the requested revision has already been compacted, it resets the period timer and removes the used compacted revision from the historical revision records (e.g. start the next revision collection and compaction from previously collected revisions).
    • If compaction fails, it retries in 5 minutes.
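
The revision numbers in the v3.1 versus v3.2 example above follow from simple arithmetic, sketched below. The function compactedRevisions is hypothetical and only reproduces the numbers from the example; it does not model the 5-minute revision records or the retry on failure.

```go
package main

import "fmt"

// compactedRevisions is a hypothetical helper. It returns the first n
// compacted revisions when writesPerHour revisions are created each hour,
// the retention window is retentionHours, and the compactor runs every
// intervalHours.
func compactedRevisions(writesPerHour, retentionHours, intervalHours, n int) []int {
	revs := make([]int, 0, n)
	// The first compaction targets the revision at the end of the first
	// retention window.
	rev := writesPerHour * retentionHours
	for i := 0; i < n; i++ {
		revs = append(revs, rev)
		// Each later run moves forward by the writes made in one interval.
		rev += writesPerHour * intervalHours
	}
	return revs
}

func main() {
	// v3.1 runs once per retention window (every 10 hours in the example).
	fmt.Println("v3.1:", compactedRevisions(100, 10, 10, 3)) // v3.1: [1000 2000 3000]
	// v3.2 runs every hour with the same 10-hour retention.
	fmt.Println("v3.2:", compactedRevisions(100, 10, 1, 3)) // v3.2: [1000 1100 1200]
}
```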

clientv3

  • STM prefetching.
  • Add namespace feature.
  • Add ErrOldCluster with server version checking.
  • Translate WithPrefix() into WithFromKey() for empty key.
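
The empty-key translation in the last item above can be illustrated with a standalone sketch of how a prefix maps to a key range. It is not the clientv3 source: prefixRange is a hypothetical function, and it relies on the etcd range convention that a range end of "\x00" means all keys greater than or equal to the start key.

```go
package main

import "fmt"

// prefixRange is a hypothetical helper. It returns the [start, end) key
// range covering every key that begins with prefix. An end of "\x00" means
// "all keys greater than or equal to start" (the from-key behavior).
func prefixRange(prefix string) (start, end string) {
	if prefix == "" {
		// An empty prefix matches every key, so fall back to a from-key
		// range that starts at the smallest key.
		return "\x00", "\x00"
	}
	b := []byte(prefix)
	// Increment the last byte that is not 0xff and cut everything after it.
	for i := len(b) - 1; i >= 0; i-- {
		if b[i] < 0xff {
			b[i]++
			return prefix, string(b[:i+1])
		}
	}
	// Every byte is 0xff: no larger prefix exists, so use from-key.
	return prefix, "\x00"
}

func main() {
	for _, p := range []string{"foo", ""} {
		start, end := prefixRange(p)
		fmt.Printf("prefix=%q start=%q end=%q\n", p, start, end)
	}
}
```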

v3 etcdctl

  • Add check perf command.
  • Add --from-key flag to role grant-permission command.
  • lock command takes an optional command to execute.

Fixed: v2

  • Allow snapshot over 512MB.

grpc-proxy

  • Proxy endpoint discovery.
  • Namespaces.
  • Coalesce lease requests.

gateway

Other

  • v3 client
    • concurrency package's elections updated to match RPC interfaces.
    • let client dial endpoints not in the balancer.
  • Release
    • Annotate acbuild with supports-systemd-notify.
    • Add nsswitch.conf to Docker container image.
    • Add ppc64le, arm64 (experimental) builds.

Go