spec

Terminology

| Term | Definition |
| --- | --- |
| VolumeID | The identifier of the volume generated by the plugin. |
| CO | Container Orchestration system that communicates with plugins using CSI service RPCs. |
| SP | Storage Provider, the vendor of a CSI plugin implementation. |
| DR | Disaster Recovery. |
| RPC | Remote Procedure Call. |

Objective

Define a standard that will enable storage vendors (SPs) to develop DR controllers/plugins that can talk to the different CO systems.

Goals in MVP

The new standard will

  • Provide an API at volume-level granularity.
  • Enable SP authors to write one replication-compliant plugin that “just works” across all COs that implement the RPCs.
  • Define an API (RPCs) to:
    • Enable/Disable volume replication.
    • Promote/Demote a volume.
    • Resync a volume to repair its replication state before the volume is used.

Non-Goals in MVP

  • Replication at other levels of granularity.
  • Replication of volume snapshots.

Solution Overview

This specification defines an interface, along with the minimum operational and packaging recommendations, for a storage provider (SP) to implement a replication-compatible plugin. The interface declares the RPCs that a plugin MUST expose.

Architecture

[Figure: architecture diagram]

RPC Interface

  • Controller Service: The Controller plugin MUST implement this set of RPCs.
syntax = "proto3";
package replication;

import "github.com/container-storage-interface/spec/lib/go/csi/csi.proto";
import "google/protobuf/descriptor.proto";
import "google/protobuf/duration.proto";
import "google/protobuf/timestamp.proto";

option go_package = ".;replication";

extend google.protobuf.FieldOptions {
  // Indicates that this field is OPTIONAL and part of an experimental
  // API that may be deprecated and eventually removed between minor
  // releases.
  bool alpha_field = 1100;
}

// Controller holds the RPC methods for replication and all the methods it
// exposes should be idempotent.
service Controller {
  // EnableVolumeReplication RPC call to enable the volume replication.
  rpc EnableVolumeReplication (EnableVolumeReplicationRequest)
  returns (EnableVolumeReplicationResponse) {}
  // DisableVolumeReplication RPC call to disable the volume replication.
  rpc DisableVolumeReplication (DisableVolumeReplicationRequest)
  returns (DisableVolumeReplicationResponse) {}
  // PromoteVolume RPC call to promote the volume.
  rpc PromoteVolume (PromoteVolumeRequest)
  returns (PromoteVolumeResponse) {}
  // DemoteVolume RPC call to demote the volume.
  rpc DemoteVolume (DemoteVolumeRequest)
  returns (DemoteVolumeResponse) {}
  // ResyncVolume RPC call to resync the volume.
  rpc ResyncVolume (ResyncVolumeRequest)
  returns (ResyncVolumeResponse) {}
  // GetVolumeReplicationInfo RPC call to get the volume replication
  // information.
  rpc GetVolumeReplicationInfo (GetVolumeReplicationInfoRequest)
  returns (GetVolumeReplicationInfoResponse) {}
}
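
The service comment above asks that every method be idempotent. As a rough illustration of what that could mean for a plugin backend, here is a minimal in-memory sketch in Python. The class `FakeReplicationBackend`, the `State` names, and the choice that a newly enabled volume starts as secondary are all assumptions made for the example; the spec does not define them.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names for illustration; the spec does not define them.
    DISABLED = "disabled"
    SECONDARY = "secondary"
    PRIMARY = "primary"


class FakeReplicationBackend:
    """In-memory stand-in for a plugin backend, keyed by volume ID."""

    def __init__(self):
        self._state = {}

    def state(self, volume_id):
        return self._state.get(volume_id, State.DISABLED)

    def enable(self, volume_id):
        # Repeating the call on an already-enabled volume changes nothing.
        # Assumption: a newly enabled volume starts out as secondary.
        if self.state(volume_id) is State.DISABLED:
            self._state[volume_id] = State.SECONDARY

    def disable(self, volume_id):
        # Disabling an already-disabled volume is a no-op, not an error.
        self._state.pop(volume_id, None)

    def promote(self, volume_id):
        if self.state(volume_id) is State.DISABLED:
            raise RuntimeError("FAILED_PRECONDITION: replication is not enabled")
        # Promoting an already-primary volume leaves it primary.
        self._state[volume_id] = State.PRIMARY

    def demote(self, volume_id):
        if self.state(volume_id) is State.DISABLED:
            raise RuntimeError("FAILED_PRECONDITION: replication is not enabled")
        # Demoting an already-secondary volume leaves it secondary.
        self._state[volume_id] = State.SECONDARY
```

Calling any method twice in a row leaves the backend in the same state as calling it once, which is the property a CO relies on when it retries after a timeout or a restart.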

EnableVolumeReplication

// EnableVolumeReplicationRequest holds the required information to enable
// replication on a volume.
message EnableVolumeReplicationRequest {
  // The identifier for this volume, generated by the plugin during
  // CreateVolume CSI RPC call.
  // This field is OPTIONAL.
  // This field MUST contain enough information to uniquely identify
  // this specific volume vs all other volumes supported by this plugin.
  // This field SHALL be used by the CO in subsequent calls to refer to
  // this volume.
  // This field is deprecated. Please use "replication_source" to
  // specify the replication source.
  string volume_id = 1 [deprecated = true];
  // Plugin specific parameters passed in as opaque key-value pairs.
  map<string, string> parameters = 2;
  // Secrets required by the plugin to complete the request.
  map<string, string> secrets = 3 [(csi.v1.csi_secret) = true];
  // The identifier for the replication.
  // This field is OPTIONAL.
  // This field MUST contain enough information, together with volume_id,
  // to uniquely identify this specific replication
  // vs all other replications supported by this plugin.
  string replication_id = 4 [(alpha_field) = true];
  // If specified, this field will contain volume or volume group id
  // for replication.
  ReplicationSource replication_source = 5;
}

// EnableVolumeReplicationResponse holds the information to send when
// replication is successfully enabled on a volume.
message EnableVolumeReplicationResponse {
}
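
Because the request carries both the deprecated volume_id and the newer replication_source, a plugin has to decide which one to act on. The Python sketch below shows one possible resolution order. The function name `resolve_source`, its flat argument list, and the rule that replication_source takes precedence over volume_id are assumptions for illustration; the spec does not state a precedence.

```python
from typing import Optional, Tuple


def resolve_source(volume_id: str = "",
                   source_volume_id: Optional[str] = None,
                   source_volume_group_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (kind, id) for the object to replicate.

    Assumption: replication_source wins over the deprecated volume_id
    when both are present.
    """
    if source_volume_id:
        return ("volume", source_volume_id)
    if source_volume_group_id:
        return ("volumegroup", source_volume_group_id)
    if volume_id:
        # Legacy callers that still populate the deprecated field.
        return ("volume", volume_id)
    # Nothing identifies the target: maps to INVALID_ARGUMENT (3).
    raise ValueError("INVALID_ARGUMENT: no volume_id or replication_source")
```

The same helper would apply unchanged to the other request messages, since they all carry the same pair of fields.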

Error Scheme

| Condition | gRPC Code | Description | Recovery Behavior |
| --- | --- | --- | --- |
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. | Caller MUST fix the request by adding the missing required field before retrying. |
| Replication Source does not exist | 5 NOT_FOUND | Indicates that the specified source does not exist. | Caller MUST verify that the replication_source is correct, and that the source is accessible and has not been deleted, before retrying with exponential back off. |
| Operation pending for Replication Source | 10 ABORTED | Indicates that there is already an operation pending for the specified replication_source. In general, the CO is responsible for ensuring that there is no more than one call "in-flight" per replication_source at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same replication_source. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified replication_source, and then retry with exponential back off. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |
| Error is Unknown | 2 UNKNOWN | Indicates that an unknown error occurred. | Caller MUST study the logs before retrying. |
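
The recovery behaviors in the table above fall into a few caller-side actions. The following Python sketch maps the gRPC codes used across this document's error tables to such actions and computes an exponential back off schedule. The action labels, the helper names `next_action` and `backoff_delays`, and the base, factor, and cap values are illustrative assumptions, not part of the spec.

```python
from enum import IntEnum


class Code(IntEnum):
    # Numeric values are the standard gRPC status codes listed in the tables.
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    NOT_FOUND = 5
    FAILED_PRECONDITION = 9
    ABORTED = 10
    UNIMPLEMENTED = 12
    UNAUTHENTICATED = 16


# Codes for which the tables ask for a retry with exponential back off.
BACKOFF_CODES = {Code.NOT_FOUND, Code.ABORTED}


def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=5):
    """Return the wait in seconds before each retry, growing up to a cap."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def next_action(code):
    """Classify a status code into a caller-side action (labels are made up)."""
    if code in BACKOFF_CODES:
        return "retry-with-backoff"
    if code == Code.UNIMPLEMENTED:
        return "do-not-retry"
    # INVALID_ARGUMENT, FAILED_PRECONDITION, UNAUTHENTICATED, UNKNOWN:
    # the caller has to change something (request, state, secrets) or
    # inspect the logs before trying again.
    return "fix-then-retry"
```

A real CO would typically add jitter to the delays so that many callers do not retry in lockstep; that is left out here to keep the schedule easy to read.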

DisableVolumeReplication

// DisableVolumeReplicationRequest holds the required information to disable
// replication on a volume.
message DisableVolumeReplicationRequest {
  // The identifier for this volume, generated by the plugin during
  // CreateVolume CSI RPC call.
  // This field is OPTIONAL.
  // This field MUST contain enough information to uniquely identify
  // this specific volume vs all other volumes supported by this plugin.
  // This field SHALL be used by the CO in subsequent calls to refer to
  // this volume.
  // This field is deprecated. Please use "replication_source" to
  // specify the replication source.
  string volume_id = 1 [deprecated = true];
  // Plugin specific parameters passed in as opaque key-value pairs.
  map<string, string> parameters = 2;
  // Secrets required by the plugin to complete the request.
  map<string, string> secrets = 3 [(csi.v1.csi_secret) = true];
  // The identifier for the replication.
  // This field is OPTIONAL.
  // This field MUST contain enough information, together with volume_id,
  // to uniquely identify this specific replication
  // vs all other replications supported by this plugin.
  string replication_id = 4 [(alpha_field) = true];
  // If specified, this field will contain volume or volume group id
  // for replication.
  ReplicationSource replication_source = 5;
}

// DisableVolumeReplicationResponse holds the information to send when
// replication is successfully disabled on a volume.
message DisableVolumeReplicationResponse {
}

Error Scheme

| Condition | gRPC Code | Description | Recovery Behavior |
| --- | --- | --- | --- |
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. | Caller MUST fix the request by adding the missing required field before retrying. |
| Replication Source does not exist | 5 NOT_FOUND | Indicates that the specified source does not exist. | Caller MUST verify that the replication_source is correct, and that the source is accessible and has not been deleted, before retrying with exponential back off. |
| Operation pending for Replication Source | 10 ABORTED | Indicates that there is already an operation pending for the specified replication_source. In general, the CO is responsible for ensuring that there is no more than one call "in-flight" per replication_source at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same replication_source. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified replication_source, and then retry with exponential back off. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |
| Error is Unknown | 2 UNKNOWN | Indicates that an unknown error occurred. | Caller MUST study the logs before retrying. |
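
The ABORTED row describes a plugin rejecting a second call while one is already running for the same replication_source. One way a plugin might track this is a small in-flight registry, sketched below in Python. The names `InFlight`, `guard`, and `OperationPending` are hypothetical, and the exception merely stands in for returning a gRPC ABORTED status.

```python
import threading
from contextlib import contextmanager


class OperationPending(Exception):
    """Stands in for a gRPC ABORTED (10) status in this sketch."""


class InFlight:
    """Tracks which replication sources currently have an RPC running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = set()

    @contextmanager
    def guard(self, source_id):
        # Claim the source, or reject the call if it is already claimed.
        with self._lock:
            if source_id in self._busy:
                raise OperationPending(f"operation pending for {source_id}")
            self._busy.add(source_id)
        try:
            yield
        finally:
            # Release the claim even if the guarded operation failed.
            with self._lock:
                self._busy.discard(source_id)
```

An RPC handler would wrap its body in `with inflight.guard(source_id):` so that a concurrent call for the same source fails fast while calls for other sources proceed.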

PromoteVolume

// PromoteVolumeRequest holds the required information to promote volume as a
// primary on local cluster.
message PromoteVolumeRequest {
  // The identifier for this volume, generated by the plugin during
  // CreateVolume CSI RPC call.
  // This field is OPTIONAL.
  // This field MUST contain enough information to uniquely identify
  // this specific volume vs all other volumes supported by this plugin.
  // This field SHALL be used by the CO in subsequent calls to refer to
  // this volume.
  // This field is deprecated. Please use "replication_source" to
  // specify the replication source.
  string volume_id = 1 [deprecated = true];
  // Forces the promote operation.
  // This field is OPTIONAL. The default value is false.
  bool force = 2;
  // Plugin specific parameters passed in as opaque key-value pairs.
  map<string, string> parameters = 3;
  // Secrets required by the plugin to complete the request.
  map<string, string> secrets = 4 [(csi.v1.csi_secret) = true];
  // The identifier for the replication.
  // This field is OPTIONAL.
  // This field MUST contain enough information, together with volume_id,
  // to uniquely identify this specific replication
  // vs all other replications supported by this plugin.
  string replication_id = 5 [(alpha_field) = true];
  // If specified, this field will contain volume or volume group id
  // for replication.
  ReplicationSource replication_source = 6;
}

// PromoteVolumeResponse holds the information to send when
// volume is successfully promoted.
message PromoteVolumeResponse {
}

Error Scheme

| Condition | gRPC Code | Description | Recovery Behavior |
| --- | --- | --- | --- |
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. | Caller MUST fix the request by adding the missing required field before retrying. |
| Replication Source does not exist | 5 NOT_FOUND | Indicates that the specified source does not exist. | Caller MUST verify that the replication_source is correct, and that the source is accessible and has not been deleted, before retrying with exponential back off. |
| Replication Source is not replicated | 9 FAILED_PRECONDITION | Indicates that the Source corresponding to the specified replication_source could not be promoted due to a failed precondition (for example replication is not enabled or the Source cannot be promoted without the force option). | Caller SHOULD ensure that replication is enabled. |
| Operation pending for Replication Source | 10 ABORTED | Indicates that there is already an operation pending for the specified replication_source. In general, the CO is responsible for ensuring that there is no more than one call "in-flight" per replication_source at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same replication_source. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified replication_source, and then retry with exponential back off. |
| Call not implemented | 12 UNIMPLEMENTED | The invoked RPC is not implemented by the Plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |
| Error is Unknown | 2 UNKNOWN | Indicates that an unknown error occurred. | Caller MUST study the logs before retrying. |

DemoteVolume

// DemoteVolumeRequest holds the required information to demote volume on local
// cluster.
message DemoteVolumeRequest {
  // The identifier for this volume, generated by the plugin during
  // CreateVolume CSI RPC call.
  // This field is OPTIONAL.
  // This field MUST contain enough information to uniquely identify
  // this specific volume vs all other volumes supported by this plugin.
  // This field SHALL be used by the CO in subsequent calls to refer to
  // this volume.
  // This field is deprecated. Please use "replication_source" to
  // specify the replication source.
  string volume_id = 1 [deprecated = true];
  // Forces the demote operation.
  // This field is OPTIONAL. The default value is false.
  bool force = 2;
  // Plugin specific parameters passed in as opaque key-value pairs.
  map<string, string> parameters = 3;
  // Secrets required by the plugin to complete the request.
  map<string, string> secrets = 4 [(csi.v1.csi_secret) = true];
  // The identifier for the replication.
  // This field is OPTIONAL.
  // This field MUST contain enough information, together with volume_id,
  // to uniquely identify this specific replication
  // vs all other replications supported by this plugin.
  string replication_id = 5 [(alpha_field) = true];
  // If specified, this field will contain volume or volume group id
  // for replication.
  ReplicationSource replication_source = 6;
}

// DemoteVolumeResponse holds the information to send when
// volume is successfully demoted.
message DemoteVolumeResponse {
}

Error Scheme

| Condition | gRPC Code | Description | Recovery Behavior |
| --- | --- | --- | --- |
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. | Caller MUST fix the request by adding the missing required field before retrying. |
| Replication Source does not exist | 5 NOT_FOUND | Indicates that the specified source does not exist. | Caller MUST verify that the replication_source is correct, and that the source is accessible and has not been deleted, before retrying with exponential back off. |
| Replication Source is not replicated | 9 FAILED_PRECONDITION | Indicates that the Replication Source corresponding to the specified replication_source could not be demoted due to a failed precondition (for example replication is not enabled). | Caller SHOULD ensure that replication is enabled. |
| Operation pending for Replication Source | 10 ABORTED | Indicates that there is already an operation pending for the specified replication_source. In general, the CO is responsible for ensuring that there is no more than one call "in-flight" per replication_source at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same replication_source. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified replication_source, and then retry with exponential back off. |
| Call not implemented | 12 UNIMPLEMENTED | The invoked RPC is not implemented by the Plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |
| Error is Unknown | 2 UNKNOWN | Indicates that an unknown error occurred. | Caller MUST study the logs before retrying. |

ResyncVolume

// ResyncVolumeRequest holds the required information to resync volume.
message ResyncVolumeRequest {
  // The identifier for this volume, generated by the plugin during
  // CreateVolume CSI RPC call.
  // This field is OPTIONAL.
  // This field MUST contain enough information to uniquely identify
  // this specific volume vs all other volumes supported by this plugin.
  // This field SHALL be used by the CO in subsequent calls to refer to
  // this volume.
  // This field is deprecated. Please use "replication_source" to
  // specify the replication source.
  string volume_id = 1 [deprecated = true];
  // Forces the resync operation.
  // This field is OPTIONAL. The default value is false.
  bool force = 2;
  // Plugin specific parameters passed in as opaque key-value pairs.
  map<string, string> parameters = 3;
  // Secrets required by the plugin to complete the request.
  map<string, string> secrets = 4 [(csi.v1.csi_secret) = true];
  // The identifier for the replication.
  // This field is OPTIONAL.
  // This field MUST contain enough information, together with volume_id,
  // to uniquely identify this specific replication
  // vs all other replications supported by this plugin.
  string replication_id = 5 [(alpha_field) = true];
  // If specified, this field will contain volume or volume group id
  // for replication.
  ReplicationSource replication_source = 6;
}

// ResyncVolumeResponse holds the information to send when
// volume is successfully resynced.
message ResyncVolumeResponse {
  // Indicates that the volume is ready to use.
  // The default value is false.
  // This field is REQUIRED.
  bool ready = 1;
}

Error Scheme

| Condition | gRPC Code | Description | Recovery Behavior |
| --- | --- | --- | --- |
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. | Caller MUST fix the request by adding the missing required field before retrying. |
| Replication Source does not exist | 5 NOT_FOUND | Indicates that the specified source does not exist. | Caller MUST verify that the replication_source is correct, and that the source is accessible and has not been deleted, before retrying with exponential back off. |
| Replication Source is not replicated or Replication Source is not demoted | 9 FAILED_PRECONDITION | Indicates that the replication_source could not be resynced due to a failed precondition (for example replication is not enabled on the replication_source or the replication_source is not in the demoted state). | Caller SHOULD ensure that replication is enabled and the Replication Source is demoted. |
| Operation pending for Replication Source | 10 ABORTED | Indicates that there is already an operation pending for the specified replication_source. In general, the CO is responsible for ensuring that there is no more than one call "in-flight" per replication_source at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same replication_source. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified replication_source, and then retry with exponential back off. |
| Call not implemented | 12 UNIMPLEMENTED | The invoked RPC is not implemented by the Plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |
| Error is Unknown | 2 UNKNOWN | Indicates that an unknown error occurred. | Caller MUST study the logs before retrying. |
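
ResyncVolumeResponse carries a ready flag, and the RPCs are meant to be idempotent, so one plausible caller pattern is to repeat the call until ready is true. The Python sketch below assumes exactly that. The function `wait_until_ready`, its `resync` callable (a stand-in for issuing the ResyncVolume RPC and reading the ready field), and the fixed delay are hypothetical.

```python
import time


def wait_until_ready(resync, attempts=5, delay=1.0, sleep=time.sleep):
    """Call resync() until it reports ready=True or attempts run out.

    resync is a hypothetical callable standing in for the ResyncVolume
    RPC; it returns the boolean `ready` field of the response.
    """
    for attempt in range(attempts):
        if resync():
            return True
        # Wait between attempts, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The `sleep` parameter is injected only so the loop can be exercised without real waiting; a caller would normally leave it at its default.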

GetVolumeReplicationInfo

// GetVolumeReplicationInfoRequest holds the required information to get
// the Volume replication information.
message GetVolumeReplicationInfoRequest {
  // The identifier for this volume, generated by the plugin during
  // CreateVolume CSI RPC call.
  // This field is OPTIONAL.
  // This field MUST contain enough information to uniquely identify
  // this specific volume vs all other volumes supported by this plugin.
  // This field SHALL be used by the CO in subsequent calls to refer to
  // this volume.
  // This field is deprecated. Please use "replication_source" to
  // specify the replication source.
  string volume_id = 1 [deprecated = true];
  // Secrets required by the plugin to complete the request.
  map<string, string> secrets = 2 [(csi.v1.csi_secret) = true];
  // The identifier for the replication.
  // This field is OPTIONAL.
  // This field MUST contain enough information, together with volume_id,
  // to uniquely identify this specific replication
  // vs all other replications supported by this plugin.
  string replication_id = 3 [(alpha_field) = true];
  // If specified, this field will contain volume or volume group id
  // for replication.
  ReplicationSource replication_source = 4;
}

// GetVolumeReplicationInfoResponse holds the information to send the
// volume replication information.
message GetVolumeReplicationInfoResponse {
  // Holds the last sync time.
  // This field is REQUIRED.
  .google.protobuf.Timestamp last_sync_time = 1;
  // Holds the last sync duration.
  // last_sync_duration states the time taken
  // to execute the last sync operation.
  // This field is OPTIONAL.
  .google.protobuf.Duration last_sync_duration = 2;
  // Holds the last sync bytes.
  // Represents number of bytes transferred
  // with the last synchronization.
  // This field is OPTIONAL.
  // The value of this field MUST NOT be negative.
  int64 last_sync_bytes = 3;
}
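
The three response fields can be combined into figures a CO might report, such as how stale the last sync is and the effective transfer rate. The Python sketch below is one way to do that, using `datetime` and `timedelta` as stand-ins for the protobuf Timestamp and Duration types. The function `summarize_sync` and its return shape are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def summarize_sync(last_sync_time, last_sync_duration, last_sync_bytes, now):
    """Return (age_seconds, bytes_per_second or None).

    last_sync_duration may be None or zero because the field is OPTIONAL;
    in that case no transfer rate can be derived.
    """
    if last_sync_bytes < 0:
        raise ValueError("last_sync_bytes MUST NOT be negative")
    # How long ago the last sync completed.
    age = (now - last_sync_time).total_seconds()
    seconds = last_sync_duration.total_seconds() if last_sync_duration else 0
    rate = last_sync_bytes / seconds if seconds > 0 else None
    return age, rate
```

For example, a sync that finished five minutes before `now` and moved 1000 bytes in 10 seconds yields an age of 300 seconds and a rate of 100 bytes per second.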

Error Scheme

| Condition | gRPC Code | Description | Recovery Behavior |
| --- | --- | --- | --- |
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. | Caller MUST fix the request by adding the missing required field before retrying. |
| Replication Source does not exist or details not found | 5 NOT_FOUND | Indicates that a Replication Source does not exist or details of the replication_source are not available at the moment. | Caller MUST verify that the replication_source is correct, and that the replication_source is accessible and has not been deleted, before retrying with exponential back off. |
| Replication Source is not replicated or not promoted | 9 FAILED_PRECONDITION | Indicates that the replication information corresponding to the specified replication_source could not be retrieved due to a failed precondition (for example replication is not enabled or the source is not in the primary state). | Caller SHOULD ensure that replication is enabled on the replication_source and that it is promoted. |
| Operation pending for Replication Source | 10 ABORTED | Indicates that there is already an operation pending for the specified replication_source. In general, the CO is responsible for ensuring that there is no more than one call "in-flight" per replication_source at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same replication_source. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified replication_source, and then retry with exponential back off. |
| Call not implemented | 12 UNIMPLEMENTED | The invoked RPC is not implemented by the Plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |
| Error is Unknown | 2 UNKNOWN | Indicates that an unknown error occurred. | Caller MUST study the logs before retrying. |

ReplicationSource

// Specifies what source the replication will be created from. One of the
// type fields MUST be specified.
message ReplicationSource {
  // VolumeSource contains the details about the volume to be replicated
  message VolumeSource {
    // Contains identity information for the existing volume.
    // This field is REQUIRED.
    string volume_id = 1;
  }
  // VolumeGroupSource contains the details about
  // the volume group to be replicated
  message VolumeGroupSource {
    // Contains identity information for the existing volume group.
    // This field is REQUIRED.
    string volume_group_id = 1;
  }

  oneof type {
    // Volume source type
    VolumeSource volume = 1;
    // Volume group source type
    VolumeGroupSource volumegroup = 2;
  }
}
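
The oneof in ReplicationSource means a source is either a volume or a volume group, never both, and the message comment requires one of them to be specified. A small Python model can make that constraint explicit. The dataclass below is a sketch for illustration only: it is not generated protobuf code, and flattening the nested messages into two optional string fields is a simplification chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicationSource:
    # Mirrors the two oneof members, flattened to their ID strings.
    volume_id: Optional[str] = None
    volume_group_id: Optional[str] = None

    def __post_init__(self):
        # Enforce "one of the type fields MUST be specified", and the
        # oneof rule that both cannot be set at once.
        set_fields = [v for v in (self.volume_id, self.volume_group_id) if v]
        if len(set_fields) != 1:
            raise ValueError("exactly one of volume or volumegroup MUST be set")

    @property
    def kind(self) -> str:
        return "volume" if self.volume_id else "volumegroup"
```

Code that receives such an object can branch on `kind` instead of checking both fields, which mirrors how generated protobuf code exposes which oneof member is set.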