
PDP 27 (Admin Tools)

co-jo edited this page Oct 26, 2020 · 1 revision

Status: Under discussion

Related issues: Issue 2228.

Summary

This proposal outlines the need for a set of admin tools that can help debug and fix problems with a Pravega cluster, and describes how they can be accessed. These tools are meant for accessing Pravega data structures (in ZooKeeper, Tier1, etc.) and modifying them in order to fix any problems that may have led to an unrecoverable state. They are not meant for everyday deployment activities, such as starting or stopping services, provisioning and so on.

Goals

Shell:

  • Run from the command line (no UI) or from the IDE
    • The IDE is desired since some problems are so subtle they may require stepping through the debugger to properly diagnose them.
  • Be able to load/specify a configuration compatible with that already loaded into Pravega (SegmentStore and/or Controller)
  • Support a pluggable model that makes it easy to add new commands.
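
One way to picture the pluggable model is a registry keyed by component and command name, where adding a command means registering one more handler. The sketch below is only an illustration: the names `register`, `dispatch` and `_REGISTRY` are hypothetical and not part of Pravega, and the handler body is a placeholder.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: maps (component, command) to a handler function.
Handler = Callable[[List[str]], str]
_REGISTRY: Dict[Tuple[str, str], Handler] = {}


def register(component: str, command: str) -> Callable[[Handler], Handler]:
    """Decorator that plugs a handler into the shell under component/command."""
    def wrap(handler: Handler) -> Handler:
        key = (component, command)
        if key in _REGISTRY:
            raise ValueError(f"duplicate command: {component} {command}")
        _REGISTRY[key] = handler
        return handler
    return wrap


def dispatch(component: str, command: str, args: List[str]) -> str:
    """Look up and run the handler; unknown commands raise KeyError."""
    handler = _REGISTRY.get((component, command))
    if handler is None:
        raise KeyError(f"unknown command: {component} {command}")
    return handler(args)


@register("bookkeeper", "list-logs")
def list_logs(args: List[str]) -> str:
    # Placeholder body; a real handler would read BookKeeperLog metadata.
    return "list-logs called with %d args" % len(args)
```

With this shape, a new command only needs its own decorated function; the shell loop itself does not change.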

The following are an initial set of goals for individual components. They will evolve as the need arises and as we gain more experience with Pravega itself.

Tier1/BookKeeper:

  • Examine the BookKeeperLog metadata for a particular Log (stored in ZooKeeper)
  • Edit/Delete the contents of a BookKeeperLog metadata
  • Execute orphan ledger cleanup (see Issue 1165)

SegmentStore:

  • Be able to diagnose a particular issue with Container Recovery (i.e., why recovery fails)
    • This should only be used for data corruption failures that would prevent a Container from ever recovering again. Failures due to external factors (such as full disks or lost network connectivity) are considered transient and can be fixed by a system admin.
  • Be able to add, replace or delete the contents of an Operation in the Tier1 DurableLog (presumably this operation is the cause of the recovery failure).
  • Be able to replace the contents of a DataFrame written in Tier1 (if this entire frame is believed to be corrupted)
    • This could go into the Tier1/BookKeeper section above. However, to Tier1/BookKeeper every entry is just a sequence of bytes, and BookKeeper guarantees the correctness of those bytes. A DataFrame is a SegmentStore concept with its own layout and, as such, is prone to errors stemming from badly written code.

Command Syntax and Sample commands

Every command inside the Shell should follow this syntax:

<component> <command> <args>

Where:

  • <component> is the main target of the command (e.g., bookkeeper, container, controller)
  • <command> is the actual target-specific command to execute
  • <args> is a space-separated list of arguments that can be passed to the command. TBD whether this will be a key=value list or something else.
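
As a minimal sketch of this syntax, the following splits one input line into its three parts. It assumes plain positional, space-separated arguments (the argument format is still TBD in this proposal), and `ParsedCommand` and `parse_line` are hypothetical names. The single-word `config` sample command does not fit the `<component> <command>` shape and would need separate handling.

```python
import shlex
from typing import List, NamedTuple


class ParsedCommand(NamedTuple):
    component: str
    command: str
    args: List[str]


def parse_line(line: str) -> ParsedCommand:
    """Split '<component> <command> <args>' into its three parts."""
    tokens = shlex.split(line)  # honors quotes, so an argument may contain spaces
    if len(tokens) < 2:
        raise ValueError("expected: <component> <command> [<args>...]")
    return ParsedCommand(tokens[0], tokens[1], tokens[2:])
```

For example, `parse_line("container recover 3 OperationSummary")` yields the component `container`, the command `recover` and the argument list `["3", "OperationSummary"]`.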

Sample commands (actual list TBD later based on needs):

  • config <key1=value1> ...: sets or updates the configuration to use for subsequent commands.
    • Key here refers to the keys in the config file (e.g., pravegaservice.listeningPort), so essentially this would be in lieu of an actual config file.
    • Most of the uses for this would be to set the ZK address so that subsequent commands can connect appropriately.
  • bookkeeper list-logs: list all BK Logs (no details)
  • bookkeeper log-meta <log-id>: output info about a log (metadata)
  • bookkeeper log-details <log-id>: output info about a log (metadata + bk ledgers)
  • bookkeeper log-delete <log-id>: delete a log's contents
  • bookkeeper ledger-cleanup: delete orphan ledgers
  • container set-status <container-id> <online|offline>:
    • online: the Container can be used by SegmentStore, but it cannot be debugged.
    • offline: SegmentStore cannot load the Container, but it can be debugged.
  • container recover <container-id> <verbosity>: do a dummy recovery of the Container, without processing anything or otherwise affecting the state of the data in Tier1 (still have a ContainerMetadata, InMemLog and a ReadIndex, but no actual cache storage)
    • verbosity: DataFrame|OperationSummary|OperationDetail (from just reading frames without extracting operations to just listing operations to actually validating operations)
  • container replace-op <seq-no> <TBD>: replace the operation that has the given SeqNo with another one. TBD how to load up the new operation contents.
  • container delete-op <seq-no>: variation of replace-op that replaces the operation with a no-op
  • container add-op <after-seq-no> <TBD>: insert an operation immediately after the existing operation with the given SeqNo.
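
The `config` command above could be handled roughly as follows. This is a sketch only: `apply_config` is a hypothetical helper, and the configuration is modeled as a plain string-to-string map rather than any actual Pravega configuration class.

```python
from typing import Dict, List


def apply_config(current: Dict[str, str], pairs: List[str]) -> Dict[str, str]:
    """Return a copy of current with each 'key=value' pair set or updated."""
    updated = dict(current)
    for pair in pairs:
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got: {pair!r}")
        updated[key] = value
    return updated
```

Returning a new map instead of mutating the old one keeps a failed command (one malformed pair) from leaving the shell with a half-applied configuration.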

Details

Container fixing

  • Replacing an Operation or DataFrame in an append-only log
    • This can be done by creating a secondary BookKeeperLog for a container log. Call it "Correction Log".
    • Write the overridden operations into this log, using exactly the same sequence numbers as in the original log
    • For overridden DataFrames, create a new operation which clearly indicates the DataFrame address to override; the payload of this operation will be the new contents.
    • Update the recovery algorithm to load up the contents of this "Correction Log" and use the data in it when reading from the main log during recovery (if an operation or Frame appears twice, use the one from the "Correction Log")
    • Update the truncation algorithm to truncate both logs when needed. Optionally delete the "Correction Log" if it is empty.
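
The recovery-time lookup described above can be sketched as an overlay of the "Correction Log" on the main log, keyed by sequence number. This is an illustration under simplifying assumptions: operations are modeled as (sequence number, payload) pairs, a `None` payload stands in for a no-op, and DataFrame overrides and inserted operations (add-op) are not covered. The name `read_with_corrections` is hypothetical.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

Operation = Tuple[int, Optional[bytes]]  # (sequence number, payload); None = no-op


def read_with_corrections(
    main_log: Iterable[Operation],
    correction_log: Iterable[Operation],
) -> Iterator[Operation]:
    """Yield main-log operations, preferring the Correction Log on a seq-no clash."""
    # Assumption: if the Correction Log has several entries for one sequence
    # number, the most recently written one wins.
    overrides: Dict[int, Optional[bytes]] = {}
    for seq_no, payload in correction_log:
        overrides[seq_no] = payload
    for seq_no, payload in main_log:
        if seq_no in overrides:
            yield seq_no, overrides[seq_no]
        else:
            yield seq_no, payload
```

The main log is never rewritten here, which matches the append-only constraint: both replace-op and delete-op become writes to the secondary log only.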

BookKeeper

  • Cleaning up orphan ledgers
    • Get a list of all Ledgers from BookKeeperAdmin.listLedgers (A)
    • Get a list of all used ledgers from all BookKeeperLog metadata (B)
    • Delete all ledgers in A - B, making sure that we don't delete ledgers that have just been created but not yet added to the metadata.
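
The A - B step can be sketched as a set difference with a guard for recently created ledgers. The guard used here, a minimum ledger age, is only one possible safeguard and is an assumption of this sketch; it also assumes that a creation time can be obtained for each ledger, which this proposal does not specify. `find_orphan_ledgers` is a hypothetical name.

```python
from typing import Dict, Iterable, Set


def find_orphan_ledgers(
    all_ledgers: Dict[int, float],
    used_ledgers: Iterable[int],
    now: float,
    min_age_seconds: float,
) -> Set[int]:
    """Return ledger ids in A - B that are at least min_age_seconds old.

    all_ledgers maps ledger id -> creation time in epoch seconds (set A).
    used_ledgers is the union of ledger ids across all log metadata (set B).
    """
    used = set(used_ledgers)
    return {
        ledger_id
        for ledger_id, created_at in all_ledgers.items()
        if ledger_id not in used and now - created_at >= min_age_seconds
    }
```

Separating the "find" step from the actual deletion also makes it possible to print the candidate list for review before anything is removed.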