Stellar developer docs for Galexie #1072

Merged
merged 11 commits into from
Nov 12, 2024
14 changes: 14 additions & 0 deletions config/sidebars.ts
@@ -26,6 +26,7 @@ const sidebars: SidebarsConfig = {
{ type: 'ref', id: 'data/rpc/README', label: 'Soroban RPC'},
{ type: 'ref', id: 'data/hubble/README', label: 'Hubble'},
{ type: 'ref', id: 'data/horizon/README', label: 'Horizon'},
{ type: 'ref', id: 'data/galexie/README', label: 'Galexie'},
],
tools: [
{
@@ -74,6 +75,19 @@ const sidebars: SidebarsConfig = {
collapsible: false,
},
],
galexie: [
{
type: 'category',
label: 'Galexie',
items: [
{
type: "autogenerated",
dirName: "data/galexie",
},
],
collapsible: false,
},
],
soroban_rpc: [
{
type: "category",
2 changes: 1 addition & 1 deletion docs/README.mdx
@@ -20,7 +20,7 @@ Information on how to issue assets on the Stellar network and create custom smar

### [Data](/docs/data/README.mdx)

Discover various data availability options: RPC, Hubble, and Horizon.
Discover various data availability options: RPC, Hubble, Horizon, and Galexie.

### [Tools](/docs/tools/README.mdx)

6 changes: 5 additions & 1 deletion docs/data/README.mdx
@@ -10,7 +10,7 @@ This section will walk you through the differences between the various platforms

- **[RPC](#rpc)** - live network gateway
- **[Horizon](#horizon)** - API for network state data
- **Galexie** - exports raw ledger metadata files
- **[Galexie](#galexie)** - exports raw ledger metadata files
- **[Hubble](#hubble)** - analytics database for network data

| Features | RPC | Horizon | Galexie | Hubble |
@@ -70,3 +70,7 @@ Horizon is an API for accessing and interacting with the Stellar network data. I
Horizon stores three types of data (current state, historical state, and derived state) in one database, and the data is available in real-time for transactional use, which makes Horizon more expensive and resource-intensive to operate. If you’re considering using Horizon over the RPC, let us know in the [Stellar Developer Discord](https://discord.gg/stellardev) or file an issue in the [RPC repo](https://github.com/stellar/soroban-rpc) and let us know why!

You can [run your own instance of Horizon](./horizon/admin-guide/README.mdx) or use one of the publicly available Horizon services from [these infrastructure providers](./horizon/horizon-providers.mdx).

## [Galexie](./galexie/README.mdx)

Galexie is a tool for exporting Stellar ledger metadata to Google Cloud Storage.
39 changes: 39 additions & 0 deletions docs/data/galexie/README.mdx
@@ -0,0 +1,39 @@
---
title: Galexie Introduction
sidebar_position: 0
---

## What is Galexie?

Galexie is a tool for extracting and processing Stellar ledger metadata and exporting it to external storage, creating a data lake of pre-processed ledger metadata. Galexie is the foundation of the Composable Data Pipeline (CDP) and serves as the first step in extracting raw Stellar ledger metadata and making it accessible. Learn more about CDP’s benefits and applications in this [blog post](https://stellar.org/blog/developers/composable-data-platform).

## What Are the Key Features of Galexie?

Galexie is designed to make exporting ledger metadata streamlined and efficient through a simple, user-friendly interface. Its key features include:

- Exporting Stellar ledger metadata to cloud storage.
- Exporting either a specified range of ledgers or a continuous stream of new ledgers as they are created on the Stellar network.
- Exporting ledger metadata in XDR, which is Stellar Core’s native format.
- Compressing data before export to optimize storage efficiency in the data lake.

**Galexie Architecture**

![Galexie architecture diagram](/assets/galexie-architecture.png)

## Why XDR Format?

Exporting data in XDR—the native Stellar Core format—enables Galexie to preserve full transaction metadata, ensuring data integrity while keeping storage efficient. The XDR format maintains compatibility with all Stellar components, providing a solid foundation for applications that require consistent access to historical data. Refer to the [XDR](/docs/learn/encyclopedia/data-format/xdr) documentation for more information on this format.

## Why Run Galexie?

Galexie enables you to make a copy of Stellar ledger metadata over which you have complete control. Galexie can continuously sync your data lake with the latest ledger data, freeing you from tedious data ingestion and allowing you to focus on building customized applications that consume and analyze the exported data.

## What Can You Do with the Data Lake Created by Galexie?

Once data is stored in the cloud, it becomes easily accessible for integration with modern data processing and analytics tools, enabling various workflows and insights.

The pre-processed ledger data exported by Galexie can be utilized across various applications, such as:

- Analytics Tools: Analyze trends over time.
- Audit Applications: Retrieve historical transaction data for auditing and compliance.
- Monitoring Systems: Create tools to track network metrics.
6 changes: 6 additions & 0 deletions docs/data/galexie/admin_guide/README.mdx
@@ -0,0 +1,6 @@
---
title: Admin Guide
sidebar_position: 15
---

This guide provides step-by-step instructions for installing and running Galexie.
47 changes: 47 additions & 0 deletions docs/data/galexie/admin_guide/configuring.mdx
@@ -0,0 +1,47 @@
---
title: Configuring
sidebar_position: 20
---


# Configuring

## Steps to Configure Galexie

1. **Copy the Sample Configuration**

Start with the provided sample file, [`config.example.toml`](https://github.com/stellar/go/blob/master/services/galexie/config.example.toml).

2. **Rename and Update the Configuration**

Rename the file to `config.toml` and adjust settings as needed.

- **Key Settings Include:**
- **Google Cloud Storage (GCS) Bucket**

Specify the GCS bucket where Galexie will export Stellar ledger data. Update `destination_bucket_path` to the complete path of your GCS bucket, including subpaths if applicable.

```toml
destination_bucket_path = "stellar-network-data/testnet"
```

- **Stellar Network**

Set the Stellar network to be used in creating the data lake.

```toml
network = "testnet"
```

- **Data Organization (Optional)**

Configure how the exported data is organized in the GCS bucket. The example below stores 64 ledgers in each file and groups the files into partitions (directories) of 1000 files each.

```toml
# Number of ledgers stored in each file
ledgers_per_file = 64

# Number of files per partition/directory
files_per_partition = 1000
```
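The arithmetic behind these two settings can be sketched in shell. This is an illustration only: it assumes that file and partition boundaries are aligned to multiples of `ledgers_per_file` and of `ledgers_per_file * files_per_partition`, which this guide does not state, and `ledger_bounds` is a hypothetical helper, not a Galexie command.

```shell
#!/bin/sh
# Values taken from the example configuration above.
ledgers_per_file=64
files_per_partition=1000

# Hypothetical helper: prints the first ledger of the file and of the
# partition that a given ledger would fall into, ASSUMING boundaries are
# aligned to multiples of the configured sizes (an assumption).
ledger_bounds() {
  ledger=$1
  ledgers_per_partition=$((ledgers_per_file * files_per_partition))
  file_start=$((ledger / ledgers_per_file * ledgers_per_file))
  partition_start=$((ledger / ledgers_per_partition * ledgers_per_partition))
  echo "file_start=$file_start partition_start=$partition_start"
}

ledger_bounds 350000   # prints: file_start=349952 partition_start=320000
```

With these values each partition would cover 64 × 1000 = 64,000 ledgers.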

12 changes: 12 additions & 0 deletions docs/data/galexie/admin_guide/installing.mdx
@@ -0,0 +1,12 @@
---
title: Installing
sidebar_position: 30
---

# Installing

Review comment (Contributor): Should we add a link to the github readme instructions for installing/building locally?

Reply (Contributor Author): Good point. I don't think we currently have instructions for building it locally but it might be worth adding them.

To install Galexie, retrieve the Docker image from the [Stellar Docker Hub registry](https://hub.docker.com/r/stellar/stellar-galexie) using the following command:

```shell
docker pull stellar/stellar-galexie
```
6 changes: 6 additions & 0 deletions docs/data/galexie/admin_guide/monitoring.mdx
@@ -0,0 +1,6 @@
---
title: Monitoring
sidebar_position: 50
---

# Monitoring
23 changes: 23 additions & 0 deletions docs/data/galexie/admin_guide/prerequisites.mdx
@@ -0,0 +1,23 @@
---
title: Prerequisites
sidebar_position: 10
---

# Prerequisites

### 1. Google Cloud Platform (GCP) Account

Galexie exports Stellar ledger metadata to Google Cloud Storage (GCS), so you need a GCP account with:

- Permissions to create a new GCS bucket, or
- Access to an existing bucket with read/write permissions.

### 2. Docker (Recommended)

> **_NOTE:_** While it is possible to install Galexie natively (without Docker), this requires manual dependency management and is recommended only for advanced users.

Galexie is available as a Docker image, which simplifies installation and setup. Ensure you have Docker Engine installed on your system ([Docker installation guide](https://docs.docker.com/engine/install/)).

## Hardware Requirements

The minimum hardware requirements for running Galexie are:
103 changes: 103 additions & 0 deletions docs/data/galexie/admin_guide/running.mdx
@@ -0,0 +1,103 @@
---
title: Running
sidebar_position: 40
---

# Running

With the Docker image available and the configuration file set up, you're now ready to run Galexie and start exporting Stellar ledger data to the GCS bucket.

## Command Line Usage

The primary way of running Galexie is using the `append` command.

### Append

Using the `append` command, Galexie can either continuously monitor the network for new ledgers and export them, or export a fixed ledger range and stop once that range has been exported.

Syntax:

```shell
stellar-galexie append --start <start_ledger> [--end <end_ledger>] [--config-file <config_file>]
```

Arguments:

`--start <start_ledger>` **(required)**

- The starting ledger sequence number of the range being exported.

`--end <end_ledger>` **(optional)**

- The ending ledger sequence number of the range being exported. If unspecified or set to 0, the exporter will continuously export new ledgers as they appear on the network.

`--config-file <config_file>` **(optional)**

- The path to the configuration file. If unspecified, the application will look for a file named `config.toml` in the current directory.

Example usage:

```shell
docker run --platform linux/amd64 -d \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v "${PWD}/config.toml":/config.toml \
stellar/stellar-galexie \
append --start 350000 --end 450000 --config-file config.toml
```

`--platform linux/amd64`

- Specifies the platform architecture (adjust if needed for your system).

`-v` Mounts volumes to map your local GCP credentials and config.toml file to the container:

- `$HOME/.config/gcloud/application_default_credentials.json`: Your local GCP credentials file.
- `${PWD}/config.toml`: Your local configuration file.

`-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json`

- Sets the environment variable for credentials within the container.

`stellar/stellar-galexie`

- The Docker image name.
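As a rough sketch of how the `--start` and `--end` rules described above might be checked before launching the container, the following shell helper builds the `append` arguments. The helper names `is_uint` and `galexie_append_args` are hypothetical and not part of Galexie, and the check that the end ledger is not smaller than the start ledger is an assumption about sensible input rather than documented behavior.

```shell
#!/bin/sh
# Returns success if the argument is a non-negative integer.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical wrapper: prints the arguments for `append`.
# An end of 0 (or no end) means "keep exporting new ledgers",
# as described for --end above.
galexie_append_args() {
  start=$1
  end=${2:-0}
  if ! is_uint "$start" || ! is_uint "$end"; then
    echo "ledger sequence numbers must be non-negative integers" >&2
    return 1
  fi
  # Assumed sanity check: a bounded range should not end before it starts.
  if [ "$end" -ne 0 ] && [ "$end" -lt "$start" ]; then
    echo "end ledger must not be smaller than start ledger" >&2
    return 1
  fi
  if [ "$end" -eq 0 ]; then
    echo "append --start $start"
  else
    echo "append --start $start --end $end"
  fi
}

galexie_append_args 350000 450000   # prints: append --start 350000 --end 450000
```

The printed arguments could then take the place of the literal `append --start 350000 --end 450000` in the `docker run` example above.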

#### Resumability

The `append` command includes built-in resumability, allowing exports to continue seamlessly after an interruption. If Galexie is stopped mid-export, it will scan for the first missing ledger after the specified starting ledger upon restart. Exporting will then resume from that missing ledger, with no manual adjustment needed. To utilize resumability, simply restart Galexie with the same starting ledger, and it will pick up right where it left off.
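The resume rule can be modeled with a plain list of ledger numbers. The sketch below is illustrative only: `first_missing_ledger` is a hypothetical function, the already-exported ledgers are passed as sorted arguments, and Galexie's actual scan works against files in the bucket rather than a list like this.

```shell
#!/bin/sh
# Hypothetical model of the resume rule: given a start ledger and a sorted
# (ascending) list of ledgers already exported, print the first ledger at or
# after the start that is missing.
first_missing_ledger() {
  start=$1
  shift
  next=$start
  for exported in "$@"; do
    if [ "$exported" -lt "$next" ]; then
      continue            # before the range of interest
    fi
    if [ "$exported" -gt "$next" ]; then
      break               # found a gap at $next
    fi
    next=$((next + 1))    # $next is present, check the following ledger
  done
  echo "$next"
}

first_missing_ledger 100 100 101 102 104 105   # prints: 103
```

In this model, restarting with the same start ledger (100) resumes at 103 without any manual adjustment.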

### Scan-and-fill

While the `append` command is efficient, it may miss data gaps if there are multiple non-sequential gaps in the range. For more thorough verification, the `scan-and-fill` command provides a slower but comprehensive alternative, scanning a specified ledger range to locate and fill any gaps, ensuring data completeness. Due to its slower execution, `scan-and-fill` should be used sparingly and only when data gaps are suspected.
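To illustrate the difference in coverage, the sketch below lists every missing ledger in a range instead of stopping at the first one. It is a toy model under stated assumptions: `missing_ledgers` is a hypothetical function operating on a list of ledger numbers, not the implementation of `scan-and-fill`.

```shell
#!/bin/sh
# Hypothetical model of a full scan: print every ledger in [start, end]
# that does not appear in the list of exported ledgers.
missing_ledgers() {
  start=$1
  end=$2
  shift 2
  n=$start
  while [ "$n" -le "$end" ]; do
    found=0
    for exported in "$@"; do
      if [ "$exported" -eq "$n" ]; then
        found=1
        break
      fi
    done
    if [ "$found" -eq 0 ]; then
      printf '%s\n' "$n"
    fi
    n=$((n + 1))
  done
}

# Two separate gaps (103 and 106) are both reported, one per line.
missing_ledgers 100 108 100 101 102 104 105 107 108
```

Checking every ledger in the range is what makes this kind of scan slower than looking only for the first gap.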

Syntax:

```shell
stellar-galexie scan-and-fill --start <start_ledger> --end <end_ledger> [--config-file <config_file>]
```

Arguments:

`--start <start_ledger>` **(required)**

- The starting ledger sequence number of the range being scanned and filled.

`--end <end_ledger>` **(required)**

- The ending ledger sequence number of the range being scanned and filled.

`--config-file <config_file>` **(optional)**

- The path to the configuration file. If unspecified, the exporter will look for a file named `config.toml` in the current directory.

Example usage:

```shell
docker run --platform linux/amd64 -d \
-v "$HOME/.config/gcloud/application_default_credentials.json":/.config/gcp/credentials.json:ro \
-e GOOGLE_APPLICATION_CREDENTIALS=/.config/gcp/credentials.json \
-v "${PWD}/config.toml":/config.toml \
stellar/stellar-galexie \
scan-and-fill --start 64000 --end 68000 --config-file config.toml
```
22 changes: 22 additions & 0 deletions docs/data/galexie/admin_guide/setup.mdx
@@ -0,0 +1,22 @@
---
title: Setup
sidebar_position: 10
---

# Setup

### Google Cloud Platform (GCP) credentials

Create application default credentials for your GCP project using your user account:

1. Download the [SDK](https://cloud.google.com/sdk/docs/install).
2. Install and initialize the [gcloud CLI](https://cloud.google.com/sdk/docs/initializing).
3. Create [application default credentials](https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp). They should be stored automatically at `$HOME/.config/gcloud/application_default_credentials.json`.
4. Verify that this file exists before moving on to the next step.
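The verification in step 4 could be scripted along these lines. This is a minimal sketch: `adc_file_exists` is a hypothetical helper, and it only checks that the file is present and readable, not that the credentials inside are valid.

```shell
#!/bin/sh
# Hypothetical helper: checks that the credentials file exists and is readable.
# Defaults to the location named in step 3; pass another path to override it.
adc_file_exists() {
  path=${1:-"$HOME/.config/gcloud/application_default_credentials.json"}
  if [ -f "$path" ] && [ -r "$path" ]; then
    echo "found: $path"
  else
    echo "missing or unreadable: $path" >&2
    return 1
  fi
}
```

Calling `adc_file_exists` with no arguments checks the default location and returns a non-zero status if the file is not there.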

### Google Cloud Storage (GCS) bucket

If you already have a GCS bucket with read and write permissions, you can skip this section. If not, follow these steps:

1. Visit the [Storage section of the GCP Console](https://console.cloud.google.com/storage) and create a new bucket.
2. Choose a descriptive name for the bucket, such as `stellar-ledger-data`. Refer to the [Google Cloud Storage bucket naming guidelines](https://cloud.google.com/storage/docs/buckets#naming) for naming conventions. Note down the bucket name; you will need it later during configuration.
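A candidate bucket name can be sanity-checked locally before creating the bucket. The sketch below covers only a subset of the naming guidelines linked above (length of 3 to 63 characters, allowed characters, and starting and ending with a letter or digit); `is_valid_bucket_name` is a hypothetical helper, and the linked guidelines remain the authority.

```shell
#!/bin/sh
# Use the C locale so that a-z matches only lowercase ASCII letters.
LC_ALL=C
export LC_ALL

# Hypothetical helper: partial check of a GCS bucket name.
is_valid_bucket_name() {
  name=$1
  len=${#name}
  # Simplified length rule: 3 to 63 characters.
  [ "$len" -ge 3 ] && [ "$len" -le 63 ] || return 1
  case "$name" in
    # Only lowercase letters, digits, dots, underscores, and dashes.
    *[!a-z0-9._-]*) return 1 ;;
    # Must start and end with a letter or digit.
    [!a-z0-9]*|*[!a-z0-9]) return 1 ;;
  esac
  return 0
}

if is_valid_bucket_name "stellar-ledger-data"; then
  echo "bucket name looks valid"
fi
```

A name that passes this check can still be rejected by GCS, for example because it is already taken or uses a reserved prefix.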
5 changes: 5 additions & 0 deletions docusaurus.config.ts
@@ -164,6 +164,11 @@ const config: Config = {
docId: "data/horizon/README",
label: "Horizon",
},
{
type: 'doc',
docId: "data/galexie/README",
label: "Galexie",
},

]
},
Binary file added static/assets/galexie-architecture.png