Improve devcontainer experience
- [x] Switch to `docker-compose.yml` instead of `Dockerfile` for dev container setup,
- [x] Sets up minio s3 emulator via docker container,
- [x] Add `CONTRIBUTING.md`.
aykut-bozkurt committed Nov 9, 2024
1 parent 451f347 commit 0e368ae
Showing 15 changed files with 207 additions and 118 deletions.
11 changes: 11 additions & 0 deletions .devcontainer/.env
@@ -0,0 +1,11 @@
# S3 tests
AWS_ACCESS_KEY_ID=minioadmin
AWS_SECRET_ACCESS_KEY=minioadmin
AWS_REGION=us-east-1
AWS_S3_TEST_BUCKET=testbucket
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadmin

# Others
RUST_TEST_THREADS=1
PG_PARQUET_TEST=true
56 changes: 21 additions & 35 deletions .devcontainer/Dockerfile
@@ -6,10 +6,11 @@ ENV TZ="Europe/Istanbul"
ARG PG_MAJOR=17

# install deps
RUN apt-get update && apt-get -y install build-essential libreadline-dev zlib1g-dev \
flex bison libxml2-dev libxslt-dev libssl-dev \
libxml2-utils xsltproc ccache pkg-config wget \
curl lsb-release sudo nano net-tools git awscli
RUN apt-get update && apt-get -y install build-essential libreadline-dev zlib1g-dev \
flex bison libxml2-dev libxslt-dev libssl-dev \
libxml2-utils xsltproc ccache pkg-config wget \
curl lsb-release ca-certificates gnupg sudo git \
nano net-tools awscli

# install Postgres
RUN sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
@@ -19,34 +20,20 @@ RUN apt-get update && apt-get -y install postgresql-${PG_MAJOR}-postgis-3 \
postgresql-client-${PG_MAJOR} \
libpq-dev

# download and install MinIO server and client
RUN wget https://dl.min.io/server/minio/release/linux-amd64/minio
RUN chmod +x minio
RUN mv minio /usr/local/bin/minio
# set up permissions so that rust user can create extensions
RUN chmod a+rwx `pg_config --pkglibdir` \
`pg_config --sharedir`/extension \
/var/run/postgresql/

# download and install MinIO admin
RUN wget https://dl.min.io/client/mc/release/linux-amd64/mc
RUN chmod +x mc
RUN mv mc /usr/local/bin/mc

# set up pgrx with non-sudo user
# initdb requires non-root user. This will also be the user that runs the container.
ARG USERNAME=rust
ARG USER_UID=501
ARG USER_GID=$USER_UID
RUN groupadd --gid $USER_GID $USERNAME \
&& useradd --uid $USER_UID --gid $USER_GID -s /bin/bash -m $USERNAME

RUN mkdir /workspaces && chown -R $USER_UID:$USER_GID /workspaces
ARG USER_UID=1000
ARG USER_GID=1000
RUN groupadd --gid $USER_GID $USERNAME
RUN useradd --uid $USER_UID --gid $USER_GID -s /bin/bash -m $USERNAME

# set up permissions so that the user below can create extensions
RUN chmod a+rwx `pg_config --pkglibdir` \
`pg_config --sharedir`/extension \
/var/run/postgresql/
RUN echo "$USERNAME ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/$USERNAME

# add it to sudoers
RUN echo "$USERNAME ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers.d/$USERNAME

# now it is time to switch to user
USER $USERNAME

# install Rust environment
@@ -59,10 +46,9 @@ RUN cargo install --locked cargo-pgrx@${PGRX_VERSION}
RUN cargo pgrx init --pg${PG_MAJOR} $(which pg_config)
RUN echo "shared_preload_libraries = 'pg_parquet'" >> $HOME/.pgrx/data-${PG_MAJOR}/postgresql.conf

ENV MINIO_ROOT_USER=admin
ENV MINIO_ROOT_PASSWORD=admin123
ENV AWS_S3_TEST_BUCKET=testbucket
ENV AWS_REGION=us-east-1
ENV AWS_ACCESS_KEY_ID=admin
ENV AWS_SECRET_ACCESS_KEY=admin123
ENV PG_PARQUET_TEST=true
# required for pgrx to work
ENV USER=$USERNAME

# git completion
RUN curl -o ~/.git-completion.bash https://raw.githubusercontent.com/git/git/master/contrib/completion/git-completion.bash
RUN echo "source ~/.git-completion.bash" >> ~/.bashrc
3 changes: 3 additions & 0 deletions .devcontainer/create-test-buckets.sh
@@ -0,0 +1,3 @@
#!/bin/bash

aws --endpoint-url http://localhost:9000 s3 mb s3://$AWS_S3_TEST_BUCKET
18 changes: 7 additions & 11 deletions .devcontainer/devcontainer.json
@@ -1,7 +1,10 @@
{
"build": {
"dockerfile": "Dockerfile"
},
"name": "pg_parquet Dev Environment",
"dockerComposeFile": "docker-compose.yml",
"service": "app",
"workspaceFolder": "/workspace",
"postStartCommand": "bash .devcontainer/create-test-buckets.sh",
"postAttachCommand": "sudo chown -R rust /workspace",
"customizations": {
"vscode": {
"extensions": [
@@ -14,12 +17,5 @@
"henriiik.docker-linter"
]
}
},
"postStartCommand": "bash .devcontainer/scripts/setup-minio.sh",
"forwardPorts": [
5432
],
"capAdd": [
"SYS_PTRACE"
]
}
}
32 changes: 32 additions & 0 deletions .devcontainer/docker-compose.yml
@@ -0,0 +1,32 @@
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    command: sleep infinity
    network_mode: host
    volumes:
      - ..:/workspace
      - ${USERPROFILE}${HOME}/.ssh:/home/rust/.ssh:ro
      - ${USERPROFILE}${HOME}/.ssh/known_hosts:/home/rust/.ssh/known_hosts:rw
      - ${USERPROFILE}${HOME}/.gitconfig:/home/rust/.gitconfig:ro
      - ${USERPROFILE}${HOME}/.aws:/home/rust/.aws:ro
    env_file:
      - .env
    cap_add:
      - SYS_PTRACE
    depends_on:
      - minio

  minio:
    image: minio/minio
    env_file:
      - .env
    network_mode: host
    command: server /data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "http://localhost:9000"]
      interval: 6s
      timeout: 2s
      retries: 3
7 changes: 0 additions & 7 deletions .devcontainer/scripts/setup-minio.sh

This file was deleted.

5 changes: 0 additions & 5 deletions .env_sample

This file was deleted.

59 changes: 20 additions & 39 deletions .github/workflows/ci.yml
@@ -65,28 +65,25 @@ jobs:
path: ${{ env.SCCACHE_DIR }}
key: pg_parquet-sccache-cache-${{ runner.os }}-${{ hashFiles('Cargo.lock', '.github/workflows/ci.yml') }}

- name: Export environment variables from .env file
uses: falti/dotenv-action@v1
with:
path: .devcontainer/.env
export_variables: true

- name: Install PostgreSQL
run: |
sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc ccache pkg-config
sudo apt-get -y install build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev \
libxslt-dev libssl-dev libxml2-utils xsltproc ccache pkg-config \
gnupg ca-certificates
sudo apt-get -y install postgresql-${{ env.PG_MAJOR }}-postgis-3 \
postgresql-server-dev-${{ env.PG_MAJOR }} \
postgresql-client-${{ env.PG_MAJOR }} \
libpq-dev
- name: Install MinIO
run: |
# Download and install MinIO server and client
wget https://dl.min.io/server/minio/release/linux-amd64/minio
chmod +x minio
mv minio /usr/local/bin/minio
# Download and install MinIO admin
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
mv mc /usr/local/bin/mc
- name: Install and configure pgrx
run: |
@@ -101,48 +98,32 @@ jobs:
cargo fmt --all -- --check
cargo clippy --all-targets --features "pg${{ env.PG_MAJOR }}, pg_test" --no-default-features -- -D warnings
- name: Run tests
- name: Set up permissions for PostgreSQL
run: |
# Set up permissions so that the current user below can create extensions
sudo chmod a+rwx $(pg_config --pkglibdir) \
$(pg_config --sharedir)/extension \
/var/run/postgresql/
# pgrx tests with runas argument ignores environment variables, so
# we read env vars from .env file in tests (https://github.com/pgcentralfoundation/pgrx/pull/1674)
touch /tmp/.env
echo AWS_ACCESS_KEY_ID=${{ env.AWS_ACCESS_KEY_ID }} >> /tmp/.env
echo AWS_SECRET_ACCESS_KEY=${{ env.AWS_SECRET_ACCESS_KEY }} >> /tmp/.env
echo AWS_S3_TEST_BUCKET=${{ env.AWS_S3_TEST_BUCKET }} >> /tmp/.env
echo AWS_REGION=${{ env.AWS_REGION }} >> /tmp/.env
echo PG_PARQUET_TEST=${{ env.PG_PARQUET_TEST }} >> /tmp/.env
- name: Start Minio for s3 emulator tests
run: |
docker run -d --env-file .devcontainer/.env --net=host minio/minio server /data
# Start MinIO server
export MINIO_ROOT_USER=${{ env.AWS_ACCESS_KEY_ID }}
export MINIO_ROOT_PASSWORD=${{ env.AWS_SECRET_ACCESS_KEY }}
minio server /tmp/minio-storage > /dev/null 2>&1 &
while ! nc -z localhost 9000; do
echo "Waiting for localhost:9000..."
sleep 1
done
# Set access key and create test bucket
mc alias set local http://localhost:9000 ${{ env.AWS_ACCESS_KEY_ID }} ${{ env.AWS_SECRET_ACCESS_KEY }}
aws --endpoint-url http://localhost:9000 s3 mb s3://${{ env.AWS_S3_TEST_BUCKET }}
aws --endpoint-url http://localhost:9000 s3 mb s3://$AWS_S3_TEST_BUCKET
- name: Run tests
run: |
# Run tests with coverage tool
source <(cargo llvm-cov show-env --export-prefix)
cargo llvm-cov clean
cargo build --features "pg${{ env.PG_MAJOR }}, pg_test" --no-default-features
cargo pgrx test pg${{ env.PG_MAJOR }} --no-default-features
cargo llvm-cov report --lcov > lcov.info
# Stop MinIO server
pkill -9 minio
env:
RUST_TEST_THREADS: 1
AWS_ACCESS_KEY_ID: test_secret_access_key
AWS_SECRET_ACCESS_KEY: test_access_key_id
AWS_REGION: us-east-1
AWS_S3_TEST_BUCKET: testbucket
PG_PARQUET_TEST: true

- name: Upload coverage report to Codecov
if: ${{ env.PG_MAJOR == 17 }}
uses: codecov/codecov-action@v4
1 change: 0 additions & 1 deletion .gitignore
@@ -12,5 +12,4 @@
*.lcov
*.xml
lcov.info
.env
playground.rs
5 changes: 4 additions & 1 deletion .vscode/settings.json
@@ -4,4 +4,7 @@
"rust-analyzer.check.command": "clippy",
"rust-analyzer.checkOnSave": true,
"editor.inlayHints.enabled": "offUnlessPressed",
}
"files.watcherExclude": {
"**/target/**": true
}
}
109 changes: 109 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,109 @@
`pg_parquet` is an open source project primarily authored and
maintained by the team at Crunchy Data. All contributions are welcome. `pg_parquet` uses the PostgreSQL license and does not require any contributor agreement to submit patches.

Our contributors try to follow good software development practices to help
ensure that the code that we ship to our users is stable. If you wish to
contribute to pg_parquet in any way, please follow the guidelines below.

Thanks! We look forward to your contribution.

# General Contributing Guidelines

All ongoing development for an upcoming release gets committed to the
**`main`** branch. The `main` branch technically serves as the "development"
branch as well, but all code that is committed to the `main` branch should be
considered _stable_, even if it is part of an ongoing release cycle.

- Ensure any changes are clear and well-documented:
  - The most helpful code comments explain why, establish context, or efficiently summarize how. Avoid simply repeating details from declarations. When in doubt, favor overexplaining to underexplaining.
  - Do not submit commented-out code. If the code does not need to be used
    anymore, please remove it.
  - While `TODO` comments are frowned upon, every now and then it is ok to put a `TODO` to note that a particular section of code needs to be worked on in the future. However, it is also known that "TODOs" often do not get worked on, and as such, it is more likely you will be asked to complete the TODO at the time you submit it.
- Make sure to add tests which cover the code lines that are introduced by your changes. See [testing](#testing).
- Make sure to [format and lint](#format-and-lint) the code.
- Ensure your commits are atomic. Each commit tells a story about what changes
are being made. This makes it easier to identify when a bug is introduced into
the codebase, and as such makes it easier to fix.
- All commits must either be rebased in atomic order or squashed (if the squashed
commit is considered atomic). Merge commits are not accepted. All conflicts must be resolved prior to pushing changes.
- **All pull requests should be made from the `main` branch.**

# Commit Messages

Commit messages should be as descriptive as possible and should follow the general format:

```
A one-sentence summary of what the commit is.

Further details of the commit message go in here. Try to be as descriptive as
possible as to what the changes are. Good things to include:

- What the change is.
- Why the change was made.
- What to expect now that the change is in place.
- Any advice that can be helpful if someone needs to review this commit and
  understand.
```

If you wish to tag a GitHub issue or another project management tracker, please
do so at the bottom of the commit message, and make it clearly labeled like so:

```
Issue: CrunchyData/pg_parquet#12
```

# Submitting Pull Requests

All work should be made in your own repository fork. When you believe your work
is ready to be committed, submit a pull request.

## Upcoming Features

Ongoing work for new features should occur in branches off of the `main`
branch.

# Start Local Environment

There are two ways to set up a local environment for contributing to pg_parquet:
- [Installation From Source](#installation-from-source)
- [Devcontainer](#devcontainer)

## Installation From Source

See [README.md](README.md#installation-from-source).

## Devcontainer

If you want a ready-to-use container environment, you can try our
[devcontainer](.devcontainer/devcontainer.json). If you use the
[VS Code editor](https://code.visualstudio.com), you can open the pg_parquet project
inside the devcontainer. Please see [how to start the devcontainer](https://code.visualstudio.com/docs/devcontainers/containers).

# Postgres Support Matrix

You can find the currently supported Postgres versions in [README.md](README.md#postgres-support-matrix).
Supported Postgres versions are defined as Cargo feature flags in [Cargo.toml](Cargo.toml).
By default, pg_parquet is built with the latest supported Postgres version flag enabled.
If you want to build pg_parquet against another Postgres version, you can do it
by specifying the feature flag explicitly like `cargo pgrx build --features "pg16"`.
The same applies for running a session via `cargo pgrx run pg16` or
testing via `cargo pgrx test pg16`.
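
As a minimal sketch of the paragraph above, the helpers below pin one Postgres version and reuse it for running a session and for testing. The names `PG_MAJOR`, `PG_FEATURE`, `run_session`, and `run_tests` are hypothetical and exist only for this illustration; they are not part of the repository.

```shell
# Hypothetical helper variable: the Postgres major version to target.
PG_MAJOR=16
PG_FEATURE="pg${PG_MAJOR}"

# Start a session against the pgrx-managed Postgres of that version.
run_session() {
  cargo pgrx run "${PG_FEATURE}"
}

# Run the tests for that version with a single test thread.
# Extra arguments (such as a test name) are passed through unchanged.
run_tests() {
  RUST_TEST_THREADS=1 cargo pgrx test "${PG_FEATURE}" "$@"
}
```

After sourcing the snippet, `run_tests` would run the suite against the selected version. Changing `PG_MAJOR` in one place keeps the two commands consistent.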

# Testing

We run `RUST_TEST_THREADS=1 cargo pgrx test` to run all our unit tests. To
run a specific test, pass a test name pattern, for example:
`cargo pgrx test pg17 test_numeric_*`.

> [!WARNING]
> Make sure to pass `RUST_TEST_THREADS=1` as an environment variable to `cargo pgrx test`.

Object storage tests are integration tests which require a storage emulator running
locally. See [ci.yml](.github/workflows/ci.yml) to see how local storage emulators
are started. You can also find the required environment variables in the
[.env file](.devcontainer/.env).
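
The emulator setup can be sketched locally as follows. This is an illustration under assumptions, not the project's official script: the `docker run` and `aws ... s3 mb` lines are taken from ci.yml, while `wait_for_port` and `start_s3_emulator` are hypothetical helper names. The sketch assumes `docker`, `nc`, and the `aws` CLI are installed and that it is run from the repository root.

```shell
# Hypothetical helper: wait until host:port accepts TCP connections.
# Returns 0 once the port is open, 1 after the given number of attempts.
wait_for_port() {
  wfp_host="$1"
  wfp_port="$2"
  wfp_attempts="${3:-30}"
  wfp_i=0
  while [ "$wfp_i" -lt "$wfp_attempts" ]; do
    if nc -z "$wfp_host" "$wfp_port" 2>/dev/null; then
      return 0
    fi
    echo "Waiting for ${wfp_host}:${wfp_port}..."
    sleep 1
    wfp_i=$((wfp_i + 1))
  done
  return 1
}

# Hypothetical helper: start MinIO and create the test bucket.
start_s3_emulator() {
  # Export every variable defined in the devcontainer .env file
  # so that the aws CLI picks up the credentials and bucket name.
  set -a
  . .devcontainer/.env
  set +a

  # Same commands as in ci.yml, with a bounded wait in between.
  docker run -d --env-file .devcontainer/.env --net=host minio/minio server /data
  wait_for_port localhost 9000 || return 1
  aws --endpoint-url http://localhost:9000 s3 mb "s3://${AWS_S3_TEST_BUCKET}"
}
```

Calling `start_s3_emulator` once before `cargo pgrx test` should be enough. The bounded wait avoids creating the bucket before MinIO is listening on port 9000.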

# Format and Lint

We use `cargo-fmt` as the formatter and `cargo-clippy` as the linter. You can see
how we run them in [ci.yml](.github/workflows/ci.yml).
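
For convenience, the two checks from ci.yml can be wrapped in a small helper. The sketch below copies the `cargo fmt` and `cargo clippy` invocations from the workflow; `PG_MAJOR`, `LINT_FEATURES`, and `format_and_lint` are hypothetical names introduced for this example only.

```shell
# Hypothetical helper variable: the Postgres major version to lint against.
PG_MAJOR=17
LINT_FEATURES="pg${PG_MAJOR}, pg_test"

# Check formatting without modifying files, then run clippy
# with all warnings treated as errors, as the CI workflow does.
format_and_lint() {
  cargo fmt --all -- --check &&
    cargo clippy --all-targets --features "${LINT_FEATURES}" --no-default-features -- -D warnings
}
```

Running `format_and_lint` before pushing should catch the same formatting and lint failures that CI would report.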