CLP Core

CLP's core is the low-level component that performs compression, decompression, and search.

Contents

  • Requirements
  • Building
  • Running
  • Parallel Compression

Requirements

  • We have built and tested CLP on Ubuntu 18.04 (bionic) and Ubuntu 20.04 (focal).
    • If you have trouble building for another OS, file an issue and we may be able to help.
  • A compiler that supports C++14

Building

  • To build, we require some source dependencies, packages from package managers, and libraries built from source.

Source Dependencies

We use both git submodules and third-party source packages. To download all, you can run this script:

tools/scripts/deps-download/download-all.sh

This will download all the required git submodules and third-party source packages.

Environment

A handful of packages and libraries are required to build CLP. There are two options to use them:

  • Install them on your machine and build CLP natively
  • Build CLP within a prebuilt Docker container that contains the libraries; however, this won't work if you need additional libraries that aren't already in the container.

Native Environment

Packages

If you're using apt-get, you can use the following command to install all of them:

sudo apt-get install -y ca-certificates checkinstall cmake build-essential \
libboost-filesystem-dev libboost-iostreams-dev libboost-program-options-dev \
libssl-dev pkg-config rsync wget zlib1g-dev

This will install:

  • ca-certificates
  • checkinstall
  • cmake
  • build-essential
  • libboost-filesystem-dev
  • libboost-iostreams-dev
  • libboost-program-options-dev
  • libssl-dev
  • pkg-config
  • rsync
  • wget
  • zlib1g-dev

Libraries

The latest versions of some packages are not offered by apt repositories, so we've included some scripts to download, compile, and install them:

./tools/scripts/lib_install/fmtlib.sh 8.0.1
./tools/scripts/lib_install/libarchive.sh 3.5.1
./tools/scripts/lib_install/lz4.sh 1.8.2
./tools/scripts/lib_install/mariadb-connector-c.sh 3.2.3
./tools/scripts/lib_install/spdlog.sh 1.9.2
./tools/scripts/lib_install/zstandard.sh 1.4.9

Docker Environment

You can use these commands to start a container in which you can build and run CLP:

# Make sure to change /path/to/clp/components/core and /path/to/my/logs below
docker run --rm -it \
  --name 'clp-build-env' \
  -u$(id -u):$(id -g) \
  -v$(readlink -f /path/to/clp/components/core):/mnt/clp \
  -v$(readlink -f /path/to/my/logs):/mnt/logs \
  ghcr.io/y-scope/clp/clp-core-dependencies-x86-ubuntu-focal:main \
  /bin/bash

cd /mnt/clp

Make sure to change /path/to/clp/components/core and /path/to/my/logs to the relevant paths on your machine.
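The `-u$(id -u):$(id -g)` flags run the container as your host user, so files the build writes into the mounted directories stay owned by you rather than root, and `readlink -f` resolves each path to the absolute form that Docker's `-v` bind mounts require. You can inspect what these substitutions expand to before running the container:

```shell
# Print the values substituted into the docker run command above.
# `id -u` and `id -g` are your numeric user and group IDs.
echo "uid:gid = $(id -u):$(id -g)"
# `readlink -f` canonicalizes a path to an absolute one.
readlink -f .
```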

Build

  • Configure the CMake project:

    mkdir build
    cd build
    cmake ../
  • Build:

    make

Running

  • CLP contains two executables: clp and clg
    • clp is used for compressing and extracting logs
    • clg is used for performing wildcard searches on the compressed logs

clp

To compress some logs:

./clp c archives-dir /home/my/logs
  • archives-dir is where compressed logs should be output
    • clp will create a number of files and directories within, so it's best if this directory is empty
    • You can use the same directory repeatedly and clp will add to the compressed logs within.
  • /home/my/logs is any log file or directory containing log files

To decompress those logs:

./clp x archives-dir decompressed
  • archives-dir is where the compressed logs were previously stored
  • decompressed is a directory where they will be decompressed to

You can also decompress a specific file:

./clp x archives-dir decompressed /my/file/path.log
  • /my/file/path.log is the uncompressed file's path (the one that was passed to clp for compression)

More usage instructions can be found by running:

./clp --help

clg

To search the compressed logs:

./clg archives-dir " a *wildcard* search phrase "
  • archives-dir is where the compressed logs were previously stored
  • The search phrase can contain the * wildcard which matches 0 or more characters, or the ? wildcard which matches any single character.
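As an illustration of these wildcard semantics, plain shell glob matching follows the same two rules (`*` matches zero or more characters, `?` matches exactly one). The sketch below uses a `case` pattern, not clg itself, and the log line is made up:

```shell
# Shell globbing mirrors the wildcard rules described above:
# '*' matches zero or more characters; '?' matches exactly one.
line="task foo-failed-badly after 3 retries"
case "$line" in
  "task "*"failed"*" after "?" retries") echo "match" ;;      # → match
  *) echo "no match" ;;
esac
```

Here the first `*` absorbs `foo-`, the second absorbs `-badly`, and `?` matches the single character `3`.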

Similar to clp, clg can search a single file:

./clg archives-dir " a *wildcard* search phrase " /my/file/path.log
  • /my/file/path.log is the uncompressed file's path (the one that was passed to clp for compression)

More usage instructions can be found by running:

./clg --help

make-dictionaries-readable

If you'd like to convert the dictionaries of an individual archive into a human-readable form, you can use make-dictionaries-readable.

./make-dictionaries-readable archive-path <output dir>
  • archive-path is a path to a specific archive (inside archives-dir)

See the make-dictionaries-readable README for details on the output format.

Parallel Compression

By default, clp uses an embedded SQLite database, so each directory containing archives can only be accessed by a single clp instance.

To enable parallel compression to the same archives directory, clp/clg can be configured to use a MySQL-type database (MariaDB) as follows:

  • Install and configure MariaDB using the instructions for your platform
  • Create a user that has privileges to create databases, create tables, insert records, and delete records.
  • Copy and change config/metadata-db.yml, setting the type to mysql and uncommenting the MySQL parameters.
  • Install the mariadb and PyYAML Python packages: pip3 install mariadb PyYAML
    • This is necessary to run the database initialization script. If you prefer, you can run the SQL statements in tools/scripts/db/init-db.py directly.
  • Run tools/scripts/db/init-db.py with the updated config file. This will initialize the database CLP requires.
  • Run clp or clg as before, with the addition of the --db-config-file option pointing at the updated config file.
  • To compress in parallel, simply run another instance of clp concurrently.
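For reference, a MySQL-type metadata database config for the steps above might look like the sketch below. The exact key names are assumptions, not taken from the actual file; treat the comments in config/metadata-db.yml itself as the source of truth:

```yaml
# Hypothetical sketch of a MySQL-backed metadata DB config; the real
# config/metadata-db.yml may use different key names.
type: "mysql"         # switched from the default SQLite setting
host: "127.0.0.1"     # MariaDB server address
port: 3306
username: "clp-user"  # the user created with the privileges listed above
password: "..."
name: "clp-db"        # database that init-db.py will initialize
```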

Note that currently, decompression (clp x) and search (clg) can only be run with a single instance. We are in the process of open-sourcing parallelizable versions of these as well.