Tool & library for binary data processing according to JSON-formatted rules

VladimirTakeda/bpatch

 
 

bpatch

PURPOSE: Suppose you need to modify a file and replace several byte sequences with other byte sequences. Furthermore, imagine you want these modification rules to be formalized in a human-readable format, so they're easy to edit. If that's the case, then this tool is exactly what you need.

Domains where bpatch is likely to be useful

    Encryption: For example, you could create two mirror sets of replacement rules: one converts any text into a hard-to-read set of unique binary sequences, and the other converts it back. This way, only you and your intended recipient would be able to read your messages. For more details, contact the author
    Data migration; Data transformation: If you understand how your data should be processed and what needs replacing, then bpatch is your solution. If you find any functionality missing, don't hesitate to contact the author
    Forensic Analysis and Cybersecurity: bpatch can be used to analyze and transform binary data, which is often crucial in these fields
    Software development; Testing; Gaming Industry (binary data transformations): If you need to manipulate binary data frequently and want to automate these changes, bpatch could simplify this process
    Automated Build and Deployment Pipelines: bpatch can be part of Continuous Integration/Continuous Deployment (CI/CD) pipelines to handle the transformation of binary data as part of the build or deployment process
    Software development (using bpatch as a static library): Developers can integrate bpatch as a static library into their applications to handle binary data flows
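
The "mirror rules" idea from the Encryption item can be illustrated with a small C++ sketch. It is not bpatch code: `RuleTable` and `mirror` are hypothetical names, and the sketch only shows how a backward rule set could be derived from a forward one by swapping each source with its replacement.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical rule table: source byte sequence -> replacement byte sequence.
using RuleTable = std::map<std::string, std::string>;

// Builds the mirror table by swapping source and replacement.
// Throws if two sources share one replacement, because such rules
// cannot be undone unambiguously.
RuleTable mirror(const RuleTable& forward)
{
    RuleTable backward;
    for (const auto& [from, to] : forward)
    {
        if (!backward.emplace(to, from).second)
        {
            throw std::invalid_argument("replacement sequences must be unique");
        }
    }
    return backward;
}
```

Unique replacements are necessary for a round trip but may not be sufficient: if one replacement is a prefix of another, decoding can still be ambiguous, so real rule sets need to be designed with that in mind.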

MNEMONIC: binary stream, substitution, sequential substitution, encryption, compression, binary data conversion, Test, Testing, Continuous integration, Cybersecurity, Data migration, Data transformation, Encryption, Binary data transformation, json parsing, C++ code style, C++ 20, application architecture, virtual inheritance, data flow

Description

bpatch is an application designed to transform a file's data based on easily readable and editable replacement rules. These rules are written in JSON. bpatch provides sequential, parallel, and mixed replacement modes to cover a wide range of data manipulation needs. See bpatch_json.md for details on the currently supported features
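
To make the idea of rule-based replacement concrete, here is a minimal C++ sketch of a sequential pass, where the output of one rule becomes the input of the next. It is an illustration under assumptions, not the bpatch implementation: `Rule` and `applySequential` are hypothetical names, and the exact semantics of the sequential, parallel, and mixed modes are the ones defined in bpatch_json.md, not the ones shown here.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical rule type: replace every occurrence of `first` with `second`.
using Rule = std::pair<std::string, std::string>;

// Applies the rules one after another: the output of rule N is the input
// of rule N + 1. std::string is used as a byte container, so embedded
// zero bytes are handled as ordinary data.
std::string applySequential(std::string data, const std::vector<Rule>& rules)
{
    for (const auto& [from, to] : rules)
    {
        if (from.empty())
        {
            continue; // an empty pattern would match everywhere
        }
        std::string result;
        std::size_t pos = 0;
        for (;;)
        {
            const std::size_t hit = data.find(from, pos);
            if (hit == std::string::npos)
            {
                result.append(data, pos, std::string::npos); // copy the tail
                break;
            }
            result.append(data, pos, hit - pos); // copy bytes before the match
            result += to;
            pos = hit + from.size();
        }
        data = std::move(result);
    }
    return data;
}
```

For example, applying the rules `ab -> x` and then `xc -> Y` to `abcabc` first produces `xcxc` and then `YY`, which shows how a later rule can act on the output of an earlier one.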

File Descriptions

The bpatch package provides the following groups of files:

Entry point

| File | Description |
| --- | --- |
| bpatch.cpp | Contains the entry point to the bpatch application. |

Library with the logic

| File | Description |
| --- | --- |
| actionscollection.h, actionscollection.cpp | The ActionsCollection class serves as the main entry point for processing. It houses the JSON parser callback for loading settings, as well as the pipeline for binary lexeme processing. |
| binarylexeme.h, binarylexeme.cpp | The AbstractBinaryLexeme class encapsulates data, regulates access to the data, and offers a modification method and static creators. |
| bpatchfolders.h, bpatchfolders.cpp | Access to the names of the Actions and Binary Patterns folders. |
| coloredconsole.h, coloredconsole.cpp | Templated wrapper for console text output: ERROR in red and Warning in yellow. |
| consoleparametersreader.h, consoleparametersreader.cpp | The ConsoleParametersReader class parses console parameters, stores settings for processing, and maintains the 'manual' text (found in consoleparametersreader.cpp). |
| dictionary.h, dictionary.cpp | The Dictionary class stores binary lexemes for processing and allows name-based access. |
| dictionarykeywords.h, dictionarykeywords.cpp | String views with keywords for JSON parsing. |
| fileprocessing.h, fileprocessing.cpp | Interfaces for file read/write operations and the classes that inherit from them. |
| flexiblecache.h, flexiblecache.cpp | Data accumulation using a linked list with chunk-based allocation. |
| jsonparser.h, jsonparser.cpp | JSON parsing methods, parsing classes, and a callback class for simplified JSON reading. |
| processing.h, processing.cpp | The library entry point. It handles parameter processing, settings reading, file handling, and data streaming to the processing engine. |
| stdafx.h, stdafx.cpp | Precompiled library header that includes the standard headers. |
| streamreplacer.h, streamreplacer.cpp | Interface of a replacement chain. |
| timemeasurer.h, timemeasurer.cpp | The TimeMeasurer class allows for nanosecond time measurement between named program points. |
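
The flexiblecache entry mentions data accumulation using a linked list with chunk-based allocation. The following C++ sketch shows that general technique only. `ChunkedBuffer`, its methods, and the chunk size are hypothetical and are not taken from flexiblecache.h.

```cpp
#include <array>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical sketch of chunk-based accumulation; names are illustrative.
class ChunkedBuffer
{
public:
    static constexpr std::size_t kChunkSize = 4096;

    // Appends one byte, allocating a new chunk only when the last one is full.
    void push(char c)
    {
        if (chunks_.empty() || usedInLast_ == kChunkSize)
        {
            chunks_.emplace_back();
            usedInLast_ = 0;
        }
        chunks_.back()[usedInLast_++] = c;
    }

    // Number of bytes accumulated so far.
    std::size_t size() const
    {
        return chunks_.empty() ? 0 : (chunks_.size() - 1) * kChunkSize + usedInLast_;
    }

    // Copies the accumulated bytes into one contiguous string.
    std::string flatten() const
    {
        std::string out;
        out.reserve(size());
        std::size_t remaining = size();
        for (const auto& chunk : chunks_)
        {
            const std::size_t n = remaining < kChunkSize ? remaining : kChunkSize;
            out.append(chunk.data(), n);
            remaining -= n;
        }
        return out;
    }

private:
    std::list<std::array<char, kChunkSize>> chunks_;
    std::size_t usedInLast_ = 0;
};
```

Compared with one growing contiguous buffer, this layout never moves bytes that were already stored. The trade-off is that the data is no longer contiguous until it is flattened.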

wildcharacters library

| File | Description |
| --- | --- |
| wildcharacters.h, wildcharacters.cpp | Static library that supports the wildcard characters '*' and '?' in the parameters of the console application. |
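
As an illustration of what '*' and '?' usually mean in a file mask, here is a small C++ matcher. It is a sketch under the common convention that '?' stands for exactly one character and '*' for any run of characters, including none. The function name `matchesMask` is hypothetical, and the wildcharacters library may implement matching and file listing differently.

```cpp
#include <cstddef>
#include <string_view>

// Returns true when `name` matches `mask`, where '?' stands for exactly one
// character and '*' for any run of characters (including an empty run).
bool matchesMask(std::string_view mask, std::string_view name)
{
    std::size_t m = 0;
    std::size_t n = 0;
    std::size_t starMask = std::string_view::npos; // mask position after the last '*'
    std::size_t starName = 0;                       // name position tried for that '*'

    while (n < name.size())
    {
        if (m < mask.size() && mask[m] == '*')
        {
            starMask = ++m; // remember where to resume after the '*'
            starName = n;   // initially let the '*' match nothing
        }
        else if (m < mask.size() && (mask[m] == '?' || mask[m] == name[n]))
        {
            ++m;
            ++n;
        }
        else if (starMask != std::string_view::npos)
        {
            m = starMask;   // backtrack: let the last '*' absorb one more character
            n = ++starName;
        }
        else
        {
            return false;
        }
    }
    while (m < mask.size() && mask[m] == '*')
    {
        ++m; // trailing '*' characters match the empty remainder
    }
    return m == mask.size();
}
```

With this convention, the mask `*.bin` matches `data.bin`, and `file?.txt` matches `file1.txt` but not `file10.txt`.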

Unit Tests files

| File | Description |
| --- | --- |
| test.cpp | Contains all unit tests, organized into groups. |
| pch.h, pch.cpp | Precompiled header file for the unit tests project. |

Integration Tests files

| File | Description |
| --- | --- |
| in_tests.cmd | For Windows. In a console, run in_tests.cmd with the full path to bpatch, for example: in_tests.cmd C:\path\to\bpatch.exe. Verify that the console output contains no errors. |
| in_tests.sh | For Linux. In a console, run in_tests.sh with the full path to bpatch, for example: ./in_tests.sh /bin/bpatch/bpatch. Verify that the console output contains no errors. |

The integration tests use the following files: choicereplace.expected, decimal.expected, hexadecimal.expected, mixed.expected, text.expected, withbin.expected, choicereplace.test, decimal.test, hexadecimal.test, mixed.test, text.test, withbin.test, bin1.data, bin2.data, bin3.data, bin4.data, bin5.data.

Feel free to use the following JSON files as samples when creating your own scripts: choicereplace.json, decimal.json, hexadecimal.json, mixed.json, text.json, withbin.json, tohex.json, fromhex.json

Building

  1. There are two scripts you can use to rebuild bpatch: rebuild.cmd and rebuild.sh

    For rebuild.cmd, you must have MSVS 2022 installed;

    For rebuild.cmd, the environment variable VS170COMNTOOLS must be present in the system. You can run the developer console from MSVS 2022 to get it. Also, ensure that you have installed C++ CMake tools for Windows in the individual components;

    Run rebuild.cmd Debug to build in Debug mode. Release mode is built by default;

    Run rebuild.sh Debug to build in Debug mode. Release mode is built by default;

    NOTE: For both scripts and both modes, the result folders are recreated from scratch with every build

  2. The minimum version of cmake required for this project is 3.19

  3. Follow the typical cmake build process if you would like to build manually

  4. When you specify the folder in which to generate the MSVS solution or to build under Linux, cmake should automatically download GoogleTest

  5. If you are unfamiliar with cmake, you can also refer to the tips provided at the end of the CMakeLists.txt file

Unit Tests

The unit test files are kept in a separate project, which you can find in the 'testbpatch' subfolder

The unit tests primarily cover the application's main functionality. Some of these tests are specifically designed to verify the custom-built JSON parser. Note that this JSON parser does not strictly adhere to all JSON rules. For instance, it ignores Unicode sequences, and it does not treat newline characters within strings as errors

Integration Tests

Integration tests for the program are implemented as scripts. All required auxiliary files can be found in the 'IntegrationTests' folder. Scripts are provided for both Windows (in_tests.cmd) and Linux (in_tests.sh). Run the tests in the console, passing the name of the bpatch executable to the script as a parameter. Verify that the console output contains no errors

Architectural Diagrams

Diamond diagram for file IO classes

classDiagram
    direction LR
    class Writer {
        <<interface>>
    }

    %% Relationships
    FileProcessing <|-- ReadFileProcessing : inherits
    ReadFileProcessing <|.. Reader : implemented in
    FileProcessing <|-- WriteFileProcessing : inherits
    WriteFileProcessing <|-- ReadWriteFileProcessing : inherits
    ReadFileProcessing <|-- ReadWriteFileProcessing : inherits
    WriteFileProcessing <|.. Writer : implemented in

    class Reader {
        <<interface>>
    }

    %% Styling and notes
    style Reader fill:#AAffAA,stroke:#000000
    style Writer fill:#ffffAA,stroke:#000000
    style ReadFileProcessing stroke:#AAffAA,stroke-width:3px
    style WriteFileProcessing stroke:#ffffAA,stroke-width:3px
    note for FileProcessing "Owns file descriptor\nand closes it"
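
The diamond in the diagram above can be expressed in C++ with virtual inheritance, so that ReadWriteFileProcessing contains a single FileProcessing subobject. The sketch below reuses the class names from the diagram, but everything else is an assumption: the method names `ReadByte`, `WriteByte`, and `Storage` are hypothetical, and a `std::string` stands in for the file descriptor to keep the example self-contained.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Interfaces; the method names are hypothetical, not taken from bpatch.
struct Reader
{
    virtual ~Reader() = default;
    virtual bool ReadByte(char& out) = 0;
};

struct Writer
{
    virtual ~Writer() = default;
    virtual void WriteByte(char in) = 0;
};

// Shared base: stands in for the class that owns the file descriptor.
class FileProcessing
{
public:
    explicit FileProcessing(std::string content) : storage_(std::move(content)) {}
    virtual ~FileProcessing() = default;
    const std::string& Storage() const { return storage_; }

protected:
    std::string storage_;
};

class ReadFileProcessing : public virtual FileProcessing, public Reader
{
public:
    explicit ReadFileProcessing(std::string content) : FileProcessing(std::move(content)) {}

    bool ReadByte(char& out) override
    {
        if (readPos_ >= storage_.size())
        {
            return false; // nothing left to read
        }
        out = storage_[readPos_++];
        return true;
    }

private:
    std::size_t readPos_ = 0;
};

class WriteFileProcessing : public virtual FileProcessing, public Writer
{
public:
    explicit WriteFileProcessing(std::string content = {}) : FileProcessing(std::move(content)) {}

    void WriteByte(char in) override { storage_.push_back(in); }
};

// Virtual inheritance leaves exactly one FileProcessing subobject here, so
// reads and writes share the same storage. The most derived class is the one
// that initializes the virtual base.
class ReadWriteFileProcessing : public ReadFileProcessing, public WriteFileProcessing
{
public:
    explicit ReadWriteFileProcessing(std::string content)
        : FileProcessing(std::move(content))
        , ReadFileProcessing(std::string{})
        , WriteFileProcessing(std::string{})
    {
    }
};
```

Without the `virtual` keyword on both intermediate classes, ReadWriteFileProcessing would hold two separate FileProcessing subobjects, and a call such as `Storage()` would be ambiguous.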

Usage of IO classes

---
title: IO Processing needs 2 interfaces Reader and Writer
---
stateDiagram-v2
    direction LR
    state if_state <<choice>>
    ioffraw : Is One File For Read And Write
    [*] --> ioffraw
    ioffraw --> if_state
    if_state --> ReadWriteFileProcessing: Yes
    if_state --> ReadFileProcessing : No
    if_state --> WriteFileProcessing : No
    sc : Separate Instances
    state sc {
        direction LR
        ReadFileProcessing
        WriteFileProcessing
    }
    ReadFileProcessing --> Reader
    ReadWriteFileProcessing --> Reader
    WriteFileProcessing --> Writer
    ReadWriteFileProcessing --> Writer

    classDef clReader fill:#AAffAA,stroke:#000000
    classDef clWriter fill:#ffffAA,stroke:#000000
    classDef clReadFileProcessing stroke:#AAffAA,stroke-width:3px
    classDef clWriteFileProcessing stroke:#ffffAA,stroke-width:3px

    class Reader clReader
    class Writer clWriter
    class ReadFileProcessing clReadFileProcessing
    class WriteFileProcessing clWriteFileProcessing

Sequence of the data processing

stateDiagram-v2
    direction LR
    lw : Loading Wildcharacters
    [*] --> lw
    state lw {
        direction LR
        fbm : List files by mask '?' and '*'
    }
    IterationF : For every found file
    fbm --> IterationF

    cr : create Reader
    cw : create Writer
    IterationF --> cr
    IterationF --> cw
    note left of IterationF
        Separate Reader and Writer for every file
    end note

    lojp : Loading of JSON Parser
    loaf : Loading of Actions
    [*] --> lojp
    lojp --> loaf

    coac : Creation of Actions Collection
    note right of coac
        Reusable instance;
        Can be used separately as a data stream
        modifier that is configured by JSON
    end note

    loaf --> coac
    drrw : Do Read Replace Write
    coac --> drrw
    cr --> drrw
    cw --> drrw

    drrw --> [*]

    classDef clReader fill:#AAffAA,stroke:#000000
    classDef clWriter fill:#ffffAA,stroke:#000000
    classDef acReplacer fill:#0ff0ff,font-weight:bold,stroke-width:2px

    class cr clReader
    class cw clWriter
    class drrw acReplacer

Interfaces of the data Processing

classDiagram
    class ActionsCollection {
        void DoReplacements(const char toProcess, const bool aEod)
        void SetNextReplacer(std::unique_ptr~StreamReplacer~&& pNext)
    }

    note for ActionsCollection "Description:
         1. ActionsCollection is a TJsonCallBack for creation of the Dictionary
            - a set of pairs: source lexeme + result lexeme

         2. ActionsCollection is a StreamReplacer
         It holds a chain of StreamReplacers. The last StreamReplacer must have a reference
         to the Writer or to the next StreamReplacer

         3. ActionsCollection receives an instance of Writer through the SetNextReplacer method.
         After that it can be used to process any sequence of characters. To complete processing,
         the second parameter of the DoReplacements method must be set to true, while the
         first parameter does not matter

         4. ActionsCollection can be reused to transform any number of byte sequences. It is also
         possible to set another last StreamReplacer instance via the SetNextReplacer method to define
         another data destination"

    TJsonCallBack "returns" .. AbstractBinaryLexeme
    ActionsCollection *-- Dictionary
    Dictionary *-- AbstractBinaryLexeme

    namespace ReadingOfReplaceRules {
        class TJsonCallBack
        class Dictionary
        class AbstractBinaryLexeme
    }

    class Reader {
        <<Interface>>
    }
    class SetNextReplacer {
        <<method>>
    }
    class Writer {
        <<Interface>>
    }
    TJsonCallBack <|-- ActionsCollection

    SetNextReplacer ..> ActionsCollection : Setup replacer\n with Writer
    Reader ..> ActionsCollection : Sends characters into\n DoReplacements method

    ActionsCollection *-- `StreamReplacer Instance 1`
    `StreamReplacer Instance 1` *-- "..." `etc... chain of StreamReplacer`
    `etc... chain of StreamReplacer` *-- StreamReplacer

    namespace StreamReplacersChain {
        class Writer
        class `StreamReplacer Instance 1`
        class `etc... chain of StreamReplacer`
        class StreamReplacer
    }
    Writer --o StreamReplacer

    style Reader fill:#AAffAA,stroke:#000000
    style Writer fill:#ffffAA,stroke:#000000
    style SetNextReplacer fill:#ffffAA,stroke:#000000
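
The chain described in the diagram notes can be sketched in C++ as follows. Only the two method signatures, `DoReplacements(const char, const bool)` and `SetNextReplacer(std::unique_ptr<StreamReplacer>&&)`, come from the diagram. The classes `CollectingSink` and `SequenceReplacer`, their members, and the matching strategy are hypothetical and are not the bpatch implementation.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// DoReplacements and SetNextReplacer mirror the signatures shown in the
// diagram; every other name in this sketch is hypothetical.
class StreamReplacer
{
public:
    virtual ~StreamReplacer() = default;
    virtual void DoReplacements(const char toProcess, const bool aEod) = 0;
    virtual void SetNextReplacer(std::unique_ptr<StreamReplacer>&& pNext) = 0;
};

// Stand-in for the Writer end of the chain: collects bytes into a string.
class CollectingSink final : public StreamReplacer
{
public:
    explicit CollectingSink(std::string& target) : target_(target) {}

    void DoReplacements(const char toProcess, const bool aEod) override
    {
        if (!aEod) // at end of data the character argument carries no payload
        {
            target_.push_back(toProcess);
        }
    }

    void SetNextReplacer(std::unique_ptr<StreamReplacer>&&) override {}

private:
    std::string& target_;
};

// Replaces every occurrence of `from` with `to` in a stream that arrives
// one character at a time.
class SequenceReplacer final : public StreamReplacer
{
public:
    SequenceReplacer(std::string from, std::string to)
        : from_(std::move(from)), to_(std::move(to))
    {
    }

    void DoReplacements(const char toProcess, const bool aEod) override
    {
        if (aEod)
        {
            Forward(pending_.size()); // an unfinished match is plain data
            if (next_) next_->DoReplacements(toProcess, true);
            return;
        }
        pending_.push_back(toProcess);
        // Forward leading bytes until the buffer is a prefix of `from_` again.
        while (!pending_.empty() && from_.compare(0, pending_.size(), pending_) != 0)
        {
            Forward(1);
        }
        if (!from_.empty() && pending_.size() == from_.size())
        {
            pending_.clear(); // full match: emit the replacement instead
            for (const char c : to_)
            {
                if (next_) next_->DoReplacements(c, false);
            }
        }
    }

    void SetNextReplacer(std::unique_ptr<StreamReplacer>&& pNext) override
    {
        next_ = std::move(pNext);
    }

private:
    // Sends the first `count` buffered bytes down the chain unchanged.
    void Forward(const std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i)
        {
            if (next_) next_->DoReplacements(pending_[i], false);
        }
        pending_.erase(0, count);
    }

    std::string from_;
    std::string to_;
    std::string pending_;
    std::unique_ptr<StreamReplacer> next_;
};
```

A caller would build the chain from the last element to the first with SetNextReplacer, feed every input character with `aEod` set to false, and finish with one call where `aEod` is true so that each link flushes what it still holds. Because the buffer is empty after that final call, the same chain can then process another byte sequence.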

Contacts

Feel free to email [email protected] using one of the following subject lines:

  1. bpatch bug found
  2. bpatch improvement
  3. bpatch change request
  4. bpatch feature request
  5. bpatch support request
  6. bpatch collaboration proposal

Copyright

MIT License Copyright <2024> Alexey Zaytsev

Reference

CMakeLists.txt

bpatch.cpp

bpatch_json.md

rebuild.cmd

rebuild.sh


actionscollection.cpp actionscollection.h

binarylexeme.cpp binarylexeme.h

bpatchfolders.cpp bpatchfolders.h

coloredconsole.cpp coloredconsole.h

consoleparametersreader.cpp consoleparametersreader.h

dictionary.cpp dictionary.h

dictionarykeywords.cpp dictionarykeywords.h

fileprocessing.cpp fileprocessing.h

flexiblecache.cpp flexiblecache.h

jsonparser.cpp jsonparser.h

processing.cpp processing.h

stdafx.cpp stdafx.h

streamreplacer.cpp streamreplacer.h

timemeasurer.cpp timemeasurer.h

wildcharacters.cpp wildcharacters.h


pch.cpp pch.h

test.cpp


in_tests.cmd

in_tests.sh


choicereplace.expected decimal.expected hexadecimal.expected mixed.expected text.expected withbin.expected

choicereplace.json decimal.json hexadecimal.json mixed.json text.json withbin.json

choicereplace.test decimal.test hexadecimal.test mixed.test text.test withbin.test

bin1.data bin2.data bin3.data bin4.data bin5.data

tohex.json fromhex.json


JSON
