A base32 ⇌ CJK cipher using the Chinese Commercial Code (CCC), aka the Chinese Telegraph Code.
See demo: https://mondain-dev.github.io/ccc-cipher-js/
There are two components in this cipher:
- Base32CJK.py encodes base32 strings into CJK strings, and decodes the original base32 from CJK text. Note that the encoding does not work for base64.
- Encryption-decryption scripts
Base32CJK.py [OPTIONS]
OPTIONS:
-i, --in=: input file, use stdin if not supplied
-o, --out=: output file
-e, --encoding: encode the input
-d, --decoding: decode the input
-h, --help: print this help
To encode a random binary file plain.bin:
cat /dev/urandom | fold -w ${1:-20} | head -n 20 > plain.bin
base32 plain.bin | python Base32CJK.py -e -o encoded.CJK.txt
For decoding:
python Base32CJK.py -d -i encoded.CJK.txt | base32 -d > decoded.bin
To verify:
diff plain.bin decoded.bin
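The base32 step of this round trip can be mirrored in Python for readers who want to see what is fed to Base32CJK.py. This is a minimal sketch using only the standard library; the variable names are illustrative, and it does not call Base32CJK.py itself. It also shows the `=` padding that base32 adds when the input is not a multiple of 5 bytes.

```python
import base64
import os

# Stand-in for plain.bin: 400 random bytes (illustrative size).
plain = os.urandom(400)

# Equivalent of `base32 plain.bin`, without the CLI's line wrapping.
b32_text = base64.b32encode(plain).decode("ascii")

# Equivalent of `base32 -d`, followed by the `diff` check.
restored = base64.b32decode(b32_text)
assert restored == plain

# A single byte yields two base32 symbols plus six padding characters.
short = base64.b32encode(b"\xd4").decode("ascii")
print(short)  # 2Q======
```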
To compress and encode a random binary file plain.bin:
cat /dev/urandom | fold -w ${1:-20} | head -n 20 > plain.bin
gzip -c plain.bin | base32 | python Base32CJK.py -e -o compressed.CJK.txt
For decoding:
python Base32CJK.py -d -i compressed.CJK.txt | base32 -d | gunzip -d > decompressed.bin
To verify:
diff plain.bin decompressed.bin
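The order of operations in the pipeline above (compress first, then base32) can be sketched in Python with the standard library. This is only an illustration of the gzip and base32 stages; the CJK stage is omitted and the sample payloads are made up. Note that random data such as plain.bin generally does not shrink under gzip, whereas repetitive text does.

```python
import base64
import gzip
import os

def pack(data: bytes) -> str:
    """gzip, then base32: mirrors `gzip -c | base32`."""
    return base64.b32encode(gzip.compress(data)).decode("ascii")

def unpack(b32_text: str) -> bytes:
    """base32-decode, then gunzip: mirrors `base32 -d | gunzip`."""
    return gzip.decompress(base64.b32decode(b32_text))

random_payload = os.urandom(400)          # stand-in for plain.bin
text_payload = b"ccc-cipher " * 100       # repetitive, compresses well

assert unpack(pack(random_payload)) == random_payload
assert unpack(pack(text_payload)) == text_payload

# Repetitive input shrinks; this is where compression pays off.
print(len(text_payload), len(gzip.compress(text_payload)))
```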
We provide scripts cipher.sh and decipher.sh to encrypt and decrypt using AES-256 with PBKDF2 implemented in openssl.
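For readers unfamiliar with PBKDF2, the key-derivation step can be sketched with Python's `hashlib`. The parameters below (SHA-256, 10,000 iterations, an 8-byte salt, and a 32-byte key plus 16-byte IV) are assumptions chosen for illustration; the values actually used are whatever cipher.sh passes to openssl, which this README does not state. The function name `derive_key_iv` is hypothetical.

```python
import hashlib
import os

def derive_key_iv(password: str, salt: bytes, iterations: int = 10_000):
    """Stretch a password into an AES-256 key and an IV (assumed sizes)."""
    material = hashlib.pbkdf2_hmac(
        "sha256",                  # assumed digest
        password.encode("utf-8"),
        salt,
        iterations,                # assumed iteration count
        dklen=48,                  # 32-byte key + 16-byte IV
    )
    return material[:32], material[32:]

salt = os.urandom(8)               # assumed salt length
key, iv = derive_key_iv("correct horse", salt)
print(len(key), len(iv))  # 32 16
```

The same password and salt always produce the same key, which is why the salt has to be stored alongside the ciphertext for decryption to work.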
cipher.sh <encrypted file>
decipher.sh <encrypted CJK file>
For encryption:
./cipher.sh encrypted.txt
This will prompt you to enter and confirm the password, then input the text to be encrypted. If encrypted.txt already exists, you will be asked for the password used to encrypt it; any new plaintext is appended to the end of the previous content, and the result is then encrypted.
The encrypted text in base32 will be saved to encrypted.txt, and the CJK-encoded text to encrypted.CJK.txt.
To decrypt encrypted.CJK.txt:
./decipher.sh encrypted.CJK.txt
This will show the decrypted information on the screen, but will not write any unencrypted data to disk.
This cipher is based on the Chinese Commercial Code, and more specifically on a subset of the code that represents a consensus of the current versions used in Mainland China, Hong Kong, Macau, and Taiwan, while allowing for the variation between traditional and simplified characters. We avoid any CCC code points that may encode different characters depending on the code book used. It is therefore possible to decode a CJK text without Base32CJK.py if any code book is at hand. However, to encode a base32 string into CJK, the consensus of CCC is required, which may not be available in the code book being used.
For a given encoded CJK character, we first look it up in the code book to find its CCC code point. Depending on the code point, we interpret it either as a pair of base32 symbols or as a single base32 symbol.
This scheme covers all double- and single-quintet (5-bit) strings in base32, that is, 32 × 32 + 32 = 1056 values. Covering double- and single-sextets (6-bit) in base64 under this scheme would take 64 × 64 + 64 = 4160 values, for which the consensus CCC is inadequate.
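The double- and single-quintet idea can be sketched in Python as follows. This is not the mapping used by Base32CJK.py: the table here is a hypothetical stand-in built from 1056 consecutive code points starting at U+4E00, whereas the real tool uses the consensus CCC code book. The index layout (pairs first, then singles) and the padding behaviour on decode are likewise assumptions made for illustration.

```python
B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
PAIRS = 32 * 32  # number of two-symbol combinations

# Hypothetical table, NOT the CCC code book: 1024 pair slots + 32 single slots.
TABLE = [chr(0x4E00 + i) for i in range(PAIRS + 32)]
INDEX = {ch: i for i, ch in enumerate(TABLE)}

def encode(b32: str) -> str:
    """Map base32 symbols to characters, two symbols per character."""
    symbols = b32.strip().rstrip("=")
    out = []
    for i in range(0, len(symbols) - 1, 2):
        hi, lo = B32.index(symbols[i]), B32.index(symbols[i + 1])
        out.append(TABLE[hi * 32 + lo])
    if len(symbols) % 2:  # odd length: last symbol gets a single-quintet slot
        out.append(TABLE[PAIRS + B32.index(symbols[-1])])
    return "".join(out)

def decode(cjk: str) -> str:
    """Invert encode() and restore '=' padding to a multiple of 8."""
    symbols = []
    for ch in cjk.strip():
        n = INDEX[ch]
        if n < PAIRS:
            symbols.append(B32[n // 32] + B32[n % 32])
        else:
            symbols.append(B32[n - PAIRS])
    text = "".join(symbols)
    return text + "=" * (-len(text) % 8)

cjk = encode("MFRGG===")        # 5 symbols: two pairs and one single
assert len(cjk) == 3
assert decode(cjk) == "MFRGG==="
```

Because every pair and every single symbol has its own slot, any base32 string can be encoded regardless of whether its unpadded length is even or odd.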
To decode the CJK character 歌, we first identify its CCC code point, which corresponds to the base32 symbols 2Q.
$ echo 歌 | python Base32CJK.py -d
2Q======
In another example, we decode the CJK character 堡, whose CCC code point corresponds to the single base32 symbol D.
$ echo 堡 | python Base32CJK.py -d
D