A base32 ⇌ CJK cipher using Chinese Commercial Code

mondain-dev/ccc-cipher

CCC Cipher

A base32 ⇌ CJK cipher using the Chinese Commercial Code (CCC), aka the Chinese Telegraph Code.

See demo: https://mondain-dev.github.io/ccc-cipher-js/

Usage

There are two components in this cipher:

  • Base32CJK.py encodes base32 strings into CJK strings, and decodes CJK strings back into the original base32. Note that the encoding does not work for base64.
  • Encryption-Decryption scripts
    • cipher.sh encrypts any file (binary or otherwise) using AES-256 with PBKDF2 implemented in openssl, and encodes the encrypted file into CJK characters.
    • decipher.sh decrypts a file encrypted by cipher.sh.

Base32CJK.py

Base32CJK.py [OPTIONS]
  OPTIONS:
  -i, --in=:       input file, use stdin if not supplied
  -o, --out=:      output file
  -e, --encoding:  encode the input
  -d, --decoding:  decode the input
  -h, --help:      print this help

Encode-Decode example

For encoding a random binary file plain.bin:

cat /dev/urandom | fold -w ${1:-20} | head -n 20 > plain.bin
base32 plain.bin | python Base32CJK.py -e -o encoded.CJK.txt

For decoding:

python Base32CJK.py -d -i encoded.CJK.txt | base32 -d > decoded.bin

To verify:

diff plain.bin decoded.bin

Compress-Decompress example

To compress a random binary file plain.bin:

cat /dev/urandom | fold -w ${1:-20} | head -n 20 > plain.bin
gzip -c plain.bin | base32 | python Base32CJK.py -e -o compressed.CJK.txt

For decompression:

python Base32CJK.py -d -i compressed.CJK.txt | base32 -d | gunzip -d > decompressed.bin

To verify:

diff plain.bin decompressed.bin

Encryption-Decryption Scripts

We provide scripts cipher.sh and decipher.sh to encrypt and decrypt using AES-256 with PBKDF2 implemented in openssl.

cipher.sh <encrypted file>
decipher.sh <encrypted CJK file>
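As background on what "AES-256 with PBKDF2" means, the following is a minimal sketch of how a password can be stretched into a 256-bit key with PBKDF2. It is not taken from cipher.sh: the function name `derive_key`, the hash (SHA-256), the iteration count, and the 8-byte salt are all assumptions chosen for illustration, and the key it produces is not guaranteed to match what openssl derives inside the scripts.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 10000) -> bytes:
    """Stretch a password into a 32-byte (256-bit) key using PBKDF2-HMAC.

    The hash, iteration count, and salt length are illustrative
    assumptions, not values read from cipher.sh.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(8)  # a fresh random salt for each encryption (assumed size)
key = derive_key("example password", salt)
print(len(key) * 8)  # key length in bits
```

The same password and salt always yield the same key, which is why decryption only needs the password plus the salt stored alongside the ciphertext.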

Example

For encryption:

./cipher.sh encrypted.txt

This will prompt you to enter and confirm a password, then to input the text to be encrypted. If encrypted.txt already exists, you will be asked for the password used to encrypt it; any new plaintext is then appended to the previous contents, and the result is encrypted again.

The encrypted text is saved in base32 to encrypted.txt, and in CJK-encoded form to encrypted.CJK.txt.

To decrypt the encrypted.CJK.txt:

./decipher.sh encrypted.CJK.txt

This will show the decrypted information on the screen, but will not write any unencrypted data to the hard disk.

Decoding by Hand

This cipher is based on the Chinese Commercial Code, and more specifically on a subset of the code that represents a consensus of the current versions used in Mainland China, Hong Kong, Macau, and Taiwan, while allowing for variation between traditional and simplified characters. We avoid any CCC code points that may encode different characters in different code books. It is therefore possible to decode a CJK text without Base32CJK.py if any code book is at hand. However, to encode base32 into CJK, the consensus subset of CCC is required, which may not be available in the code book being used.

Decoding Algorithm

For a given encoded CJK character, we first look it up in the code book to find its CCC code point $c$, and compute the residue $p\equiv c \bmod 1056$ with $p\in\mathbb{Z}$ and $0 \le p < 1056$, where $1056=32\cdot32+32$. If $p<1024$, we decode $p$ as two quintets (5-bit base32 symbols), $\lfloor p/32 \rfloor$ followed by $p \bmod 32$; otherwise we interpret $p \bmod 1024$ as a single quintet.
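The rule above can be sketched in a few lines of Python. This is an illustration rather than the code in Base32CJK.py: the function name `decode_code_point` is hypothetical, the code book lookup from character to code point is left out, and the standard RFC 4648 base32 alphabet is assumed (it agrees with the worked examples in this README).

```python
# Standard base32 alphabet (RFC 4648); assumed here, index = quintet value.
B32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def decode_code_point(c: int) -> str:
    """Map one CCC code point to one or two base32 symbols."""
    p = c % 1056  # 1056 = 32*32 + 32 residue classes
    if p < 1024:
        # Two quintets: high quintet first, then low quintet.
        high, low = divmod(p, 32)
        return B32_ALPHABET[high] + B32_ALPHABET[low]
    # Residues 1024..1055 carry a single quintet.
    return B32_ALPHABET[p - 1024]


print(decode_code_point(2960))  # 2Q
print(decode_code_point(1027))  # D
```

The two code points used here, 2960 and 1027, are the ones from the Examples section of this README.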

This scheme covers all double- and single-quintet strings in base32. The consensus CCC is inadequate to cover double- and single-sextets (6-bit) in base64 under the same scheme.
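The counting behind this remark can be written out as follows. The base64 figure is our own arithmetic under the assumption that the same construction (one residue class per symbol pair and per single symbol) would be reused; it is not stated in the original text.

```latex
\underbrace{32 \cdot 32}_{\text{quintet pairs}} + \underbrace{32}_{\text{single quintets}} = 1056,
\qquad
\underbrace{64 \cdot 64}_{\text{sextet pairs}} + \underbrace{64}_{\text{single sextets}} = 4160.
```

A base64 variant would therefore need at least one unambiguous consensus code point in each of 4160 residue classes, roughly four times as many as the base32 scheme requires.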

Examples

To decode the CJK character 歌, we first identify its CCC code point $c=2960\equiv 848 \bmod1056$. Then we decode $p=848=26\cdot32+16$ as two 5-bit code points $26,16$, corresponding to the two base32 symbols 2Q.

$ echo 歌 | python Base32CJK.py -d
2Q======

In another example, we decode the CJK character 堡, which has CCC code point $c=1027$, so $p=1027 \ge 1024$. Hence we decode $p=1027\equiv3\bmod1024$ as the single base32 symbol D.

$ echo 堡 | python Base32CJK.py -d
D
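Going the other way, each base32 symbol pair or single symbol corresponds to a whole residue class of code points, which is one way to see why encoding needs the consensus subset while decoding does not. The sketch below lists the code points in that class. The helper names `residue` and `candidate_code_points` are hypothetical, the RFC 4648 alphabet is assumed, and so is the four-digit code range 0000 to 9999; which candidate (if any) belongs to the consensus subset is not determined here.

```python
# Standard base32 alphabet (RFC 4648); assumed here, index = quintet value.
B32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def residue(symbols: str) -> int:
    """Residue p (mod 1056) for one or two base32 symbols."""
    values = [B32_ALPHABET.index(s) for s in symbols]
    if len(values) == 2:
        return values[0] * 32 + values[1]  # 0..1023: a pair of quintets
    if len(values) == 1:
        return 1024 + values[0]            # 1024..1055: a single quintet
    raise ValueError("expected one or two base32 symbols")


def candidate_code_points(symbols: str, max_code: int = 9999) -> list:
    """All code points up to max_code (assumed bound) that decode to symbols."""
    return list(range(residue(symbols), max_code + 1, 1056))


print(candidate_code_points("2Q"))
# [848, 1904, 2960, 4016, 5072, 6128, 7184, 8240, 9296]
```

The code point 2960 from the first example appears in the list for 2Q; an encoder must pick, among such candidates, one that every code book maps to the same character.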
