This library extends the native codecs library (notably to allow adding new custom encodings and character mappings) and provides many new encodings (static or parametrized, like rot or xor), hence its name, which combines CODecs EXTension.
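
To illustrate what adding a custom encoding to the native codecs machinery involves, here is a minimal sketch that uses only the standard library. It does not show codext's own API; the codec name `demo_reverse` and the helper functions are hypothetical and exist only for this example.

```python
import codecs


def _reverse_encode(text, errors="strict"):
    # Codec functions return (output, number of input items consumed).
    return text[::-1], len(text)


def _reverse_decode(text, errors="strict"):
    return text[::-1], len(text)


def _search(name):
    # Hypothetical codec name; hyphens are normalized because recent
    # Python versions convert them to underscores before lookup.
    if name.replace("-", "_") == "demo_reverse":
        return codecs.CodecInfo(
            encode=_reverse_encode,
            decode=_reverse_decode,
            name="demo_reverse",
        )
    return None


# Make the codec discoverable through codecs.encode / codecs.decode.
codecs.register(_search)

encoded = codecs.encode("Test string", "demo_reverse")
decoded = codecs.decode(encoded, "demo_reverse")
print(encoded)  # gnirts tseT
print(decoded)  # Test string
```

Once a search function is registered, the encoding is reachable by name through the usual `codecs.encode` and `codecs.decode` calls, which is the same entry point used in the examples of this README.
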

    $ pip install codext

Note: Some encodings are available in Python 3 only.

    $ codext -i test.txt encode dna-1
    GTGAGCGGGTATGTGA
    $ echo -en "test" | codext encode morse
    - . ... -

Python 3 (includes Ascii85, Base85, Base100 and braille):

    $ echo -en "test" | codext encode braille
    ⠞⠑⠎⠞
    $ echo -en "test" | codext encode base100
    👫👜👪👫

Using codecs chaining:

    $ echo -en "Test string" | codext encode reverse
    gnirts tseT
    $ echo -en "Test string" | codext encode reverse morse
    --. -. .. .-. - ... / - ... . -
    $ echo -en "Test string" | codext encode reverse morse dna-2
    AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC
    $ echo -en "Test string" | codext encode reverse morse dna-2 octal
    101107124103101107124103101107124107101107101101101107124103101107124107101107101101101107124107101107124107101107101101101107124107101107124103101107124107101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124124101107101101101107124103101107101101101107124107101107124107101107124107101107101101101107124107101107101101101107124103
    $ echo -en "AGTCAGTCAGTGAGAAAGTCAGTGAGAAAGTGAGTGAGAAAGTGAGTCAGTGAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTTAGAAAGTCAGAAAGTGAGTGAGTGAGAAAGTGAGAAAGTC" | codext -d dna-2 morse reverse
    test string

Getting the list of available codecs:

    >>> import codext
    >>> codext.list()
    ['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']

Usage examples:

    >>> import codecs
    >>> import codext
    >>> codext.encode("this is a test", "base58-bitcoin")
    'jo91waLQA1NNeBmZKUF'
    >>> codext.encode("this is a test", "base58-ripple")
    'jo9rA2LQwr44eBmZK7E'
    >>> codext.encode("this is a test", "base58-url")
    'JN91Wzkpa1nnDbLyjtf'
    >>> codecs.encode("this is a test", "base100")
    '👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'
    >>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
    'this is a test'
    >>> for i in range(8):
    ...     print(codext.encode("this is a test", "dna-%d" % (i + 1)))
    ...
    GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA
    CTCACGGACGGCCTATAGAACGGCCTATAGAACGACAGAACTCACGCCCTATCTCA
    ACAGATTGATTAACGCGTGGATTAACGCGTGGATGAGTGGACAGATAAACGCACAG
    AGACATTCATTAAGCGCTCCATTAAGCGCTCCATCACTCCAGACATAAAGCGAGAC
    TCTGTAAGTAATTCGCGAGGTAATTCGCGAGGTAGTGAGGTCTGTATTTCGCTCTG
    TGTCTAACTAATTGCGCACCTAATTGCGCACCTACTCACCTGTCTATTTGCGTGTC
    GAGTGCCTGCCGGATATCTTGCCGGATATCTTGCTGTCTTGAGTGCGGGATAGAGT
    CACTCGGTCGGCCATATGTTCGGCCATATGTTCGTCTGTTCACTCGCCCATACACT
    >>> codext.decode("GTGAGCCAGCCGGTATACAAGCCGGTATACAAGCAGACAAGTGAGCGGGTATGTGA", "dna-1")
    'this is a test'
    >>> codecs.encode("this is a test", "morse")
    '- .... .. ... / .. ... / .- / - . ... -'
    >>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
    'this is a test'
    >>> with open("morse.txt", 'w', encoding="morse") as f: f.write("this is a test")
    14
    >>> with open("morse.txt", encoding="morse") as f: f.read()
    'this is a test'
    >>> codext.decode(""" = X : x n r y Y y p a ` n | a o h ` g o z """, "whitespace-after+before")
    'CSC{not_so_invisible}'
    >>> print(codext.encode("An example test string", "baudot-tape"))
    ***.** . ****.** . .** .* . *** .****.**** .** .** . **. * .***. **. ** . **. **. ****. *.****.** .*

| Codec | Conversions | Comment |
|---|---|---|
a1z26 | text <-> alphabet order numbers | keeps words whitespace-separated and uses a custom character separator |
affine | text <-> affine ciphertext | aka Affine Cipher |
ascii85 | text <-> ascii85 encoded text | Python 3 only |
atbash | text <-> Atbash ciphertext | aka Atbash Cipher |
bacon | text <-> Bacon ciphertext | aka Baconian Cipher |
barbie-N | text <-> barbie ciphertext | aka Barbie Typewriter (N belongs to [1, 4]) |
baseXX | text <-> baseXX | see base encodings (incl base32, 36, 45, 58, 62, 63, 64, 91, 100, 122) |
baudot | text <-> Baudot code bits | supports CCITT-1, CCITT-2, EU/FR, ITA1, ITA2, MTK-2 (Python3 only), UK, ... |
bcd | text <-> binary coded decimal text | encodes characters from their (zero-left-padded) ordinals |
braille | text <-> braille symbols | Python 3 only |
citrix | text <-> Citrix CTX1 ciphertext | aka Citrix CTX1 password encoding |
dna | text <-> DNA-N sequence | implements the 8 rules of DNA sequences (N belongs to [1,8]) |
excess3 | text <-> XS3 encoded text | uses Excess-3 (aka Stibitz code) binary encoding to convert characters from their ordinals |
gray | text <-> gray encoded text | aka reflected binary code |
gzip | text <-> Gzip-compressed text | standard Gzip compression/decompression |
html | text <-> HTML entities | implements entities according to this reference |
ipsum | text <-> latin words | aka lorem ipsum |
klopf | text <-> klopf encoded text | Polybius square with trivial alphabetical distribution |
leetspeak | text <-> leetspeak encoded text | based on minimalistic elite speaking rules |
letter-indices | text <-> text with letter indices | encodes consonants and/or vowels with their corresponding indices |
lz77 | text <-> LZ77-compressed text | compresses the given data with the algorithm of Lempel and Ziv of 1977 |
lz78 | text <-> LZ78-compressed text | compresses the given data with the algorithm of Lempel and Ziv of 1978 |
manchester | text <-> manchester encoded text | XORes each bit of the input with 01 |
markdown | markdown --> HTML | unidirectional |
morse | text <-> morse encoded text | uses whitespace as a separator |
navajo | text <-> Navajo | only handles letters (not full words from the Navajo dictionary) |
octal | text <-> octal digits | dummy octal conversion (converts to 3-digits groups) |
ordinal | text <-> ordinal digits | dummy character ordinals conversion (converts to 3-digits groups) |
pkzip_deflate | text <-> deflated text | standard Zip-deflate compression/decompression |
pkzip_bzip2 | text <-> Bzipped text | standard BZip2 compression/decompression |
pkzip_lzma | text <-> LZMA-compressed text | standard LZMA compression/decompression |
radio | text <-> radio words | aka NATO or radio phonetic alphabet |
resistor | text <-> resistor colors | aka resistor color codes |
rot | text <-> rot(N) ciphertext | aka Caesar cipher (N belongs to [1,25]) |
rotate | text <-> N-bits-rotated text | rotates characters by the specified number of bits; Python 3 only |
scytale | text <-> scytale ciphertext | encrypts with L, the number of letters on the rod (L belongs to [1, ∞[) |
shift | text <-> shift(N) ciphertext | shifts ordinals by N (N belongs to [1,255]) |
sms | text <-> phone keystrokes | also called T9 code; uses "-" as a separator for encoding, "-" or "_" or whitespace for decoding |
southpark | text <-> Kenny's language | converts letters to Kenny's language from Southpark (whitespace is also handled) |
tomtom | text <-> tom-tom encoded text | similar to morse, using slashes and backslashes |
url | text <-> URL encoded text | aka URL encoding |
xor | text <-> XOR(N) ciphertext | XOR with a single byte (N belongs to [1,255]) |
whitespace | text <-> whitespaces and tabs | replaces bits with whitespaces and tabs |
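
To make a few of the rows above concrete, the following standalone sketch reimplements four of the simpler conversions (rot, xor, octal and dna-1) in plain Python. These are illustrations of the described behaviour, not codext's implementation; the function names are hypothetical, and the dna-1 mapping is inferred from the `dna-1` example output shown earlier in this README.

```python
import string


def rot(text, n):
    """Caesar shift of ASCII letters by n positions (n in [1, 25])."""
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[n:] + lower[:n] + upper[n:] + upper[:n],
    )
    return text.translate(table)


def xor(data, n):
    """XOR every byte of data with the single byte n (n in [1, 255])."""
    return bytes(b ^ n for b in data)


def octal(text):
    """Write each character's ordinal as a 3-digit octal group (ASCII input assumed)."""
    return "".join(format(ord(c), "03o") for c in text)


# Mapping inferred from the README example: "test" -> GTGAGCGGGTATGTGA.
DNA1 = {"00": "A", "01": "G", "10": "C", "11": "T"}


def dna1(text):
    """Map each pair of bits of the UTF-8 bytes to one nucleotide letter."""
    bits = "".join(format(b, "08b") for b in text.encode())
    return "".join(DNA1[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(rot("Test string", 13))  # Grfg fgevat
print(xor(b"test", 0x20))      # b'TEST'
print(octal("AGTC"))           # 101107124103
print(dna1("test"))            # GTGAGCGGGTATGTGA
```

The other seven dna-N rules would presumably use different bit-pair-to-letter mappings; they are not reproduced here.
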
A few variants are also implemented.
| Codec | Conversions | Comment |
|---|---|---|
baudot-spaced | text <-> Baudot code groups of bits | groups of 5 bits are whitespace-separated |
baudot-tape | text <-> Baudot code tape | outputs a string that looks like a perforated tape |
bcd-extended0 | text <-> BCD-extended text | encodes characters from their (zero-left-padded) ordinals using prefix bits 0000 |
bcd-extended1 | text <-> BCD-extended text | encodes characters from their (zero-left-padded) ordinals using prefix bits 1111 |
manchester-inverted | text <-> manchester encoded text | XORes each bit of the input with 10 |
octal-spaced | text <-> octal digits (whitespace-separated) | dummy octal conversion |
ordinal-spaced | text <-> ordinal digits (whitespace-separated) | dummy character ordinals conversion |
southpark-icase | text <-> Kenny's language | same as southpark but case insensitive |
whitespace_after_before | text <-> lines of whitespaces[letter]whitespaces | encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. "whitespace+2*after-3*before") |
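
The difference between manchester and manchester-inverted, as described in the tables above, can be sketched as follows. This is an illustration under stated assumptions rather than codext's code: the helper names are hypothetical, and representing the result as a string of "0" and "1" characters is an assumption made for readability.

```python
def to_bits(text):
    """Return the UTF-8 bytes of text as a string of '0' and '1'."""
    return "".join(format(b, "08b") for b in text.encode())


def manchester(bits, pattern="01"):
    """XOR each input bit with both bits of pattern, giving two output bits per input bit."""
    out = []
    for b in bits:
        for p in pattern:
            out.append(str(int(b) ^ int(p)))
    return "".join(out)


bits = to_bits("t")            # 01110100
print(manchester(bits))        # 0110101001100101 (pattern 01)
print(manchester(bits, "10"))  # 1001010110011010 (pattern 10, inverted variant)
```

With pattern 01 a 0 bit becomes 01 and a 1 bit becomes 10; with pattern 10 the pairs are swapped, which is why the two outputs are bitwise complements of each other.
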