Encryption is a method of turning data into an unusable form that can be made useful only by means of decryption. The purpose is to make data available solely to those who can decrypt it (i.e. make it usable). Typically, data needs to be encrypted to make sure it cannot be obtained in case of unauthorized access. It is the last line of defense after an attacker has managed to break through authorization systems and access control.
This doesn't mean all data needs to be encrypted, because often times authorization and access systems may be enough, and in addition, there is a performance penalty for encrypting and decrypting data. If and when the data gets encrypted is a matter of application planning and risk assessment, and sometimes it is also a regulatory requirement, such as with HIPAA or GDPR.
Data can be encrypted at-rest, such as on disk, or in transit, such as between two parties communicating over the Internet.
Here you will learn how to encrypt and decrypt data using a password, also known as symmetrical encryption. This password must be known to both parties exchanging information.
To properly and securely use encryption, there are a few notions that need to be explained.
A cipher is the algorithm used for encryption. For example, AES256 is a cipher. The idea of a cipher is what most people will think of when it comes to encryption.
A digest is basically a hash function that is used to scramble and lengthen the password (i.e. the encryption key) before it's used by the cipher. Why is this done? For one, it creates a well randomized, uniform-length hash of a key that works better for encryption. It's also very suitable for "salting", which is the next one to talk about.
The "salt" is a method of defeating so-called "rainbow" tables. An attacker knows that two hashed values will still look exactly the same if the originals were. However, if you add the salt value to hashing, then they won't. It's called "salt" because it's sort of mixed with the key to produce something different. Now, a rainbow table will attempt to match known hashed values with precomputed data in an effort to guess a password. Usually, salt is randomly generated for each key and stored with it. In order to match known hashes, the attacker would have to precompute rainbow tables for great many random values, which is generally not feasible.
You will often hear about "iterations" in encryption. An iteration is a single cycle in which a key and salt are mixed in such a way to make guessing the key harder. This is done many times so to make it computationally difficult for an attacker to reverse-guess the key, hence "iterations" (plural). Typically, a minimum required number of iterations is 1000, but it can be different than that. If you start with a really strong password, generally you need less.
IV (or "Initialization Vector") is typically a random value that's used for encryption of each message. Now, salt is used for producing a key based on a password. And IV is used when you already have a key and now are encrypting messages. The purpose of IV is to make the same messages appear differently when encrypted. Sometimes, IV also has a sequential component, so it's made of a random string plus a sequence that constantly increases. This makes "replay" attacks difficult, which is where attacker doesn't need to decrypt a message; but rather an encrypted message was "sniffed" (i.e. intercepted between the sender and receiver) and then replayed, hoping to repeat the action already performed. Though in reality, most high-level protocols already have a sequence in place, where each message has, as a part of it, an increasing packet number, so in most cases IV doesn't need it.
This example uses Golf framework. Install it first.
To run the examples here, create an application "enc" in a directory of its own (see mgrg for more on Golf's program manager):
mkdir enc_example cd enc_example gg -k encCopied!
To encrypt data use encrypt-data statement. The simplest form is to encrypt a null-terminated string. Create a file "encrypt.golf" and copy this:
begin-handler /encrypt public set-string str = "This contains a secret code, which is Open Sesame!" // Encrypt encrypt-data str to enc_str password "my_password" p-out enc_str @ // Decrypt decrypt-data enc_str password "my_password" to dec_str p-out dec_str @ end-handlerCopied!
You can see the basic usage of encrypt-data and decrypt-data. You supply data (original or encrypted), the password, and off you go. The data is encrypted and then decrypted, yielding the original.
In the source code, a string variable "enc_str" (which is created as a "char *") will contain the encrypted version of "This contains a secret code, which is Open Sesame!" and "dec_str" will be the decrypted data which must be exactly the same.
To run this code from command line, make the application first:
gg -q
Copied!
Then have Golf produce the bash code to run it - the request path is "/encrypt", which in our case is handled by function "void encrypt()" defined in source file "encrypt.golf". In Golf, these names always match, making it easy to write, read and execute code. Use "-r" option in gg to specify the request path and get the code you need to run the program:
gg -r --req="/encrypt" --silent-header --execCopied!
You will get a response like this:
72ddd44c10e9693be6ac77caabc64e05f809290a109df7cfc57400948cb888cd23c7e98e15bcf21b25ab1337ddc6d02094232111aa20a2d548c08f230b6d56e9 This contains a secret code, which is Open Sesame!Copied!
What you have here is the encrypted data, and then this encrypted data is decrypted using the same password. Unsurprisingly, the result matches the string you encrypted in the first place.
Note that by default encrypt-data will produce encrypted value in a human-readable hexadecimal form, which means consisting of hexadecimal characters "0" to "9" and "a" to "f". This way you can store the encrypted data into a regular string. For instance it may go to a JSON document or into a VARCHAR column in a database, or pretty much anywhere else. However you can also produce a binary encrypted data. More on that in a bit.
In the previous example, the resulting encrypted data is in a human-readable hexadecimal form. You can also create binary encrypted data, which is not a human-readable string and is also shorter. To do that, use "binary" clause. Replace the code in "encrypt.golf" with:
begin-handler /encrypt public set-string str = "This contains a secret code, which is Open Sesame!" // Encrypt encrypt-data str to enc_str password "my_password" binary // Save the encrypted data to a file write-file "encrypted_data" from enc_str get-app directory to app_dir @Encrypted data written to file <<p-out app_dir>>/encrypted_data // Decrypt data decrypt-data enc_str password "my_password" binary to dec_str p-out dec_str @ end-handlerCopied!
When you want to get binary encrypted data, you should get its length in bytes too, or otherwise you won't know where it ends, since it may contain null bytes in it. Use "output-length" clause for that purpose. In this code, the encrypted data in variable "enc_str" is written to file "encrypted_data", and the length written is "outlen" bytes. When a file is written without a path, it's always written in the application home directory (see directories), so you'd use get-app to get that directory.
When decrypting data, notice the use of "input-length" clause. It says how many bytes the encrypted data has. Obviously you can get that from "outlen" variable, where encrypt-data stored the length of encrypted data. When encryption and decryption are decoupled, i.e. running in separate programs, you'd make sure this length is made available.
Notice also that when data is encrypted as "binary" (meaning producing a binary output), the decryption must use the same.
Make the application:
gg -q
Copied!
Run it the same as before:
gg -r --req="/encrypt" --silent-header --execCopied!
The result is:
Encrypted data written to file /var/lib/gg/enc/app/encrypted_data This contains a secret code, which is Open Sesame!Copied!
The decrypted data is exactly the same as the original.
You can see the actual encrypted data written to the file by using "octal dump" ("od") Linux utility:
od -c /var/lib/gg/enc/app/encrypted_data
Copied!
with the result like:
$ od -c /var/lib/gg/enc/app/encrypted_data 0000000 r 335 324 L 020 351 i ; 346 254 w 312 253 306 N 005 0000020 370 \t ) \n 020 235 367 317 305 t \0 224 214 270 210 315 0000040 # 307 351 216 025 274 362 033 % 253 023 7 335 306 320 0000060 224 # ! 021 252 242 325 H 300 217 # \v m V 351 0000100Copied!
There you have it. You will notice the data is binary and it actually contains the null byte(s).
The data to encrypt in these examples is a string, i.e. null-delimited. You can encrypt binary data just as easily by specifying it whole (since Golf keeps track of how many bytes are there!), or specifying its length in "input-length" clause, for example copy this to "encrypt.golf":
begin-handler /encrypt public set-string str = "This c\000ontains a secret code, which is Open Sesame!" // Encrypt encrypt-data str to enc_str password "my_password" input-length 12 p-out enc_str @ // Decrypt decrypt-data enc_str password "my_password" to dec_str // Output binary data; present null byte as octal \000 string-length dec_str to res_len start-loop repeat res_len use i start-with 0 if-true dec_str[i] equal 0 p-out "\\000" else-if pf-out "%c", dec_str[i] end-if end-loop @ end-handlerCopied!
This will encrypt 12 bytes at memory location "enc_str" regardless of any null bytes. In this case that's "This c" followed by a null byte followed by "ontain" string, but it can be any kind of binary data, for example the contents of a JPG file.
On the decrypt side, you'd obtain the number of bytes decrypted in "output-length" clause. Finally, the decrypted data is shown to be exactly the original and the null byte is presented in a typical octal representation.
Make the application:
gg -q
Copied!
Run it the same as before:
gg -r --req="/encrypt" --silent-header --execCopied!
The result is:
6bea45c2f901c0913c87fccb9b347d0a This c\000ontaiCopied!
The encrypted value is shorter because the data is shorter in this case too, and the result matches exactly the original.
The encryption used by default is AES256 and SHA256 hashing from the standard OpenSSL library, both of which are widely used in cryptography. You can however use any available cipher and digest (i.e. hash) that is supported by OpenSSL (even the custom ones you provide).
To see which algorithms are available, do this in command line:
#get list of cipher providers openssl list -cipher-algorithms #get list of digest providers openssl list -digest-algorithmsCopied!
These two will provide a list of cipher and digest (hash) algorithms. Some of them may be weaker than the default ones chosen by Golf, and others may be there just for backward compatibility with older systems. Yet others may be quite new and did not have enough time to be validated to the extent you may want them to be. So be careful when choosing these algorithms and be sure to know why you're changing the default ones. That said, here's an example of using Camellia-256 (i.e. "CAMELLIA-256-CFB1") encryption with "SHA3-512" digest. Replace the code in "encrypt.golf" with:
begin-handler /encrypt public set-string str = "This contains a secret code, which is Open Sesame!" // Encrypt data encrypt-data str to enc_str password "my_password" \ cipher "CAMELLIA-256-CFB1" digest "SHA3-512" p-out enc_str @ // Decrypt data decrypt-data enc_str password "my_password" to dec_str \ cipher "CAMELLIA-256-CFB1" digest "SHA3-512" p-out dec_str @ end-handlerCopied!
Make the application:
gg -q
Copied!
Run it:
gg -r --req="/encrypt" --silent-header --execCopied!
In this case the result is:
f4d64d920756f7220516567727cef2c47443973de03449915d50a1d2e5e8558e7e06914532a0b0bf13842f67f0a268c98da6 This contains a secret code, which is Open Sesame!Copied!
Again, you get the original data. Note you have to use the same cipher and digest in both encrypt-data and decrypt-data!
You can of course produce the binary encrypted value just like before by using "binary" and "output-length" clauses.
If you've got external systems that encrypt data, and you know which cipher and digest they use, you can match those and make your code interoperable. Golf uses standard OpenSSL library so chances are that other software may too.
To add a salt to encryption, use "salt" clause. You can generate random salt by using random-string statement (or random-crypto if there is a need). Here is the code for "encrypt.golf":
begin-handler /encrypt public set-string str = "This contains a secret code, which is Open Sesame!" // Get salt random-string to rs length 16 // Encrypt data encrypt-data str to enc_str password "my_password" salt rs @Salt used is <<p-out rs>>, and the encrypted string is <<p-out enc_str>> // Decrypt data decrypt-data enc_str password "my_password" salt rs to dec_str p-out dec_str @ end-handlerCopied!
Make the application:
gg -q
Copied!
Run it a few times:
gg -r --req="/encrypt" --silent-header --exec gg -r --req="/encrypt" --silent-header --exec gg -r --req="/encrypt" --silent-header --execCopied!
The result:
Salt used is VA9agPKxL9hf3bMd, and the encrypted string is 3272aa49c9b10cb2edf5d8a5e23803a5aa153c1b124296d318e3b3ad22bc911d1c0889d195d800c2bd92153ef7688e8d1cd368dbca3c5250d456f05c81ce0fdd This contains a secret code, which is Open Sesame! Salt used is FeWcGkBO5hQ1uo1A, and the encrypted string is 48b97314c1bc88952c798dfde7a416180dda6b00361217ea25278791c43b34f9c2e31cab6d9f4f28eea59baa70aadb4e8f1ed0709db81dff19f24cb7677c7371 This contains a secret code, which is Open Sesame! Salt used is nCQClR0NMjdetTEf, and the encrypted string is f19cdd9c1ddec487157ac727b2c8d0cdeb728a4ecaf838ca8585e279447bcdce83f7f95fa53b054775be1bb2de3b95f2e66a8b26b216ea18aa8b47f3d177e917 This contains a secret code, which is Open Sesame!Copied!
As you can see, a random salt value (16 bytes long in this case) is generated for each encryption, and the encrypted value is different each time, even though the data being encrypted was the same! This makes it difficult to crack encryption like this.
Of course, to decrypt, you must record the salt and use it exactly as you did when encrypting. In the code here, variable "rs" holds the salt. If you store the encrypted values in the database, you'd likely store the salt right next to it.
In practice, you wouldn't use a different salt value for each message. It creates a new key every time, and that can reduce performance. And there's really no need for it: the use of salt is to make each key (even the same ones) much harder to guess. Once you've done that, you might not need to do it again, or often.
Instead, you'd use an IV (Initialization Vector) for each message. It's usually a random string that makes same messages appear different, and increases the computational cost of cracking the password. Here is the new code for "encrypt.golf":
begin-handler /encrypt public // Get salt random-string to rs length 16 // Encrypt data start-loop repeat 10 use i start-with 0 random-string to iv length 16 encrypt-data "The same message" to enc_str password "my_password" salt rs iterations 2000 init-vector iv cache @The encrypted string is <<p-out enc_str>> // Decrypt data decrypt-data enc_str password "my_password" salt rs iterations 2000 init-vector iv to dec_str cache p-out dec_str @ end-loop end-handlerCopied!
Make the application:
gg -q
Copied!
Run it a few times:
gg -r --req="/encrypt" --silent-header --exec gg -r --req="/encrypt" --silent-header --exec gg -r --req="/encrypt" --silent-header --execCopied!
The result may be:
The encrypted string is 787909d332fd84ba939c594e24c421b00ba46d9c9a776c47d3d0a9ca6fccb1a2 The same message The encrypted string is 7fae887e3ae469b666cff79a68270ea3d11b771dc58a299971d5b49a1f7db1be The same message The encrypted string is 59f95c3e4457d89f611c4f8bd53dd5fa9f8c3bbe748ed7d5aeb939ad633199d7 The same message The encrypted string is 00f218d0bbe7b618a0c2970da0b09e043a47798004502b76bc4a3f6afc626056 The same message The encrypted string is 6819349496b9f573743f5ef65e27ac26f0d64574d39227cc4e85e517f108a5dd The same message The encrypted string is a2833338cf636602881377a024c974906caa16d1f7c47c78d9efdff128918d58 The same message The encrypted string is 04c914cd9338fcba9acb550a79188bebbbb134c34441dfd540473dd8a1e6be40 The same message The encrypted string is 05f0d51561d59edf05befd9fad243e0737e4a98af357a9764cba84bcc55cf4d5 The same message The encrypted string is ae594c4d6e72c05c186383e63c89d93880c8a8a085bf9367bdfd772e3c163458 The same message The encrypted string is 2b28cdf5a67a5a036139fd410112735aa96bc341a170dafb56818dc78efe2e00 The same messageCopied!
You can see that the same message appears different when encrypted, though when decrypted it's again the same. Of course, the password, salt, number of iterations, and init-vector must be the same for both encryption and decryption.
Note the use of "cache" clause in encrypt-data and decrypt-data. It effectively caches the computed key (given password, salt, cipher/digest algorithms and number of iterations), so it's not computed each time through the loop. With "cache" the key is computed once, and then a different IV (in "init-vector" clause) is used for each message.
If you want to occasionally rebuild the key, use "clear-cache" clause, which supplies a boolean. If true, the key is recomputed, otherwise it's left alone. See encrypt-data for more on this.
You have learned how to encrypt and decrypt data using different ciphers, digests, salt and IV values in Golf. You can also create a human-readable encrypted value and a binary output, as well as encrypt both strings and binary values (like documents).