Official Golf language blog: 2024

Thursday, December 26, 2024

Encryption: ciphers, digests, salt, IV

What is encryption

Encryption is a method of turning data into an unusable form that can be made useful only by means of decryption. The purpose is to make data available solely to those who can decrypt it (i.e. make it usable). Typically, data needs to be encrypted to make sure it cannot be obtained in case of unauthorized access. It is the last line of defense after an attacker has managed to break through authorization systems and access control.

This doesn't mean all data needs to be encrypted: oftentimes authorization and access control systems are enough, and there is a performance penalty for encrypting and decrypting data. Whether and when data gets encrypted is a matter of application planning and risk assessment, and sometimes it is also a regulatory requirement, such as with HIPAA or GDPR.

Data can be encrypted at-rest, such as on disk, or in transit, such as between two parties communicating over the Internet.

Here you will learn how to encrypt and decrypt data using a password, also known as symmetric encryption. This password must be known to both parties exchanging information.

Cipher, digest, salt, iterations, IV

To properly and securely use encryption, there are a few notions that need to be explained.

A cipher is the algorithm used for encryption. For example, AES256 is a cipher. The idea of a cipher is what most people will think of when it comes to encryption.

A digest is basically a hash function that is used to scramble and lengthen the password (i.e. the encryption key) before it's used by the cipher. Why is this done? For one, it creates a well randomized, uniform-length hash of a key that works better for encryption. It's also very suitable for "salting", which is the next one to talk about.

The "salt" is a method of defeating so-called "rainbow" tables. A rainbow table attempts to match known hashed values against precomputed data in an effort to guess a password. This works because two hashed values look exactly the same if the originals were the same. If you add a salt value to the hashing, however, they no longer do. It's called "salt" because it's mixed with the key to produce something different. Usually, a salt is randomly generated for each key and stored with it. To match known hashes, the attacker would have to precompute rainbow tables for a great many random values, which is generally not feasible.
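To make the effect of salting concrete, here is a minimal sketch in Python (chosen only as a neutral illustration language). The helper name `salted_hash` and the use of plain SHA-256 are assumptions made for the example; they are not a description of what Golf does internally.

```python
import hashlib
import os

def salted_hash(password: str, salt: bytes) -> str:
    """Hash the salt together with the password (illustrative only)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

# Without a salt, identical passwords always produce identical hashes,
# which is what makes precomputed (rainbow) tables useful to an attacker.
unsalted_a = hashlib.sha256(b"my_password").hexdigest()
unsalted_b = hashlib.sha256(b"my_password").hexdigest()

# With a random salt per password, the same password hashes differently.
salt_a = os.urandom(16)
salt_b = os.urandom(16)
salted_a = salted_hash("my_password", salt_a)
salted_b = salted_hash("my_password", salt_b)

print(unsalted_a == unsalted_b)  # identical inputs, identical hashes
print(salted_a == salted_b)      # different salts, different hashes
```

The salt is not secret; it is stored next to the hash so the same computation can be repeated when the password is checked.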

You will often hear about "iterations" in encryption. An iteration is a single cycle in which a key and salt are mixed in such a way as to make guessing the key harder. This is done many times to make it computationally difficult for an attacker to reverse-guess the key, hence "iterations" (plural). Typically, the minimum required number of iterations is 1000, but it can be different from that. If you start with a really strong password, you generally need fewer.
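One widely used scheme that combines a digest, a salt and an iteration count is PBKDF2. The sketch below uses Python's standard `hashlib.pbkdf2_hmac`. Whether Golf uses this exact scheme is an assumption, and `derive_key` is a hypothetical helper name; the 1000 iterations simply mirror the minimum mentioned above.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 1000) -> bytes:
    """Derive a 32-byte key (the size AES-256 expects) from a password."""
    return hashlib.pbkdf2_hmac(
        "sha256",                  # the digest
        password.encode("utf-8"),  # the password
        salt,                      # the salt, stored alongside the result
        iterations,                # how many times the mixing is repeated
        dklen=32,
    )

salt = os.urandom(16)
key = derive_key("my_password", salt)
print(len(key))  # 32
```

Raising the iteration count makes each password guess proportionally more expensive for an attacker, at the price of a slower key derivation for legitimate use.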

IV (or "Initialization Vector") is typically a random value that's used for the encryption of each message. Salt is used for producing a key based on a password; an IV is used when you already have a key and are now encrypting messages. The purpose of an IV is to make the same messages appear different when encrypted. Sometimes an IV also has a sequential component, so it's made of a random string plus a sequence that constantly increases. This makes "replay" attacks difficult. In a replay attack, the attacker doesn't need to decrypt a message; rather, an encrypted message is "sniffed" (i.e. intercepted between the sender and receiver) and then replayed, hoping to repeat the action already performed. In reality, though, most high-level protocols already have a sequence in place, where each message includes an increasing packet number, so in most cases the IV doesn't need one.
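As a rough illustration of an IV with a sequential component, the Python sketch below builds 16-byte values from a random prefix and an increasing counter. The class name `IVGenerator`, the 8+8 byte split and the 16-byte size are assumptions for the example, not a layout prescribed by Golf or by any particular cipher.

```python
import os
import struct

class IVGenerator:
    """Builds 16-byte IVs: 8 random bytes followed by an 8-byte counter."""

    def __init__(self) -> None:
        self._prefix = os.urandom(8)  # random part, fixed for this generator
        self._counter = 0             # sequential part, increases per message

    def next_iv(self) -> bytes:
        iv = self._prefix + struct.pack(">Q", self._counter)
        self._counter += 1
        return iv

gen = IVGenerator()
first = gen.next_iv()
second = gen.next_iv()
print(first != second)  # every message gets a distinct IV
```

A receiver that tracks the highest counter seen so far can reject a replayed message, because its counter would not be larger than the last one accepted.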

Prerequisites

This example uses Golf framework. Install it first.

Encryption example

To run the examples here, create an application "enc" in a directory of its own (see mgrg for more on Golf's program manager):

mkdir enc_example
cd enc_example
gg -k enc

To encrypt data use encrypt-data statement. The simplest form is to encrypt a null-terminated string. Create a file "encrypt.golf" and copy this:

 begin-handler /encrypt public
     set-string str = "This contains a secret code, which is Open Sesame!"
     // Encrypt
     encrypt-data str to enc_str password "my_password"
     print-out enc_str
     @
     // Decrypt
     decrypt-data enc_str password "my_password" to dec_str
     print-out dec_str
     @
 end-handler

You can see the basic usage of encrypt-data and decrypt-data. You supply data (original or encrypted), the password, and off you go. The data is encrypted and then decrypted, yielding the original.

Golf 136 released

Any number expression can now use a string subscript as a number, for instance:
set-string str='hello'
set-number num = 10+str[0]
A character is treated as an unsigned number ranging from 0 to 255 (i.e. an unsigned byte).
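For readers more familiar with other languages, the same arithmetic can be sketched in Python, where indexing a bytes object also yields an unsigned value between 0 and 255. This is only an analogy for the semantics described above, not Golf code.

```python
text = b"hello"

# text[0] is the byte value of 'h', which is 104
num = 10 + text[0]
print(num)  # 114
```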

Tuesday, December 24, 2024

Golf 132 released

Individual bytes of a string (binary or text) can now be set using set-string by specifying the byte with a number expression within []. Since Golf is a memory-safe language, setting a byte this way is subject to a range check. For instance:
set-string str[10] = 'a'
An individual byte of a string (binary or text) can now be obtained (as a number) with set-number using a number expression within []. Since Golf is a memory-safe language, getting a byte this way is subject to a range check. For instance:
set-number byte = str[10]
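The range-checked behavior can be pictured with a Python `bytearray`, which likewise refuses out-of-range access instead of touching memory outside the string. This is an analogy only; how Golf reports a failed range check is not shown here.

```python
buf = bytearray(b"hello world")  # 11 bytes, valid indexes are 0..10

buf[10] = ord("a")   # set the byte at index 10
byte = buf[10]       # read it back as a number
print(byte)          # 97

try:
    buf[11] = ord("b")   # index 11 is past the end
    out_of_range = False
except IndexError:
    out_of_range = True  # the range check rejected the write
print(out_of_range)      # True
```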

Note that Golf is a very high level language, and it generally does not start with low-level constructs, such as setting and retrieving bytes from memory; rather its statements perform tasks that take typically many lines of code in other languages. So it makes sense an addition like this would be a "side-note" undertaken later in the language; it's not the focus of it. Still, Golf is also a high performance language and so the above two new capabilities are implemented with that in mind, with the minimum of overhead.

Sunday, December 15, 2024

Distributed computing made easy

What is distributed computing

Distributed computing is two or more servers communicating for a common purpose. Typically, some tasks are divvied up between a number of computers, and they all work together to accomplish them. Note that "separate servers" may mean physically separate computers. It may also mean virtual servers, such as Virtual Private Servers (VPS) or containers, that may share the same physical hardware but appear as separate computers on the network.

There are many reasons why you might need this kind of setup. It may be that resources needed to complete the task aren't all on a single computer. For instance, your application may rely on multiple databases, each residing on a different computer. Or, you may need to distribute requests to your application because a single computer isn't enough to handle them all at the same time. In other cases, you are using remote services (REST API-based ones, for instance), and those by nature reside somewhere else.

In any case, the computers comprising your distributed system may be on a local network, or they may be worldwide, or some combination of those. The throughput (how many bytes per second can be exchanged via network) and latency (how long it takes for a packet to travel via network) will obviously vary: for a local network you'd have a higher throughput and lower latency, and for Internet servers it will be the opposite. Plan accordingly based on the quality of service you'd expect.
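A back-of-the-envelope estimate shows how throughput and latency combine. The Python sketch below uses a simple model (one latency delay plus payload size divided by throughput); the function name and all the figures are hypothetical, chosen only to contrast a local network with an Internet link.

```python
def transfer_time_seconds(size_bytes: int,
                          throughput_bytes_per_s: float,
                          latency_s: float) -> float:
    """Rough time to deliver one message: latency plus size over throughput."""
    return latency_s + size_bytes / throughput_bytes_per_s

payload = 1_000_000  # a 1 MB reply (assumed)

# Assumed local network: about 1 Gbit/s (125 MB/s), 0.5 ms latency
lan = transfer_time_seconds(payload, 125_000_000, 0.0005)

# Assumed Internet link: about 100 Mbit/s (12.5 MB/s), 50 ms latency
wan = transfer_time_seconds(payload, 12_500_000, 0.05)

print(f"local: {lan * 1000:.1f} ms, internet: {wan * 1000:.1f} ms")
```

Under these assumed figures the local transfer takes about 8.5 ms and the Internet transfer about 130 ms. For small payloads the latency term dominates, which is why chatty protocols suffer most over long distances.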

How servers communicate

Depending on your network setup, different kinds of communication are called for. If two servers reside on a local network, they would typically use the fastest possible means of communication. A local network typically means a secure network, because nobody else has access to it but you. So you would not need TLS/SSL or any other kind of secure protocol, as that would just slow things down.

If two servers are on the Internet, though, you must use a secure protocol (like TLS/SSL or some other) because your communication may be spied on or, worse, affected by man-in-the-middle attacks.

Local network distributed computing

Most of the time, your distributed system would be on a local network. Such network may be separate and private in a physical sense, or (more commonly) in a virtual sense, where some kind of a Private Cloud Network is established for you by the Cloud provider. It's likely that separation is enforced by specialized hardware (such as routers and firewalls) and secure protocols that keep networks belonging to different customers separate. This way, a "local" network can be established even if computers on it are a world apart, though typically they reside as a part of a larger local network.

Either way, as far as your application is concerned, you are looking at a local network. Thus, the example here will be for such a case, as it's most likely what you'll have. A local network means different parts of your application residing on different servers will use some efficient protocol based on TCP/IP. One such protocol is FastCGI, a high-performance binary protocol for communication between servers, clients, and in general programs of all kinds, and that's the one used by Golf. So in principle, the setup will look like this (there'll be more details later):

Next, in theory you should have two servers, however in this example both servers will be on the same localhost (i.e. "127.0.0.1"). This is just for simplicity; the code is exactly the same if you have two different servers on a local network - simply use another IP (such as "192.168.0.15" for instance) for your "remote" server instead of local "127.0.0.1". The two servers do not even necessarily need to be physically two different computers. You can start a Virtual Machine (VM) on your computer and host another virtual computer there. Popular free software like VirtualBox or KVM Hypervisor can help you do that.

In any case, in this example you will start two simple application servers; they will communicate with one another. The first one will be called "local" and the other one "remote" server. The local application server will make a request to the remote one.

Local server

On a local server, create a new directory for your local application server source code:

mkdir $HOME/local_server
cd $HOME/local_server

and then create a new file "status.golf" with the following:

 begin-handler /status public
     silent-header
     get-param server
     get-param days

     print-format "/server/remote-status/days=%s", days to payload
     print-format "%s:3800", server to srv_location

     new-remote srv location srv_location \
         method "GET" url-path payload \
         timeout 30

     call-remote srv
     read-remote srv data dt
     @Output is: [<<print-out dt>>]
 end-handler

The code here is very simple. new-remote creates a new connection to a remote server running at the IP address given by the input parameter "server" (obtained with get-param), on TCP port 3800. The URL payload built in the string variable "payload" is passed to the remote server. If the server doesn't reply within 30 seconds, the call times out. call-remote then actually makes the call to the remote server (which is served by application "server" and by request handler "remote-status.golf" below), and finally read-remote gets the reply from it. For simplicity, error handling is omitted here, but you can easily detect a timeout, network errors, and errors from the remote server, including error code and error text. See the above statements for more on this.
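To see what the two print-format statements produce, here is a small Python sketch that builds the same two strings. The function name `build_remote_request` is hypothetical; only the string formats and port 3800 come from the handler above.

```python
def build_remote_request(server: str, days: str, port: int = 3800) -> tuple[str, str]:
    """Return (location, url_path) as the handler above would format them."""
    url_path = f"/server/remote-status/days={days}"  # application, handler, parameter
    location = f"{server}:{port}"                     # IP address and TCP port
    return location, url_path

location, url_path = build_remote_request("127.0.0.1", "18")
print(location)  # 127.0.0.1:3800
print(url_path)  # /server/remote-status/days=18
```

The first path segment names the remote application ("server"), the second the request handler ("remote-status"), and the remainder carries the input parameter.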

Make and start the local server

How is memory organized in Golf

Golf is a high-performance memory-safe language. A string variable is the actual pointer to a string (whether it's a text or binary). This eliminates at least one extra memory read, making string access as fast as in C. Bytes before the string constitute the ID to a memory table entry, with additional info used by memory-safety mechanisms:

Length (in bytes) of the string,
"Ref count" (Reference count), stating how many Golf variables point to string,
Status is used to describe the string, such as whether its scope is process-wide, whether it's a string literal, etc.,
"Next free" points to the next available string block (if this one was freed too),
"Data ptr" points back to the string, which is used to speed up access.

Memory is always null-terminated, regardless of whether it's text or binary. Here's what that looks like in a picture:

Each memory block (ID+string+trailing null) is allocated by standard C memory allocation, while the memory table is a contiguous block that's frequently cached to provide fast access to a string's properties.
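The layout described above can be modeled roughly as follows. This is a conceptual Python sketch, not Golf's actual implementation: the field names follow the list above, while the 8-byte ID width, the little-endian encoding and the helper names are assumptions made for illustration.

```python
import struct
from dataclasses import dataclass
from typing import Optional

@dataclass
class MemoryTableEntry:
    length: int                # length of the string in bytes
    ref_count: int             # how many variables point to the string
    status: int                # flags, e.g. process scope or string literal
    next_free: Optional[int]   # next available entry, if this one is freed
    data_ptr: int              # where the string starts within its block

ID_SIZE = 8  # assumed width of the ID stored in front of the string

def make_block(entry_id: int, data: bytes) -> bytes:
    """Lay out one block as ID + string + trailing null byte."""
    return struct.pack("<Q", entry_id) + data + b"\x00"

def entry_id_of(block: bytes) -> int:
    """Read the memory-table ID stored in the bytes before the string."""
    return struct.unpack_from("<Q", block, 0)[0]

table = [MemoryTableEntry(length=5, ref_count=1, status=0,
                          next_free=None, data_ptr=ID_SIZE)]
block = make_block(0, b"hello")

entry = table[entry_id_of(block)]
string = block[entry.data_ptr:entry.data_ptr + entry.length]
print(string)  # b'hello'
```

Keeping the bookkeeping in a separate table means the string bytes themselves stay compact, while length and reference count are one table lookup away.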

Web file manager in less than 100 lines of code

Uploading and downloading files in a web browser is a common task in virtually any web application or service. This article shows how to do this with very little coding - in less than 100 lines of code. The database used is PostgreSQL, and the web server is Nginx.

You will use Golf as an application server and the programming language. It will run behind the web server for performance and security, as well as to enable richer web functionality. This way end-user cannot talk to your application server directly because all such requests go through the web server, while your back-end application can talk directly to your application server for better performance.

Assuming your currently logged-on Linux user will own the application, create a source code directory and also create Golf application named "file-manager":

mkdir filemgr
cd filemgr
gg -k file-manager

Next, create PostgreSQL database named "db_file_manager", owned by currently logged-on user (i.e. passwordless setup):

echo "create user $(whoami);
create database db_file_manager with owner=$(whoami);
grant all on database db_file_manager to $(whoami);
\q"  | sudo -u postgres psql

Create database configuration file used by Golf that describes the database (it's a file "db"):

echo "user=$(whoami) dbname=db_file_manager" > db

Create SQL table that will hold files currently stored on the server:

echo "create table if not exists files (fileName varchar(100), localPath varchar(300), extension varchar(10), description varchar(200), fileSize int, fileID bigserial primary key);" | psql -d db_file_manager

Finally, create source Golf files. First create "start.golf" file and copy and paste:

 begin-handler /start public
    @<h2>File Manager</h2>
    @To manage the uploaded files, <a href="<<print-path "/list">>">click here.</a><br/>
    @<br/>
    @<form action="<<print-path "/upload">>" method="POST" enctype="multipart/form-data">
    @    <label for="file_description">File description:</label><br>
    @    <textarea name="filedesc" rows="3" cols="50"></textarea><br/>
    @    <br/>
    @    <label for="filename">File:</label>
    @    <input type="file" name="file" value=""><br><br>
    @    <input type="submit" value="Submit">
    @</form>
 end-handler

Create "list.golf" file and copy and paste:

Fixed bug: process-scoped string would be freed at the end of a code block or request handler when --optimize-memory flag is used.

Wednesday, December 4, 2024

Golf 121 released

Added return-handler statement.
Better error message when request is not found.
Better error message when no .golf files found or no begin-handler statements found.

Monday, December 2, 2024

Passing parameters between local request handlers

In Golf, there are no "functions" or "methods" as you may be used to in other languages. Any encapsulation is a request handler - you can think of it as a simple function that executes to handle a request - it can be called from an outside caller (such as web browser, web API, or from a command-line), or from another handler.

By the same token, there are no formal parameters in a way that you may be used to. Instead, there are named parameters, basically name/value pairs, which you can set or get anywhere during the request execution. In addition, your request handler can handle the request body, environment variables, the specific request method etc. (see request). Here though, we'll focus on parameters only.

You'll use set-param to set a parameter, which can then be obtained anywhere in the current request, including in the current handler's caller or callee. Use get-param to obtain a parameter that's set with set-param.

Parameters are very fast - they are static creatures implemented at compile time, meaning only fixed memory locations are used to store them (making for great CPU caching), and any name-based resolution is used only when necessary, and always with fast hash tables and static caching.

Calling one handler from another

In this article, we'll talk about call-handler which is used to call a handler from within another handler.

Here you'll see a few examples of passing input and output parameters between request handlers. These handlers both run in the same process of an application (note that an application can run many processes working in parallel). To begin, create an application:

mkdir param
cd param
gg -k param

You'll also create two source files ("local.golf" and "some.golf") a bit later with the code below.

Simple service

Let's start with a simple service that provides current time based on a timezone as an input parameter (in file "local.golf"):

 begin-handler /local/time
     get-param tzone // get time zone as input parameter (i.e. "EST", "MST", "PST" etc.)
     get-time to curr_time timezone tzone
     @<div>Current time is <<p-out curr_time>></div>
 end-handler

In this case, HTML code is output. Make the application:

Golf 117 released

Added external-call clause to get-req statement. It returns true if the current request handler is called directly from an external entity (web browser, an outside API call, curl call, command line etc.), or false if called from another handler. This allows for greater flexibility in formulating web service's response and its output parameters.

Tuesday, November 26, 2024

Golf 114 released

call-handler statement is now 2.1 times faster. Only the very first call to a local request uses hash table to find it; all subsequent ones use a cached request address. Note that this is true only if the request name is a string constant and not a variable (in which case it's resolved via hash table every time still). However, in most applications request name is a constant string nearly 100% of the time.
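The optimization described here is a form of call-site caching. The Python sketch below illustrates the general idea under stated assumptions: `HandlerRegistry` and `CallSite` are hypothetical names, and this is not Golf's generated code.

```python
from typing import Callable, Dict, Optional

class HandlerRegistry:
    """Maps request paths to handlers and counts hash-table lookups."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}
        self.lookups = 0

    def register(self, path: str, handler: Callable[[], str]) -> None:
        self._handlers[path] = handler

    def resolve(self, path: str) -> Callable[[], str]:
        self.lookups += 1  # every resolve is one hash-table lookup
        return self._handlers[path]

class CallSite:
    """A call with a constant request name: resolved once, then cached."""

    def __init__(self, registry: HandlerRegistry, path: str) -> None:
        self._registry = registry
        self._path = path
        self._cached: Optional[Callable[[], str]] = None

    def call(self) -> str:
        if self._cached is None:  # only the very first call does the lookup
            self._cached = self._registry.resolve(self._path)
        return self._cached()

registry = HandlerRegistry()
registry.register("/session/check", lambda: "ok")

site = CallSite(registry, "/session/check")
results = [site.call() for _ in range(3)]
print(results, registry.lookups)  # three calls, one lookup
```

When the request name is held in a variable, the target can differ from call to call, so a cache keyed on the call site alone is no longer safe and the lookup has to be repeated.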

Friday, November 22, 2024

Golf 109 released

New "-k" option in gg utility will create a new Golf application, if it didn't already exist. You can still use mgrg utility to create new applications. This option makes it easier to create one with all the default settings. You can also use "-q" flag to compile and make the executable in a single step.

How to create Golf application

To create Golf application with default settings use an option of gg utility:

gg -k my-app

where "my-app" is your application name. If you already have an application with that name, nothing is done, so this is an idempotent operation.

If you already have source code, you can create and compile your application in one step:

gg -k my-app -q

which is a neat shortcut.

What are the default settings? Well, your application directory (in "/var/lib/gg/my-app") will be owned by the currently logged-on user (and other users can't access it), and any Unix socket can connect to your application server. This is a typical setup you'd probably use in most cases, so it's a useful one.

If you'd like to have more options in creating a Golf application, see service manager.

Thursday, November 21, 2024

Getting help for Golf with man pages

Golf installation will create Linux "man" pages (or manual pages).

They contain the same information as the web documentation, and you can use them for a quick help on syntax even when you're working offline.

For instance, to get help on the "call-web" statement, you would enter on the command line:

man call-web

The result would be something like:

Note that the man section for Golf is "2gg".

Tuesday, November 12, 2024

Multi-tenant SaaS (Notes web application) in 200 lines of code

This is a complete SaaS example (Software-as-a-Service) using PostgreSQL as a database, and Golf as a web service engine; it includes user signup/login/logout with an email and password, separate user accounts and data, and a notes application. All in about 200 lines of code!

First create a directory for your application, where the source code will be:

mkdir -p notes
cd notes

Setup Postgres database

Create PostgreSQL user (with the same name as your logged on Linux user, so no password needed), and the database "db_app":

echo "create user $(whoami);
create database db_app with owner=$(whoami);
grant all on database db_app to $(whoami);
\q"  | sudo -u postgres psql

Create a database configuration file to describe your PostgreSQL database above:

echo "user=$(whoami) dbname=db_app" > db_app

Create database objects we'll need - users table for application users, and notes table to hold their notes:

echo "create table if not exists notes (dateOf timestamp, noteId bigserial primary key, userId bigint, note varchar(1000));
create table if not exists users (userId bigserial primary key, email varchar(100), hashed_pwd varchar(100), verified smallint, verify_token varchar(30), session varchar(100));
create unique index if not exists users1 on users (email);" | psql -d db_app

Create Golf application

Create application "notes" owned by your Linux user:

sudo mgrg -i -u $(whoami) notes

Source code

This executes before any other handler in an application, making sure all requests are authorized, file "before-handler.golf":

vi before-handler.golf

Copy and paste:

 before-handler
     set-param displayed_logout = false, is_logged_in = false
     call-handler "/session/check"
 end-before-handler

- Signup users, login, logout

This is a generic session management web service that handles user creation, verification, login and logout. Create file "session.golf":

vi session.golf

Copy and paste:

Golf 91 released

Fixed SELinux installation bug for Fedora-like distros.
Minimum number of CPUs available is 1 in case there's an issue in determining the number of them.
Fixed issue with highlighting call-handler statement in vim.

Tuesday, November 5, 2024

Golf 87 released

Added --exclude option to gg in order to exclude sub-directories from compilation.
Added p-source-line and p-source-file statements to aid in debugging.
A request handler can now be defined in any source file whose path matches, partially or fully, the request path of the handler. For instance /session/login request handler can be defined either in file "session.golf" or "session/login.golf". A source file can also have multiple request handlers that match path.
Added --single-file option to gg to force each source code file to have just a single request handler.
set-param and get-param statements now work with multiple parameters separated by a comma.
set-cookie and get-cookie statements now work with multiple cookies separated by a comma.
Fixed bug in end-write-string statement where junk text at the end would be ignored without an error message.
Fixed error in parallel compilation.
Fixed bug in get-param where parameter name wouldn't be correct.
Fixed bug where a function in call-extended statement wouldn't work without any parameters.
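The path-matching rule in the release notes above (a handler may live in any source file whose path matches the request path partially or fully) can be pictured with a short Python sketch. The function name is hypothetical, and treating every leading prefix of the path as a valid match is an assumed reading of "partially or fully", not a confirmed description of the compiler's lookup.

```python
def candidate_source_files(request_path: str) -> list[str]:
    """List source files that could define the handler for request_path."""
    parts = [p for p in request_path.split("/") if p]
    # Each leading prefix of the path, joined as directories, plus ".golf"
    return ["/".join(parts[:i]) + ".golf" for i in range(1, len(parts) + 1)]

print(candidate_source_files("/session/login"))
# ['session.golf', 'session/login.golf']
```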

Monday, October 28, 2024

Golf 76 released

Major enhancement: request handler source files can now be written in (sub)directories, such that the path leading to the request handler matches the request path. For instance, request handler "/session/user/new" will be implemented under directories "session", then "user", and then in file new.golf. This allows for clean and easy to organize structure of an application.
Fixed bug in parallel compilation, where reported file names in error messages were inaccurate.
Faster compilation by using soft links instead of copies where needed.
sub-handler statement has been renamed call-handler for better descriptiveness.
Added "public" and "private" clauses in begin-handler, to improve security features. "private" means that request handler cannot be called by an outside caller (formerly "sub-handler").
Added --public option in gg to toggle between request handlers being public or private by default (by default they are private).
Fixed spurious output from gg -m.
Better error message when trying to compile an application and it was not yet created.
p-path statement now must have a request path object, in order to make a more reliable construction of link URL paths, and to allow for static checking of requests, for faster design and development.
begin-handler statement now must start with a forward slash, which was up until now optional.

Web Services Security

The security of web services is a broad subject, but there are a few common topics that one should be well aware of. They affect whether your web application, or an API service, will be a success or not.

Many web services provide an interface to a database. SQL injection is a very old, but nonetheless important issue. A naive implementation of SQL execution inside your code could open the door for a malicious actor to drop your tables, insert/update/delete records etc. Be sure that your back-end software is SQL injection proof.
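A common defense is to pass user input as bound parameters instead of pasting it into the SQL text. The sketch below shows the idea with Python's built-in `sqlite3` module; the table and the input string are made up for the example, and it does not show how Golf itself handles query parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table stock (stock_name text primary key, stock_price integer)")
conn.execute("insert into stock values (?, ?)", ("ABC", 100))

# Input crafted to break out of a naively concatenated SQL string
malicious = "ABC'; drop table stock; --"

# The ? placeholder binds the input as data, so it is never parsed as SQL
rows = conn.execute(
    "select stock_price from stock where stock_name = ?", (malicious,)
).fetchall()

count = conn.execute("select count(*) from stock").fetchone()[0]
print(rows, count)  # [] 1
```

The malicious string simply fails to match any stock name, and the table is left intact.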

Golf 70 released

Enhanced compiler error message in if-true statement when conditional statement is missing or incomplete.
Fixed bug in p-path statement where it wouldn't work without new-line clause.
Fixed bug in set/get-param statements where it (rarely) may get stuck in wrong compiled code.
C style comment /*...*/ will cause an error at the beginning of statement only for readability and to avoid library detection issues.
Better error message when database config file is not found for a database.

Monday, October 21, 2024

Web services with MariaDB

Create a directory for your project, it'll be where this example takes place. Also create Golf application "stock":

mkdir -p stock-app
cd stock-app
sudo mgrg -i -u $(whoami) stock

Start MariaDB command line interface:

sudo mysql

Create an application user, database and a stock table (with stock name and price):

create user stock_user;
create database stock_db;
grant all privileges on stock_db.* to stock_user@localhost identified by 'stock_pwd';
use stock_db
create table if not exists stock (stock_name varchar(100) primary key, stock_price bigint);

Golf wants you to describe the database: the user name and password, database name, and the rest is the default setup for MariaDB database connection. So create a file "db_stock" (which is your database configuration file, one per each you use):

Golf 65 released

Added --parallel option to gg utility to enable multi-threaded compilation. Now making a large application can be multiple times faster, for instance 3-5 times faster on a typical laptop.
Added high-performance hash for parameters (see set-param, get-param), making them comparable to C's stack variables in performance.
Removed input-count, input-name, input-value clauses from get-req statement as they are superfluous now.
set-param enhanced by making =<value> optional. This is often used when parameter name and the variable assigned to it have the same name. It makes the code reading and writing cleaner.

Thursday, October 17, 2024

Web service calling web service

A web service doesn't necessarily need to be called from "the web", meaning from the web browser or via API across the web. It can be called from another web service that's on a local network.

Typically, when called from the web, the HTTPS protocol is used to ensure safety of that call. However, local networks are usually secure, meaning no one has access to them but your own web services.

Thus, communication between local web services will be much faster if it doesn't use a secure protocol, as that incurs a performance cost. Simple, fast and unburdened protocols, such as FastCGI, may be better.

FastCGI is interesting because it actually carries the same information as HTTP, so a web service can operate normally, using GET/POST/etc. request methods, passing parameters in URL, request body, environment variables etc. But at the same time, FastCGI is a fast binary protocol that doesn't incur cost of safety - and for local web-service to web-service communication, that's a good thing.

Unlike plain HTTP, FastCGI can separate standard output from standard error, allowing both streams to be intermixed but retrieved separately.
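The stream separation works because every FastCGI record starts with an 8-byte header that carries a record type, with separate types for standard output and standard error. The Python sketch below packs and unpacks such records; the header layout and the type numbers follow the FastCGI specification, while the helper names are hypothetical and the example skips everything else a real client needs (begin/end request records, parameters, and so on).

```python
import struct

FCGI_VERSION_1 = 1
FCGI_STDOUT = 6   # record type for standard output
FCGI_STDERR = 7   # record type for standard error
HEADER = "!BBHHBB"  # version, type, request id, content length, padding, reserved

def make_record(record_type: int, request_id: int, content: bytes) -> bytes:
    """Build one record: 8-byte header followed by the content (no padding)."""
    header = struct.pack(HEADER, FCGI_VERSION_1, record_type,
                         request_id, len(content), 0, 0)
    return header + content

def split_streams(data: bytes) -> tuple[bytes, bytes]:
    """Walk intermixed records and collect stdout and stderr separately."""
    out, err, pos = b"", b"", 0
    while pos < len(data):
        _, rtype, _, length, padding, _ = struct.unpack_from(HEADER, data, pos)
        pos += struct.calcsize(HEADER)
        body = data[pos:pos + length]
        pos += length + padding
        if rtype == FCGI_STDOUT:
            out += body
        elif rtype == FCGI_STDERR:
            err += body
    return out, err

wire = (make_record(FCGI_STDOUT, 1, b"Hello ")
        + make_record(FCGI_STDERR, 1, b"warning")
        + make_record(FCGI_STDOUT, 1, b"world"))
print(split_streams(wire))  # (b'Hello world', b'warning')
```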

Overall, inter-web-service communication can be implemented with the aforementioned protocols in a way that preserves high-level HTTP functionality innate to web services, but with overall better performance.

Monday, October 14, 2024

Golf 56 released

Fixed bug with before-handler and after-handler statements, where the names of files used to implement them (.golf files) were not correctly stated in the documentation.

Sunday, October 13, 2024

What is Web Service

Web service is code that responds to a request and provides a reply over HTTP protocol. It doesn't need to work over the web, despite its name. You can run a web service locally on a server or on a local network. You can even run a web service from command line. In fact, that's an easy way to test them.

The input comes from an HTTP request - this means via URL parameters plus (optional) request body.

The parameters could be in URL's path (such as "/a=b/c=d/...") or in its query string (such as "?a=b&c=d...") or both - this data is typically limited in size to 2KB. Additional parameters could be appended to the request body - this is for instance how files are uploaded in an HTML form.
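Both parameter styles can be handled by one small routine. The Python sketch below is an illustration only: `extract_params` is a hypothetical helper, and treating any path segment that contains "=" as a name/value pair is an assumption based on the "/a=b/c=d" form shown above.

```python
from urllib.parse import parse_qsl, unquote, urlsplit

def extract_params(url: str) -> dict[str, str]:
    """Collect name=value pairs from the URL path and from the query string."""
    parts = urlsplit(url)
    params: dict[str, str] = {}
    for segment in parts.path.split("/"):
        if "=" in segment:  # path-style parameter
            name, value = segment.split("=", 1)
            params[unquote(name)] = unquote(value)
    params.update(parse_qsl(parts.query))  # query-string parameters
    return params

print(extract_params("/app/handler/a=b/c=d?e=f&g=h"))
# {'a': 'b', 'c': 'd', 'e': 'f', 'g': 'h'}
```

In this sketch a query-string value overrides a path value of the same name; a real service would need to decide and document its own precedence.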

Request body itself can be any data of any size really - web services typically have an adjustable size limit for this data just to avoid mistakenly (or maliciously) huge ones. A request body could contain for example a JSON document, or some other kind of data.

The output of web service can be HTML code, JSON, XML, an image such as JPG or just about anything really. It's up to the caller of web service to interpret it. One such caller is web browser, another one could be API from an application etc.

What's the difference between a web application and a web service? Well, technically a web application should be a collection of web services, which are typically more basic service providers. That's why web services are often used as endpoints for remote APIs. They generally have a well defined input and output and are not too big. They serve a specialized purpose most of the time.

Friday, October 11, 2024

Cache as a web service

This example shows Apache as the front-end (or "reverse proxy") for cache server - it's assumed you've completed it first. Three steps to setting up Apache quickly:

Enable FastCGI proxy used to communicate with Golf services - this is one time only:
