
Using the JSON Configuration Format with a Schema Definition in an Elixir Release

Bartosz Szafran

Elixir releases are a convenient way of shipping Elixir applications. They not only allow users to create a zero-dependency artifact that’s easy to distribute, but they also provide an easy-to-use interface for controlling its lifecycle (starting, stopping, checking status).

Additionally, using releases means that managing an Elixir application doesn’t require knowledge of its ecosystem, so it shouldn’t be a surprise that Elixir releases are the default way of distributing an Elixir application. However, by default, the only supported configuration format is Elixir code, which might be cumbersome to use.

Elixir Configuration Format

The configuration file is an Elixir script that sets the options for the application:

import Config

config :my_app, my_option: :value

This is a powerful and expressive way of specifying a configuration, because it allows users to take advantage of all of Elixir’s language features, like so:

import Config

allowed_hosts =
  "ALLOWED_VALUES"
  |> System.fetch_env!()
  |> String.split(";")
  |> Enum.map(fn string -> String.to_integer(string) end)

config :my_app,
  allowed_hosts: allowed_hosts

The above example shows how to read a value from an environment variable, split it on “;”, and convert each resulting string to an integer.

With this format, configuration values don’t have to be parsed or validated inside the application’s modules.

The configuration script itself can parse or map the values when the application starts, and it can refuse to start the application if a value isn’t valid. However, writing it requires knowledge of Elixir and its syntax, which might be problematic for inexperienced users. Also, rich code snippets can obscure the structure of the configuration, which makes defining the configuration more difficult.
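As a rough illustration of refusing to start on invalid input, the parsing from the example above could be moved into a small helper that raises with a readable message. The module and function names below are hypothetical and aren’t part of Elixir’s Config module.

```elixir
defmodule ConfigHelpers do
  # Hypothetical helper: splits a ";"-separated string and converts every
  # piece to an integer. Raises ArgumentError on the first piece that isn't
  # a complete integer, so a bad value is noticed as soon as it's parsed.
  def parse_integer_list!(raw) when is_binary(raw) do
    raw
    |> String.split(";", trim: true)
    |> Enum.map(fn piece ->
      case Integer.parse(piece) do
        {int, ""} -> int
        _ -> raise ArgumentError, "expected an integer, got: #{inspect(piece)}"
      end
    end)
  end
end
```

Compared to String.to_integer/1, this variant names the offending piece in the error message, which is usually the first thing an operator wants to know.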

Configuration Provider

This problem is well known, and there’s an existing mechanism for supporting other configuration formats: the :config_providers release option. It allows us to plug our own module into configuration initialization. The module needs to implement the callbacks of the Config.Provider behaviour, which are init/1 and load/2.

The first of these, init/1, receives the options given to the provider and returns a state that’s handed to load/2. The second one, load/2, does the actual preparation of the configuration. It receives the existing configuration (for example, what was set in the releases.exs file) as the first parameter and the result of init/1 as the second parameter, and it returns the updated configuration.

The Config.Provider module documentation gives the example of reading the configuration from the JSON file and using it as our configuration.
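A minimal sketch of such a provider might look like the following. It’s modeled on the example in the Config.Provider documentation, with two assumptions: it uses the JSON module that ships with Elixir 1.18 and later (on older versions, a library such as Jason would take its place), and the my_option key is only an illustration.

```elixir
defmodule JsonConfigProvider do
  @behaviour Config.Provider

  # The provider's argument is the path to the JSON file.
  # Whatever init/1 returns is passed to load/2 as the second argument.
  @impl true
  def init(path) when is_binary(path), do: path

  # Reads the JSON file and merges one value into the existing configuration.
  @impl true
  def load(config, path) do
    # Assumption: Elixir 1.18+, where JSON.decode!/1 is built in.
    json = path |> File.read!() |> JSON.decode!()

    Config.Reader.merge(config, my_app: [my_option: json["my_option"]])
  end
end
```

In mix.exs, a provider like this would be registered under the release’s :config_providers option, e.g. config_providers: [{JsonConfigProvider, "/etc/my_app/config.json"}], where the path is just an example.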

JSON-Based Configuration

The JSON-based configuration format has the benefit of being widely understood, which isn’t true of the Elixir format. Together with other features of Elixir releases, it completely eliminates the requirement of being familiar with Elixir to manage and configure an Elixir application.

Unfortunately, a non-Elixir format doesn’t allow users to transform configuration options as they would in an Elixir configuration file. What’s more, it doesn’t even allow us to express some Elixir primitives as configuration option values.

In Elixir, a term refers to a value. The most problematic terms are:

  • atoms: How do we decide if a value should be an atom or a string?

  • charlists: Just like with atoms, how do we differentiate them from strings?

  • maps and keyword lists: These are both represented in JSON by an object. Which should be favored during conversion?
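To make the ambiguity concrete, the sketch below shows how one decoded JSON string, or one decoded JSON object, can legitimately become several different Elixir terms. The module and function names are made up for this illustration.

```elixir
defmodule TermChoices do
  # One decoded JSON string, three possible Elixir terms.
  def as_string(value) when is_binary(value), do: value
  # Note: String.to_atom/1 creates atoms dynamically, so it should only be
  # used on trusted input.
  def as_atom(value) when is_binary(value), do: String.to_atom(value)
  def as_charlist(value) when is_binary(value), do: String.to_charlist(value)

  # A decoded JSON object is a string-keyed map. It can become an
  # atom-keyed map...
  def as_atom_keyed_map(object) when is_map(object) do
    Map.new(object, fn {key, value} -> {String.to_atom(key), value} end)
  end

  # ...or a keyword list (sorted here only to get a predictable order).
  def as_keyword(object) when is_map(object) do
    object
    |> Enum.map(fn {key, value} -> {String.to_atom(key), value} end)
    |> Enum.sort()
  end
end
```

Nothing in the JSON document itself says which of these conversions the consuming library expects, so that information has to come from somewhere else.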

But there are also other issues. For example, it isn’t possible to easily parse the configuration input, e.g. to use URI.parse/1 to convert a URI string into a struct. As a result, the developer can’t trust the configuration options and has to verify and parse the input wherever it’s used. Sometimes even that isn’t possible, because the configuration option is consumed by a third-party library.

The Solution

At PSPDFKit, we had a problem with passing the configuration to the production environment, which is a Docker container in our case. Our initial solution was to pass environment variables; however, that approach wasn’t flexible enough.

We addressed this by adding a schema for configuration options. The idea is simple: The configuration is still defined as a JSON object, but our custom Config.Provider verifies the structure of the configuration and transforms it when necessary.

Let’s have a look at the sample schema:

@schema %{
  my_app: %{
    # Same atom as the MYAPP.Repo module alias.
    :"Elixir.MYAPP.Repo" => %{
      parse: :keyword,
      fields: %{
        username: %{parse: :string},
        password: %{parse: :string},
        database: %{parse: :string},
        hostname: %{parse: :string},
        port: %{parse: :integer},
        ssl_opts: %{
          parse: :keyword,
          fields: %{
            cacerts: %{env_var: "CACERTS", parse: &__MODULE__.cacerts/1},
            verify: %{parse: :atom},
            server_name_indication: %{parse: :charlist}
          }
        }
      }
    }
  }
}

# Decodes PEM text into a list of DER-encoded certificates.
@spec cacerts(String.t()) :: [der_or_encrypted_der :: binary()]
def cacerts(cacerts) do
  cacerts
  |> :public_key.pem_decode()
  |> Enum.map(fn {_, der, _} -> der end)
end

And here’s the corresponding configuration:

{
  "my_app": {
    "Elixir.MYAPP.Repo": {
      "username": "postgres",
      "password": "postgres",
      "database": "postgres",
      "hostname": "localhost",
      "port": 5432,
      "ssl_opts": {
        "cacerts": "-----BEGIN CERTIFICATE----- (... etc)",
        "verify": "verify_peer",
        "server_name_indication": "localhost"
      }
    }
  }
}

The configuration schema is a map that defines how the configuration should be parsed, validated, or even populated. Each root key of the schema map is an application, and each key below it is one of that application’s configuration options. Each configuration option is in turn described by another map that specifies how its value is handled.

Schema Definition Options

Now, let’s look at the available options:

%{parse: :string}
%{parse: :integer} # etc...

The configuration option value is a scalar primitive value. This instructs the configuration parser to check the type and convert the value if necessary, e.g. the string "1" to the integer 1.
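How the scalar rules might be implemented isn’t shown in this post, so the following is only a sketch under assumptions: the module name, the function name, and the shape of the return values are all hypothetical.

```elixir
defmodule ScalarSketch do
  # Hypothetical sketch: returns {:ok, value} or {:error, reason}.
  def parse(%{parse: :string}, value) when is_binary(value), do: {:ok, value}

  # An integer is accepted as is...
  def parse(%{parse: :integer}, value) when is_integer(value), do: {:ok, value}

  # ...and a string is converted, as long as all of it is a valid integer.
  def parse(%{parse: :integer}, value) when is_binary(value) do
    case Integer.parse(value) do
      {int, ""} -> {:ok, int}
      _ -> {:error, {:not_an_integer, value}}
    end
  end

  def parse(%{parse: :atom}, value) when is_binary(value), do: {:ok, String.to_atom(value)}
  def parse(%{parse: :charlist}, value) when is_binary(value), do: {:ok, String.to_charlist(value)}

  # Anything else doesn't match the declared type.
  def parse(%{parse: type}, value), do: {:error, {:unexpected_value, type, value}}
end
```

With this sketch, both ScalarSketch.parse(%{parse: :integer}, "1") and ScalarSketch.parse(%{parse: :integer}, 1) yield {:ok, 1}, while a non-numeric string is reported as an error.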

%{
  parse: :keyword, # or :map
  fields: %{
    field_name: %{parse: :string}
  }
}

The configuration option value is an object that’s converted to a keyword list. If :map is specified instead, it works similarly, but the resulting configuration option value is a map. The fields value contains the configuration schema for the child options.
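The recursion through fields can be sketched as follows. This is an illustration rather than our actual implementation: the names are hypothetical, only two scalar types are handled, and invalid input simply raises to keep the example short.

```elixir
defmodule NestedSketch do
  # A decoded JSON object becomes a keyword list...
  def parse(%{parse: :keyword, fields: fields}, object) when is_map(object) do
    object |> parse_fields(fields) |> Enum.sort()
  end

  # ...or an atom-keyed map, depending on the schema.
  def parse(%{parse: :map, fields: fields}, object) when is_map(object) do
    object |> parse_fields(fields) |> Map.new()
  end

  # Minimal scalar handling so the sketch is self-contained.
  def parse(%{parse: :string}, value) when is_binary(value), do: value
  def parse(%{parse: :integer}, value) when is_integer(value), do: value

  # Walks the fields declared in the schema. Keys that the schema doesn't
  # declare are ignored, and a declared key that's missing raises KeyError.
  defp parse_fields(object, fields) do
    for {name, field_schema} <- fields do
      {name, parse(field_schema, Map.fetch!(object, Atom.to_string(name)))}
    end
  end
end
```

Because the atom keys come from the schema and not from the input, a sketch like this never has to create atoms from untrusted strings.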

%{
  parse: :list,
  items: %{parse: :string}
}

This expects a list in the configuration. items specifies the configuration schema for the elements of the list, so the above example expects a list of strings.
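One possible way to handle lists is to apply the items schema to every element and report the index of the first element that fails. The sketch below assumes hypothetical names and error shapes, and it only knows about strings as a leaf type.

```elixir
defmodule ListSketch do
  # Returns {:ok, parsed_list} or {:error, {index, reason}} for the first
  # element that doesn't satisfy the items schema.
  def parse(%{parse: :list, items: item_schema}, values) when is_list(values) do
    values
    |> Enum.with_index()
    |> Enum.reduce_while({:ok, []}, fn {value, index}, {:ok, acc} ->
      case parse(item_schema, value) do
        {:ok, parsed} -> {:cont, {:ok, [parsed | acc]}}
        {:error, reason} -> {:halt, {:error, {index, reason}}}
      end
    end)
    |> case do
      {:ok, reversed} -> {:ok, Enum.reverse(reversed)}
      error -> error
    end
  end

  # Minimal leaf type so the sketch is self-contained.
  def parse(%{parse: :string}, value) when is_binary(value), do: {:ok, value}

  # Anything else doesn't match the declared type.
  def parse(%{parse: type}, value), do: {:error, {:expected, type, value}}
end
```

Since items is itself a schema, the same function also covers lists of lists without any extra code.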

%{
  parse: fn x ->
    bytes = String.to_integer(x)
    :crypto.strong_rand_bytes(bytes)
  end
}

Sometimes the above options aren’t expressive enough. In such a case, it’s possible to pass a function as the parse value in the configuration schema. This allows us to define any custom validation or transformation of the input.

The example above shows one case where this is useful: Let’s say one of the libraries requires variable-length binary data as a configuration option value, e.g. a salt for a hashing function.

It isn’t possible to express this in plain text, so a transformation function addresses such a use case.
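Supporting a function as the parse value needs very little code. In the following sketch, which uses hypothetical names, the function is applied to the raw value and any exception it raises is turned into an error tuple. Whether a real implementation should rescue or let the exception propagate is a design choice that isn’t covered here.

```elixir
defmodule FunctionSketch do
  # A one-argument function under :parse is applied to the raw value.
  # Exceptions raised by the function become {:error, message}.
  def parse(%{parse: fun}, value) when is_function(fun, 1) do
    {:ok, fun.(value)}
  rescue
    exception -> {:error, Exception.message(exception)}
  end
end
```

For example, FunctionSketch.parse(%{parse: &URI.parse/1}, "https://example.com:8443") returns {:ok, uri}, where uri is a %URI{} struct, so the application receives an already parsed value.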

%{parse: :string, env_var: "ENV_VAR_NAME"}

Sometimes, it’s more convenient to retrieve the configuration option from an environment variable. Adding the env_var key instructs the validator to read the value from there instead of expecting it in the provided configuration.
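The lookup step could be sketched like this, again with hypothetical names: before a value is parsed, the schema decides whether the raw value comes from an environment variable or from the decoded JSON object.

```elixir
defmodule EnvVarSketch do
  # Picks the raw value for one option. Both clauses return {:ok, value}
  # or :error, so the caller can treat the two sources the same way.

  # The schema names an environment variable: read it from there.
  def raw_value(%{env_var: name}, _object, _key) when is_binary(name) do
    System.fetch_env(name)
  end

  # Otherwise, expect the value in the decoded JSON object.
  def raw_value(_schema, object, key), do: Map.fetch(object, key)
end
```

The result would then go through the same parse step as any other value, so an option like %{parse: :integer, env_var: "PORT"} still ends up as an integer.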

%{parse: :string, optional?: true}

Adding optional?: true marks the configuration option value as optional, so no error is raised if the configuration doesn’t specify it. In that case, the configuration option won’t be present in the application configuration.

%{parse: :string, optional?: true, default: ""}

It’s also possible to define a default value for optional options. Now, if the configuration option value is missing, the default one will be used for the application configuration.
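The three possible outcomes for a missing value (use the default, leave the option out, or fail) can be sketched as below. The names and return values are hypothetical, and the sketch assumes that a default is used as is, without being passed through parse.

```elixir
defmodule OptionalSketch do
  # Returns {:ok, value}, :skip (leave the option out of the application
  # configuration), or {:error, reason}.
  def resolve(schema, object, key) do
    case Map.fetch(object, key) do
      {:ok, value} -> {:ok, value}
      :error -> missing(schema, key)
    end
  end

  # Optional with a default: the default stands in for the missing value.
  defp missing(%{optional?: true, default: default}, _key), do: {:ok, default}

  # Optional without a default: the option is left out.
  defp missing(%{optional?: true}, _key), do: :skip

  # Required: a missing value is an error.
  defp missing(_schema, key), do: {:error, {:missing_option, key}}
end
```

Returning :skip instead of {:ok, nil} keeps “not configured” distinct from “configured as nil”, which matters for libraries that check whether a key is present.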

Summary

This post discussed the advantages and disadvantages of the default Elixir configuration format and of a plain JSON configuration, and it presented a schema-based approach to reading configurations as a way to get the benefits of both.

Author
Bartosz Szafran, Server and Services Engineer

Bartosz is a software engineer primarily interested in technologies pertaining to Erlang VM and distributed and large-scale systems. He’s also a functional programming enthusiast. In his free time, Bartosz enjoys spending time in nature and eating pierogi.
