Declarative Validation with Elixir

PSPDFKit Server is distributed as a Docker image, which is run by our customers on their own infrastructure. To support different modes of operation, it exposes a fairly extensive set of configuration options via environment variables.

Each environment variable value needs to be validated, and when it’s invalid, it needs to be categorized as one of the following:

  • error: the value must be changed, because it stops the application from working as expected.

  • warning: the value should be changed, but the application still works with it.

  • deprecation: the value works now but will stop working in a future version, so it will need to be changed.

We recently started looking into improving the application component responsible for these validations to provide more precise and actionable feedback depending on how an invalid value is categorized.

In this blog post, we’ll look at a possible implementation of such a validator, which is responsible for finding and categorizing issues.

Use Case

PSPDFKit Server exposes a management dashboard, which is protected behind a static username and password combination. Both values can be configured via environment variables.

Our constraints for these two values are:

  • they’re required, so they must be defined

  • they should be changed from their default values to provide better security

General Architecture

The validator will receive a key-value structure (e.g. a map), with configuration names and values, and it should return a data structure that collects errors, warnings, and deprecations.

Validation rules should be expressed with a terse domain-specific language that favors composition so that additional rules can be layered on over time with minimal effort.

Basic Validation

As a first attempt, we simply want to design a validate/1 function that accepts a map of configuration options and returns uncategorized errors (if present).

To follow along, you can create a file called config_validator.exs and run it with elixir config_validator.exs.

We can start by writing out how our API could look:

defmodule Config.Validator do
  @type reason :: String.t()
  @type result :: %{data: Enum.t(), errors: keyword([reason()])}

  @spec validate(Enum.t()) :: result()
  def validate(data) do
    result = %{
      data: data,
      errors: []
    }

    result
    |> validate_key(:username, [required()])
    |> validate_key(:password, [required()])
  end
end

%{}
|> Config.Validator.validate()
|> IO.inspect()

The implementation relies on a result data structure, which is passed through a series of functions, each of which then applies one or more validators to a specific key in the included data.

A validator’s responsibility is to either let the result through unmodified or add a specific element to the errors list.

This design pushes toward predictable extension paths:

  • To add a new validation to an existing key, add a new validator to the already existing list.

  • To add validations for a new key, pipe the result through a new validate_key/3 function call.

Implementing a validator requires defining a function that returns another function that can be used by validate_key/3.

We can implement required/0 as:

defp required do
  fn
    nil -> {:invalid, "is required"}
    _other -> :ok
  end
end

The returned function pattern matches on the received value: it returns an {:invalid, "is required"} tuple when the value is nil, and :ok for anything else.
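To see the shape of a validator in isolation, the same function can be bound to a variable and called directly. This is only an illustrative sketch: the variable `required` below is a stand-in for the value returned by the private required/0 function, not part of the module itself.

```elixir
# Stand-in for the function returned by required/0: it takes a single value
# and answers with either :ok or {:invalid, reason}.
required = fn
  nil -> {:invalid, "is required"}
  _other -> :ok
end

IO.inspect(required.(nil))
# {:invalid, "is required"}

IO.inspect(required.("admin"))
# :ok

# An empty string is not nil, so this particular check lets it through.
IO.inspect(required.(""))
# :ok
```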

The implementation of validate_key/3 requires finding the value of the specified key and then iterating over the passed validators, applying them to the value and adding relevant errors as needed:

defp validate_key(initial_result, key, validators) do
  value = get_in(initial_result, [:data, key])

  Enum.reduce(validators, initial_result, fn validator, result ->
    case validator.(value) do
      :ok ->
        result

      {:invalid, reason} ->
        update_in(result, [:errors, key], fn
          nil -> [reason]
          other_reasons -> [reason | other_reasons]
        end)
    end
  end)
end

Running our file, we should see the following output:

%{
  data: %{},
  errors: [password: ["is required"], username: ["is required"]]
}
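The key order in this output (password before username) follows from how update_in/3 treats the keyword list stored under :errors. The sketch below isolates that step. The `add_error` function is a hypothetical helper that mirrors the {:invalid, reason} branch of validate_key/3, and the reasons are placeholder strings.

```elixir
result = %{data: %{}, errors: []}

# Hypothetical helper: the same update_in/3 call used in validate_key/3.
add_error = fn result, key, reason ->
  update_in(result, [:errors, key], fn
    # The key is not in the keyword list yet: start a new list of reasons.
    nil -> [reason]
    # The key already has reasons: put the new one in front.
    other_reasons -> [reason | other_reasons]
  end)
end

step1 = add_error.(result, :username, "first")
step2 = add_error.(step1, :password, "second")
step3 = add_error.(step2, :username, "third")

IO.inspect(step3.errors)
# [password: ["second"], username: ["third", "first"]]
```

New keys end up at the front of the keyword list, and reasons for an existing key are prepended, so both levels read most-recent-first.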

Adding a New Validator

We mentioned before that the username and password should be changed from their defaults, so we can write a validator that detects the use of default values:

defp not_default(defaults) do
  fn value ->
    if value in defaults do
      {:invalid, "is a default and is not allowed"}
    else
      :ok
    end
  end
end

We could then use it as:

# snip
result
|> validate_key(:username, [required(), not_default(["username"])])
|> validate_key(:password, [required(), not_default(["secret", "password"])])
# snip

At this point, we can run:

%{username: "username"}
|> Config.Validator.validate()
|> IO.inspect()

And we should see:

%{
  data: %{username: "username"},
  errors: [
    password: ["is required"],
    username: ["is a default and is not allowed"]
  ]
}
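Further rules can follow the same pattern. As an illustration only, here is a sketch of a length check. `min_length/1` and the `ValidatorSketch` module are hypothetical names that aren't part of the validator described in this post, and the function is public here just so it can be called from a script.

```elixir
defmodule ValidatorSketch do
  # Hypothetical validator: rejects strings shorter than `min` characters.
  # Non-string values (including nil) pass, leaving presence checks to a
  # separate validator such as required/0.
  def min_length(min) do
    fn
      value when is_binary(value) ->
        if String.length(value) < min do
          {:invalid, "must be at least #{min} characters long"}
        else
          :ok
        end

      _other ->
        :ok
    end
  end
end

check = ValidatorSketch.min_length(8)

IO.inspect(check.("secret"))
# {:invalid, "must be at least 8 characters long"}

IO.inspect(check.("correct horse"))
# :ok

IO.inspect(check.(nil))
# :ok
```

Because each validator only answers for its own rule, it could be appended to the list passed for a key without touching the other validators.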

Categorizing Validations

To support different error categories, we can extend the result map and change validate_key to accept a category as an additional argument, making it validate_key/4. For example, to support warnings:

def validate(data) do
  acc = %{
    data: data,
    errors: [],
    warnings: []
  }

  acc
  |> validate_key(:errors, :username, [required()])
  |> validate_key(:warnings, :username, [not_default(["username"])])
  |> validate_key(:errors, :password, [required()])
  |> validate_key(:warnings, :password, [not_default(["secret"])])
end

defp validate_key(initial_acc, category, key, validators) do
  value = get_in(initial_acc, [:data, key])

  Enum.reduce(validators, initial_acc, fn validator, acc ->
    case validator.(value) do
      :ok ->
        acc

      {:invalid, reason} ->
        update_in(acc, [category, key], fn
          nil -> [reason]
          other_reasons -> [reason | other_reasons]
        end)
    end
  end)
end

To streamline the implementation in validate/1, we can create specialized validate_as_error/3 and validate_as_warning/3 functions, which would internally dispatch to validate_key/4 as needed:

defp validate_as_error(initial_acc, key, validators) do
  validate_key(initial_acc, :errors, key, validators)
end

defp validate_as_warning(initial_acc, key, validators) do
  validate_key(initial_acc, :warnings, key, validators)
end

Another implementation route (which is left as an exercise to the reader) is to change the validators argument from a plain list to a keyword list, where each key is a category and each value is the list of validation functions for that category.
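One possible shape for that keyword-list route is sketched below. It's a sketch under assumptions rather than a reference solution: the module name `Config.KeywordValidatorSketch` is made up for this example, and the `for` comprehension with `reduce:` needs Elixir 1.8 or later.

```elixir
defmodule Config.KeywordValidatorSketch do
  def validate(data) do
    acc = %{data: data, errors: [], warnings: []}

    acc
    |> validate_key(:username, errors: [required()], warnings: [not_default(["username"])])
    |> validate_key(:password, errors: [required()], warnings: [not_default(["secret"])])
  end

  # validators_by_category is a keyword list such as
  # [errors: [validator, ...], warnings: [validator, ...]].
  defp validate_key(initial_acc, key, validators_by_category) do
    value = get_in(initial_acc, [:data, key])

    for {category, validators} <- validators_by_category,
        validator <- validators,
        reduce: initial_acc do
      acc ->
        case validator.(value) do
          :ok ->
            acc

          {:invalid, reason} ->
            update_in(acc, [category, key], fn
              nil -> [reason]
              other_reasons -> [reason | other_reasons]
            end)
        end
    end
  end

  defp required do
    fn
      nil -> {:invalid, "is required"}
      _other -> :ok
    end
  end

  defp not_default(defaults) do
    fn value ->
      if value in defaults, do: {:invalid, "is a default and is not allowed"}, else: :ok
    end
  end
end

%{username: "username"}
|> Config.KeywordValidatorSketch.validate()
|> IO.inspect()
```

With this shape, all rules for a key sit in a single call, at the cost of a slightly more involved validate_key/3.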

Whichever implementation we choose, validating %{username: "username"} should return the following:

%{
  data: %{username: "username"},
  errors: [password: ["is required"]],
  warnings: [username: ["is a default and is not allowed"]]
}

Richer Error Messages

Validation errors should return a more structured reason so that they can be consumed in different ways:

  • Errors should be logged to the console, potentially as a summary followed by a longer explanation of how to fix them.

  • Warnings can be displayed in the dashboard with a short description and then expanded into a long description.

For example, the required/0 validator can be extended as:

defp required do
  reason = %{
    type: :required,
    short_desc: "The value is required, please include it in your configuration",
    long_desc: """
    The value is required as PSPDFKit Server is not able to use a default value.
    Please update your configuration and try again.
    """
  }
  fn
    nil -> {:invalid, reason}
    _other -> :ok
  end
end

As we rely on the shape of this error, it also makes sense to define a struct for it rather than use a literal map:

defmodule Config.Validator.Error do
  defstruct [:type, :short_desc, :long_desc]
end
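To illustrate how a structured reason could be consumed, here is a minimal sketch that renders the same error in a short and a long form. The `Config.ReportSketch` module and its `summary/2` and `details/2` functions are hypothetical and only show one way of using the struct's fields.

```elixir
defmodule Config.Validator.Error do
  defstruct [:type, :short_desc, :long_desc]
end

defmodule Config.ReportSketch do
  alias Config.Validator.Error

  # Hypothetical: a single line, e.g. for a collapsed entry in the dashboard.
  def summary(key, %Error{short_desc: short_desc}), do: "#{key}: #{short_desc}"

  # Hypothetical: the summary followed by the longer explanation, e.g. for console logs.
  def details(key, %Error{} = error) do
    summary(key, error) <> "\n" <> String.trim(error.long_desc)
  end
end

error = %Config.Validator.Error{
  type: :required,
  short_desc: "The value is required, please include it in your configuration",
  long_desc: """
  The value is required as PSPDFKit Server is not able to use a default value.
  Please update your configuration and try again.
  """
}

IO.puts(Config.ReportSketch.summary(:password, error))
IO.puts(Config.ReportSketch.details(:password, error))
```

Matching on %Error{} in the function heads means that a reason with a different shape fails loudly instead of being rendered incorrectly.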

Conclusion

In this blog post, we looked at how it’s possible to design and implement a completely custom validation system — starting from a few carefully chosen primitives — in order to have maximum flexibility on how errors are reported to the end user.

It’s worth mentioning that the Elixir ecosystem already offers quality data validation libraries with distinctive feature sets which may fit your use case. Some examples include Ecto (via Ecto.Changeset and Ecto.Schema), Saul, and Norm. As always, some initial research before writing any code can help you make a well-informed decision on how to proceed.
