Generating Test Data in Phoenix application using ExMachina & Faker

Why factories are useful in testing, and how to use ExMachina and Faker to set up test data in an Elixir Phoenix project.
Long Nguyen
December 03, 2021
Web

When jumping from the Ruby/Rails world to Elixir and Phoenix, one may wonder how to generate test data.

In a Ruby on Rails application, we typically use external dependencies, called gems, such as Fabricator and FactoryBot. Luckily, there is a similar approach in Elixir. We can use hex packages such as ExMachina. Let’s find out how we can use it.

First things first, why use factories?

When writing tests, setting up test data is always needed. As an example, let’s assume that we have a User schema with three attributes: name, email and birthday.

defmodule MyApp.UserManager.User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :name, :string
    field :email, :string
    field :birthday, :date
  end

  def changeset(user, attrs) do
    user
    |> cast(attrs, [:name, :email, :birthday])
    |> validate_required([:name, :email])
  end
end

defmodule MyApp.UserManager do
  import Ecto.Query, warn: false
  alias MyApp.Repo
  alias MyApp.UserManager.User

  def get_user_by_email(email) do
    Repo.get_by(User, email: email)
  end
end

We could write a test like the following:

describe "get_user_by_email/1" do
  test "given a user email, returns the user" do
    user_attrs = %{name: "John Doe", email: "john.doe@example.com"}

    {:ok, user} =
      %User{}
      |> User.changeset(user_attrs)
      |> Repo.insert()

    created_user = UserManager.get_user_by_email(user.email)
    assert created_user.email == "john.doe@example.com"
  end
end

Test data is generated by hardcoding the user_attrs map and using Repo.insert.

For other test cases, when there is no need to insert data into the database, Repo.insert can be omitted. Assertions are thus against the changeset.

For both types of test, this approach has the following drawbacks:

  • Irrelevant test data: in the test above, we only care about the user email.

    However, due to the validation rules (both the name and email attributes are required), we need to specify the name attribute in the test data. That gets tedious when we have many validations and constraints; we need to specify the attributes to fulfill those rules, even though the test has nothing to do with those attributes.

  • Coupling: imagine if we need to change a validation rule.

    For example, in our case, let’s assume the birthday attribute is now required. We would then need to update all tests to follow that change.

What if the setup contained just the right amount of information: only the data that affects the expectation, and none of the irrelevant data?

As you may already guess, test factories are introduced to achieve that very purpose.

In fact, factories are not a test-specific concept: the factory is a general design pattern that happens to come in handy for testing. Similar to a real-world factory, a test factory is responsible for “manufacturing” data, hence the term. The idea is that a factory provides convenient methods that define groups of data with default values sufficient to create a valid record. In the tests, we then explicitly override only the attributes that are pertinent to the test.

For example, in our test case above, we can generate the test data by overriding only the email attribute, leaving the irrelevant data out of scope:

user = insert(:user, email: "john.doe@example.com")

That sounds good, so tell me, how can I use factories in Elixir?

In the Elixir world, ExMachina is a popular test factory generator. To use it in a Phoenix project, as always, it first needs to be added as a dependency:

def deps do
  [
    ...
    {:ex_machina, "~> 2.7.0", only: :test}
  ]
end

Next, add this line to test/test_helper.exs before ExUnit.start:

{:ok, _} = Application.ensure_all_started(:ex_machina)

Then let’s create a file at test/support/factory.ex to define our first factory:

defmodule MyApp.Factory do
  use ExMachina.Ecto, repo: MyApp.Repo

  def user_factory do
    %MyApp.UserManager.User{
      name: "John Doe",
      email: sequence(:email, &"email-#{&1}@example.com")
    }
  end

  def adult_user_factory do
    %MyApp.UserManager.User{
      name: "John Doe",
      email: sequence(:email, &"email-#{&1}@example.com"),
      birthday: Date.utc_today() |> Date.add(-365 * 18)
    }
  end
end
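
The sequence helper used for the email attribute is what keeps each generated value distinct. As a rough illustration of the idea only (this is not ExMachina's source; the SequenceSketch module and its next/2 function are hypothetical names), a per-name counter can be kept in an Agent and passed to a formatter function:

```elixir
defmodule SequenceSketch do
  # Hypothetical stand-in for a `sequence`-style helper: keeps one counter
  # per name in an Agent and passes the current value to a formatter.
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Returns formatter.(n). In this sketch n starts at 0 and increases by
  # one on every call made with the same `name`.
  def next(name, formatter) when is_function(formatter, 1) do
    n =
      Agent.get_and_update(__MODULE__, fn counters ->
        current = Map.get(counters, name, 0)
        {current, Map.put(counters, name, current + 1)}
      end)

    formatter.(n)
  end
end

{:ok, _pid} = SequenceSketch.start_link()

first = SequenceSketch.next(:email, &"email-#{&1}@example.com")
second = SequenceSketch.next(:email, &"email-#{&1}@example.com")
```

Because the counter lives outside the factory function, two calls with the same name never return the same string, which is what makes the helper handy for columns with a unique constraint.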

We can now generate a test user in our tests in different ways:

# `build` returns an unsaved user.
user = build(:user)
adult = build(:adult_user)

# override the default email
user = build(:user, %{email: "john.doe@example.com"})

# `insert` returns the user after saving to the database
user = insert(:user)
users = insert_list(3, :user, %{email: "john.doe@example.com"})

# `params_for` returns a plain map without any Ecto-specific attributes.
params_for(:user)

The official doc provides much more information.
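
To make the naming convention less magical, here is a minimal sketch of how a call such as build(:user) could be mapped onto a user_factory function and combined with overrides. It is a simplified stand-in under stated assumptions, not ExMachina's actual implementation: FactorySketch and its nested User struct are hypothetical, and dropping nil values in params_for is an assumption of this sketch.

```elixir
defmodule FactorySketch do
  # Hypothetical stand-in for a factory module; not ExMachina's source.
  defmodule User do
    # Plain struct used instead of an Ecto schema so the sketch runs alone.
    defstruct [:name, :email, :birthday]
  end

  def user_factory do
    %User{name: "John Doe", email: "john.doe@example.com"}
  end

  # build(:user) looks up user_factory/0 by name, then applies overrides.
  # struct!/2 raises if an override key does not exist on the struct.
  def build(factory_name, attrs \\ %{}) do
    function_name = String.to_atom("#{factory_name}_factory")

    __MODULE__
    |> apply(function_name, [])
    |> struct!(attrs)
  end

  # params_for/2 builds the struct, converts it to a plain map and
  # (an assumption of this sketch) drops attributes that are nil.
  def params_for(factory_name, attrs \\ %{}) do
    factory_name
    |> build(attrs)
    |> Map.from_struct()
    |> Enum.reject(fn {_key, value} -> is_nil(value) end)
    |> Map.new()
  end
end

user = FactorySketch.build(:user, email: "jane@example.com")
params = FactorySketch.params_for(:user)
```

The point of the sketch is that the defaults live in one place, while each call site states only the attributes it cares about.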

With these simple factory methods, we can generate only the relevant data in our test. So our test can now be updated to:

describe "get_user_by_email/1" do
  test "given a user email, returns the user" do
    user = insert(:user, email: "john.doe@example.com")

    created_user = UserManager.get_user_by_email(user.email)
    assert created_user.email == "john.doe@example.com"
  end
end

No more irrelevant data 🎉

Even if our validation constraints change, for example if the birthday attribute becomes required, we can simply add that attribute to the user_factory method instead of changing the test data in every test.

Organizing factories

If we keep defining more factory methods in that test/support/factory.ex file, it can quickly get large and hard to manage. To get over that, we can split the factories into multiple files.

Let’s start with our UserFactory.

# test/support/factory.ex
defmodule MyApp.Factory do
  use ExMachina.Ecto, repo: MyApp.Repo

  # Define your factories in test/support/factories and declare them here.
  use MyApp.UserFactory
end

# test/support/factories/user_factory.ex
defmodule MyApp.UserFactory do
  alias MyApp.UserManager.User

  defmacro __using__(_opts) do
    quote do
      def user_factory do
        %User{
          name: "John Doe",
          email: sequence(:email, &"email-#{&1}@example.com")
        }
      end
    
      # other user factory methods
    end
  end
end

That piece of code may look a bit complicated so let’s find out what it does.

First, we split all of our user factory methods into a new UserFactory module. We then declare that UserFactory in the main MyApp.Factory module on this line:

use MyApp.UserFactory

# that line is roughly equivalent to
require MyApp.UserFactory
MyApp.UserFactory.__using__([])

It means that the line use MyApp.UserFactory requires the UserFactory module and then calls that module’s __using__ macro. That’s the behavior of the use macro (read more in the official Elixir doc).

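Inside the __using__ macro, we define the factory methods within a quote block. quote returns the underlying representation (the AST) of the Elixir code instead of executing it, and use injects that representation into the calling module. As a result, user_factory ends up defined in MyApp.Factory, just as if it had been written there directly.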
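
The mechanism can be tried in isolation, without ExMachina. In the following minimal sketch, GreeterFactory and MainFactory are hypothetical module names chosen for illustration; the function defined inside quote becomes a function of the module that calls use:

```elixir
defmodule GreeterFactory do
  # Hypothetical module: __using__/1 returns quoted code that the
  # compiler injects into whichever module calls `use GreeterFactory`.
  defmacro __using__(_opts) do
    quote do
      def greeting_factory do
        %{text: "hello"}
      end
    end
  end
end

defmodule MainFactory do
  # Injects greeting_factory/0 into MainFactory at compile time.
  use GreeterFactory
end

greeting = MainFactory.greeting_factory()
```

Note that greeting_factory/0 exists on MainFactory only; GreeterFactory itself exposes nothing but the macro.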

All good? We now have well-organized factories to generate test data. But can we go a step further? You may notice that our test data is not unique:

def user_factory do
  %User{
    name: "John Doe",
    email: sequence(:email, &"email-#{&1}@example.com")
  }
end

The generated users always have the same name. We could work around this by generating a random string for the name attribute, but that gets tedious as the number of attributes grows, since each of them may need a different way to generate random data. How about a more elegant way to generate the value for each attribute? That’s where Faker comes into play.
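
To give a feel for that manual workaround, here is a small sketch using only the standard library. The ManualRandomData module and its helper names are hypothetical, and each attribute type needs its own hand-written generator:

```elixir
defmodule ManualRandomData do
  # Hypothetical helpers: one hand-written generator per kind of attribute.

  # Random string of lowercase letters, e.g. for a name.
  def random_string(length \\ 8) do
    for _ <- 1..length, into: "", do: <<Enum.random(?a..?z)>>
  end

  # Random local part on a fixed example domain.
  def random_email do
    "#{random_string(6)}@example.com"
  end

  # Random date between 1 and max_days_ago days before today.
  def random_past_date(max_days_ago \\ 365 * 40) do
    Date.add(Date.utc_today(), -Enum.random(1..max_days_ago))
  end
end

name = ManualRandomData.random_string()
email = ManualRandomData.random_email()
birthday = ManualRandomData.random_past_date()
```

The values are unique enough, but they do not look like real names or addresses, and the list of helpers grows with every new attribute type.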

A short introduction to Faker

Faker is a pure Elixir library for generating fake data.

As usual, we need to define it as a dependency before using it:

def deps do
  [
    ...
    {:ex_machina, "~> 2.7.0", only: :test},
    {:faker, "~> 0.16", only: :test}
  ]
end

Then it can be used in the factories:

def user_factory do
  %User{
    name: Faker.Person.name(),
    email: Faker.Internet.email(),
    birthday: Faker.Date.between(~D[1980-01-01], Date.utc_today())
  }
end

No changes in the way we use our factories:

# `build` returns an unsaved user.
user = build(:user)
# e.g. %User{name: "Mike Burns", email: "mike.burns@example.com", birthday: ~D[1992-03-25]}

# override the default email
user = build(:user, %{email: "john.doe@example.com"})
# e.g. %User{name: "Danny Kirlin", email: "john.doe@example.com", birthday: ~D[1982-10-10]}

It looks way more elegant and consistent, right? The generated data also looks more realistic.

Check out the Faker doc for more in-depth guides & examples.

Wrapping up

Testing is beneficial, and we all know it, but many developers are reluctant to write tests because setting up test data is tedious. With the right tools, we can alleviate that pain and maybe even make it a joy. But as with any tool, use test factories wisely: they exist to solve specific problems.

💡 If you want a simpler way to define test factories without relying on a third-party library like ExMachina, you can implement such methods with plain Ecto. Read more in Ecto’s Test factories guide.
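
A dependency-free factory in that spirit might look like the sketch below. It is an illustration under stated assumptions rather than the guide's exact code: PlainFactory is a hypothetical name, and a plain struct stands in for the Ecto schema so the example runs on its own.

```elixir
defmodule PlainFactory do
  defmodule User do
    # Plain struct standing in for the Ecto schema in this sketch.
    defstruct [:name, :email, :birthday]
  end

  # One build/1 clause per factory, selected by pattern matching.
  def build(:user) do
    %User{
      name: "John Doe",
      email: "user-#{System.unique_integer([:positive])}@example.com"
    }
  end

  def build(:adult_user) do
    %{build(:user) | birthday: Date.add(Date.utc_today(), -365 * 18)}
  end

  # build/2 applies per-test overrides on top of the defaults.
  def build(factory_name, attrs) do
    factory_name |> build() |> struct!(attrs)
  end

  # In a real project, an insert!/2 function would pipe the built struct
  # into the application's repo (for example MyApp.Repo.insert!/1).
end

user = PlainFactory.build(:user, name: "Jane Doe")
adult = PlainFactory.build(:adult_user)
```

The trade-off is that helpers such as sequences, list builders and params maps have to be written by hand when they are needed.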
