Elixir and the Wall of Tests

Here's an experience I tend to have over and over:

  • Generate a new Elixir / Phoenix project
  • Use the super cool Phoenix generators to make a new model
  • Change the model substantially
  • Realize that the tests generated with the model are now super broken

When I get into this situation, running mix test produces a huge number of errors. For some reason, I always want to scroll to the top of the error list and work on the first one produced, which makes fixing all of the errors a painful process of save, clear terminal, run command, scroll up, repeat.

Run just one test 🧘

The first thing I do to get some presence of mind and move forward in this situation is stop running the tests. No, not stop testing entirely, though it certainly can be tempting to delete the file and forget about testing. Just tell the test runner to run a single test at a time.

In ExUnit, the testing framework that ships with Elixir (and that Phoenix apps use out of the box), this can be done with what it calls "tags".

What are tags?

Test tags are a very flexible feature that lets the programmer categorize tests throughout a project and include or exclude whole selections of tests based on their tags.

For example?

One example that comes to mind would be tagging "slow" tests and configuring them to run only before a merge into a repository's main branch, as in the sketch below.
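
Here's a rough sketch of how that could look (the :slow tag name is just an example, not anything built into ExUnit):

  # In the test file, mark an expensive test:
  @tag :slow
  test "rebuilds the entire search index" do
    ...
  end

  # In test/test_helper.exs, skip :slow tests by default:
  ExUnit.start(exclude: [:slow])

A pre-merge CI job can then pull those tests back in with mix test --include slow.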

Back to the topic

Okay, back on topic, here's how we use tags to run just one test at a time. First, find the first test in a broken suite and add this line above its declaration:

@tag :focus
test "does a thing when I do a thing", %{
...

Here I've declared that this test is tagged with the atom :focus. Now, I can run just this test by passing the --only flag to the test command:

mix test --only focus

Now, all of the other tests will be skipped, and only the tagged test will be run. When I've fixed that test, I'll move the tag to the next test in line. If I've fixed all the tests in a describe block and I want to verify that they're all working together, I can add a @describetag :focus line at the top of the describe block, and all of the tests inside of it will be run. Hurrah!
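
For example, it might look something like this (the describe block and test names here are made up):

  describe "create_user/1" do
    @describetag :focus

    test "creates a user with valid attributes" do
      ...
    end

    test "returns an error changeset with invalid attributes" do
      ...
    end
  end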

Keyboard fatigue 🥱

Another thing that gets tiresome when working through the wall of tests is command-tabbing to my terminal, up-arrowing, and enter-ing after every file change. mix_test_watch to the rescue! This package will re-run your tests for you after every file change, and it accepts all the same arguments as the normal test command!

Just add it to your mix.exs file:

      ...
      {:phoenix_ecto, "~> 4.1"},      
      # 👇 Add this line
      {:mix_test_watch, "~> 1.0", only: :dev, runtime: false},
      ...
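
Fetch the new dependency:

mix deps.get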

Then change test to test.watch in your CLI command:

mix test.watch --only focus

That's a wrap 🌯

If you're like me, these two steps will help you actually see what's going wrong with the broken tests, instead of just closing your terminal in fear of that massive wall of red text and browsing Twitter until you forget why you were procrastinating in the first place.

UUIDs for User IDs

Integers By Default 🔢

Phoenix's generators save a ton of time writing boilerplate code, and Pow, an Elixir package and Phoenix extension, offers a great way to get user authentication up and running in very little time. But by default, Phoenix's generators use auto-incrementing integers for user IDs.

What's Wrong with Integer IDs 🤔?

I've been bitten nastily by integer IDs multiple times in my career.

One time, a backup failure caused the counter on user IDs to be reset, and the account records were erased while the data still existed, unattached. When new accounts got created, those IDs got re-used, and existing data got attached to the new accounts. Super bad news. (This is also an argument for using a SQL store with strong relational guarantees, rather than the NoSQL store the company had adopted at the time.)

On another occasion, user IDs were being randomly generated, but they were still rather small integers, and the likelihood of a collision was high given the rate at which new accounts were being created. A simple oversight in the application code was performing an upsert on newly generated user accounts instead of an insert. The database would have detected a conflict and thrown an error in the case of an insert on a re-used user ID, but since the app was upserting, re-used user IDs just clobbered old accounts with new credentials. Yikes!

Neither of these problems was caused by integer IDs, but in both cases, using a longer, richer user ID would have massively reduced the likelihood of the bugs having negative outcomes.

Switching to UUIDs 👷🏿‍♀️

Let's say you've already gone through the process of generating a new Phoenix app, and you've already followed all the guides from Pow on getting set up, and then you realize that your user IDs are integers. Stop! Don't throw away all of your code and start over. I found myself in the same situation this morning, but thanks to Cam Stuart on GitHub, I got my IDs switched over in no time. Here are the steps.

Change the Migration

Open up priv/repo/migrations/<timestamp>_create_users.exs
Change

  def change do
    create table(:users) do
...

to

  def change do
    create table(:users, primary_key: false) do
...

The primary_key: false option tells the migration not to create the default auto-incrementing ID column. Then add a line to create your own ID column:

  def change do
    create table(:users, primary_key: false) do
      add :id, :uuid, primary_key: true
...
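
Put together, the whole migration might end up looking roughly like this (the email and password_hash columns are just examples; yours will depend on what Pow generated for you):

  defmodule YourApp.Repo.Migrations.CreateUsers do
    use Ecto.Migration

    def change do
      create table(:users, primary_key: false) do
        add :id, :uuid, primary_key: true
        add :email, :string, null: false
        add :password_hash, :string

        timestamps()
      end

      create unique_index(:users, [:email])
    end
  end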

Change the Model

Next, open up lib/boots/users/user.ex and add two module attributes above the schema declaration:

  @primary_key {:id, :binary_id, autogenerate: true} # Add this
  @foreign_key_type :binary_id # And add this
  schema "users" do
    ...

Those module attributes tell the schema that it should use UUIDs (represented here as :binary_id) for the ID column, and that it should auto-generate them. The @foreign_key_type attribute makes any belongs_to associations declared in this schema default to UUIDs as well; other schemas that reference the users table will need the same attribute (or a type: :binary_id option on belongs_to).
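
Put together, the top of the schema module might look something like this (the module name will match your app, and pow_user_fields/0 comes from Pow's schema macro):

  defmodule Boots.Users.User do
    use Ecto.Schema
    use Pow.Ecto.Schema

    @primary_key {:id, :binary_id, autogenerate: true}
    @foreign_key_type :binary_id
    schema "users" do
      pow_user_fields()

      timestamps()
    end
  end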

Update the Test

If you've followed the Pow guide on adding API tokens, you'll have a failing test now. Open up test/your_app_web/api_auth_plug_test.exs and change the lines that create a test user with an integer ID:

    user =
      Repo.insert!(%User{id: 1, email: "test@example.com"})

To use a UUID instead:

    user =
      Repo.insert!(%User{id: "e2c54c31-e574-4c9f-8672-dad23449d4cf", email: "test@example.com"})

Change your Generators

Now that you've fixed your users, let's make it so that any new models you generate will use UUIDs by default. Open up config/config.exs and add the following lines to the end of the file:

config :your_app, :generators,
  migration: true,
  binary_id: true,
  sample_binary_id: "11111111-1111-1111-1111-111111111111"

🚨 Don't forget to change :your_app to the actual name of your app!

This tells Phoenix that whenever it generates a new model, it should use binary IDs for it.
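
From now on, a freshly generated migration should come out looking something like this (the posts table and its columns are hypothetical):

  create table(:posts, primary_key: false) do
    add :id, :binary_id, primary_key: true
    add :title, :string
    add :user_id, references(:users, on_delete: :nothing, type: :binary_id)

    timestamps()
  end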

All Done!

Congratulations! Your app is now a little more robust 👏. It's worth noting that there has been some discussion about whether random UUIDs significantly hurt Postgres's performance by causing more random seeks. After doing some research, the evidence against UUIDs looks weak enough to me that I'd rather have the safety than worry about a possible performance loss. I'd love to hear from you if you have more information on this, though!

On Pagination and Sort Order

The Bug 🐞

We've been testing the next generation of our sync engine at Day One prior to public release, and we found a funny bug. We have an endpoint for fetching a big list of entries that have changed since the last fetch. This endpoint is also how a new phone with no content would pull down all the entries from a journal. But when we tried to sync all of our journal contents down to a new device, we were missing entries!

Enter Sherlock 🔎

We investigated the internal workings of the endpoint to see what was up. The missing entry was in the database, so that wasn't the issue. We tested the query that fetched the rows and the entry showed up, so that wasn't the issue either. Internally, we fetch groups of entries from the database in pages and write them out to the response, so we figured it must have been some flawed logic in the pagination. We logged the pagination pattern, but everything looked perfect.

Then we looked closer at the response returned by the server.

We were getting the right number of entries pulled down from that endpoint, but it turned out some of the entries were missing from the feed, and others were repeated!

This was a big indicator that we had a problem with sorting. We compared the entry that showed up twice with the entry that didn't show up at all. Sure enough, we found that they had the exact same "sync date", which is the column that our query sorted by.

    builder->orderBy([|"sync_date asc"|]);

The Fix 🛠

All we had to do was add a unique column to the sort clause:

    builder->orderBy([|"sync_date asc, id asc"|]);

And the problem was fixed.

But Why? 🤔

When paginating, the same query is executed over again each time a new page is fetched. Each time, a small window of the results is fetched by skipping over the number of items returned by previous pages. In order for this to work, however, the overall result set has to be the same, in the same order, for every page. If the result set changes as new pages are fetched, we run the risk of dropping or repeating items.

In our case, the sync date on an entry was pretty close to unique. It's got millisecond accuracy, and most of the time, two entries aren't synced in the exact same millisecond. But, as we now know from testing on real journal data, there are entries with identical sync dates. When this happens, the order in which those two entries are returned in the result set is unstable. Our query had an unstable sort.

So something about skipping through the items in our result set caused Postgres to reverse the order of those two rows every time. One of them was missed, the other repeated.

Once we added a unique column to the sort, it became a stable sort, which means Postgres always knows exactly how to order the results, no matter how much skipping or limiting we do. With that change in place, skipping past old rows no longer changed the order, and all of the entries were properly included in the result set.
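
The snippets above aren't Elixir, but to illustrate the idea in Ecto terms (purely a sketch, with made-up schema and variable names), the shape of the fixed query is roughly:

  import Ecto.Query

  # The unique id column breaks ties between rows that share a sync_date,
  # so every page of the pagination sees the same overall ordering.
  from(e in Entry,
    where: e.journal_id == ^journal_id,
    order_by: [asc: e.sync_date, asc: e.id],
    limit: ^page_size,
    offset: ^offset
  )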

Don’t Make Me Tap Twice

I've started rough work on a new app for digital "morning baskets". While I just used Expo when starting Storytime (another app I have in progress), I decided to give the famous Ignite boilerplate from Infinite Red a chance. In short, it's fantastic. In just a few days of side-work time, I've almost completed a fully functioning minimum viable product (MVP) of this new app.

But I quickly ran into that annoying situation where you're editing a text field, and you want to press the "submit" button, but you've gotta tap it twice so that the keyboard gets dismissed before you can progress.

Thanks to a detailed blog post, I found a quick solution. In a project generated by Ignite 6.x, we can open up app/components/screen/screen.tsx and find the place where ScrollView is used:

...
<ScrollView
    style={[preset.outer, backgroundStyle]}
    contentContainerStyle={[preset.inner, style]}
>
    {props.children}
</ScrollView>
...

All we have to do is add keyboardShouldPersistTaps="handled" to the props of ScrollView.

...
<ScrollView
    style={[preset.outer, backgroundStyle]}
    contentContainerStyle={[preset.inner, style]}
    keyboardShouldPersistTaps="handled" // <- Here!
>
    {props.children}
</ScrollView>
...

This instructs the ScrollView to dismiss the keyboard if it receives a tap gesture that none of its children have handled, but to leave the keyboard alone if one of the children in the view handles the event. In my case, I navigate right after the button press, and that navigation dismisses the keyboard automatically anyway. So it resolves the problem for me!