One thing that folks sometimes want or need to do is to be able to “soft delete” records in a database. This means that instead of actually removing the record from a database, you in some way keep it there but make it “invisible” to your application.
There are many different ways of doing this, but I’ve recently settled on the way I think I’m going to do it from now on in applications using Ecto 3 and Postgres. I’ve put together a little example application to show how it works, and that’s what I’m going to walk through today.
The basic plan
The simplest way of describing how this works is that on every table we add a
deleted_at column. If that column is null then the record has not been
deleted. If that column has a timestamp in it, then that record has been soft
deleted. We’re going to configure our application so that by default it only
“sees” records that have not been soft deleted, but if we want we can see all
records, including those that have been soft deleted. We can also “hard delete”
records if we want to, which will fully remove that record from the database.
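In other words, “not deleted” just means deleted_at is null. If you were filtering by hand (which, as we’ll see, this setup saves you from doing), the query would boil down to something like this hypothetical snippet - Repo here stands in for whatever your app’s repo module is:

```elixir
import Ecto.Query

# Hypothetical ad-hoc query, purely to illustrate the deleted_at contract.
# The rest of this post is about never having to write this filter by hand.
visible_user_ids =
  from(u in "users", where: is_nil(u.deleted_at), select: u.id)
  |> Repo.all()
```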
There are two different things called “schemas” that we’ll be talking about here, so before we move on let’s describe them and give them better names.
First off, Postgres has a concept of a schema, which is essentially a namespace for a group of relations. By default, all relations are created in the "public" schema, but you can create other schemas with the CREATE SCHEMA schema_name command. So, you can run CREATE SCHEMA filtered to have a new schema available where you can create relations. After you create those relations, you can query them like SELECT * from public.users or SELECT * from filtered.users.
From now on, when we’re talking about this schema, we’ll always call it a Postgres schema.
The second schema is the one that we use in Ecto. You’ve probably seen the code
use Ecto.Schema before - that’s what we’re talking about. A module that uses Ecto.Schema we’re going to call an Ecto schema.
Ok, so let’s look at some specifics!
Here are the tests
that show how this works. You’ll see at this level it looks just like any other
normal test - nothing strange going on here. We just have some well-named functions in a module called SoftDelete.Users that do just what they say they do - creating, retrieving, updating and deleting users. But there are some extra functions there that are specific to soft delete, like list_deleted_users.
Now, let’s look at that
Users module to see what’s going on there.
Most of these functions look just the same as any other normal application
without soft delete functionality, which is the goal here. The first thing you
might notice that’s different as we scan down that file is the implementation of delete_user. In that function, if the deletion is successful, instead of returning
the record that was deleted, we actually have to look up another record and
return that instead. A little deviation there, but not too bad.
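Based on that description, delete_user/1 ends up looking roughly like this - a sketch, not the example app’s exact code, with Repo standing in for the app’s repo module:

```elixir
# Sketch of delete_user/1 as described above. After the delete succeeds, the
# AFTER DELETE trigger (covered below) has already re-inserted the row with
# deleted_at set, so we fetch that re-inserted record from the "public"
# schema and return it instead of the struct Repo.delete gave us.
def delete_user(%User{} = user) do
  with {:ok, _deleted} <- Repo.delete(user) do
    {:ok, Repo.get!(User, user.id, prefix: "public")}
  end
end
```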
Then we get to those special functions that deal with the soft deleted users. In order to access our soft deleted users, we use the prefix option in Ecto.Query.from/2. This is where the real magic happens! This prefix corresponds to the Postgres schema our query should run against. So, when we look at the implementation for list_deleted_users, we see that we’re telling Ecto to return all users in the "public" Postgres schema whose deleted_at is not null.
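Concretely, that can look something like this sketch (the exact query in the example app may differ, and it assumes import Ecto.Query in the module):

```elixir
# Query public.users directly via the :prefix option, which points this query
# at the "public" Postgres schema instead of the schema's default filtered
# prefix, and keep only rows that have actually been soft deleted.
def list_deleted_users do
  Repo.all(
    from u in User,
      prefix: "public",
      where: not is_nil(u.deleted_at)
  )
end
```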
So if soft deleted users are in the "public" Postgres schema, where are the users that haven’t been soft deleted?
The Ecto schema
The Ecto schema definition for User looks just the same as in any other application. We’re defining the fields on our struct, and we have two changeset functions - nothing
interesting to see here.
However, instead of
use Ecto.Schema, we see
use SoftDelete.Schema, so let’s
check in there to see what’s going on.
And in that file we see the important annotation! We’re setting that Ecto schema’s
@schema_prefix to be
"filtered". This tells Ecto that, unless we tell it
otherwise, all operations for this Ecto schema are to be performed against the
filtered.users relation instead of
public.users, which is the default.
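The module itself can be tiny. Here’s a sketch of roughly what it might contain - the exact contents are an assumption on my part, but the @schema_prefix line is the part that matters:

```elixir
defmodule SoftDelete.Schema do
  @moduledoc "Drop-in replacement for `use Ecto.Schema` that defaults to the filtered views."

  defmacro __using__(_opts) do
    quote do
      use Ecto.Schema

      # The important line: unless a query says otherwise, read and write
      # this schema through the filtered.* relations rather than public.*
      @schema_prefix "filtered"
    end
  end
end
```

An Ecto schema then just swaps use Ecto.Schema for use SoftDelete.Schema and defines its fields (including deleted_at) as usual.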
So, now we know that in
filtered.users we have all of the users that haven’t
been soft deleted, but in
public.users we have all users, including those that
have been soft deleted. So, where does
filtered.users come from?
In the migration that creates our users table, we first see what looks like a totally normal
create table(:users) call. And
indeed it is a totally normal call! But, after that, we get something a
little special. We’re executing some custom SQL, so on the
up part of the
migration we’re calling
PERFORM prepare_table_for_soft_delete('users');, and
on the down part of the migration we’re calling
PERFORM reverse_table_soft_delete('users');. That must be doing something important, so
let’s see what’s going on there.
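Before we dig into those functions, here’s roughly how that users migration fits together - a sketch in which the module name, the field list, and the DO block wrapping of the PERFORM calls are all assumptions on my part:

```elixir
defmodule SoftDelete.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def up do
    create table(:users) do
      add :name, :string
      timestamps()
    end

    # PERFORM only works inside PL/pgSQL, so the call is wrapped in a DO block.
    execute "DO $$ BEGIN PERFORM prepare_table_for_soft_delete('users'); END $$"
  end

  def down do
    execute "DO $$ BEGIN PERFORM reverse_table_soft_delete('users'); END $$"
    drop table(:users)
  end
end
```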
The really important part
So the first migration in this app is where all the really important and interesting stuff happens.
So much SQL, though! Ughhh, how will I ever understand this?!
Let’s start from the beginning and take it step by step.
On line 9 we run CREATE SCHEMA filtered, like I mentioned above. This creates a Postgres schema named filtered for us. Easy enough, right?
The next chunk of SQL, on lines 11-21, does the most important thing in this app - actually soft deleting our records
instead of deleting them. Here we’re creating a Postgres trigger, which is just
a function that’s executed after a certain event happens. This particular
trigger says basically “After a record has been deleted, re-insert it back into
the table it came from and set its
deleted_at field to be the current
timestamp”. There are Reasons™ why that has to be an
AFTER DELETE trigger
instead of a
BEFORE DELETE trigger, but they’re not too important, really, so
I’ll spare you from having to read all of them.
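To make that concrete, here’s the kind of trigger function that chunk of the migration might define. The function name soft_delete_record and the exact SQL are assumptions on my part - the real migration may differ - but it shows the shape of the idea: after the delete, put the row back and stamp it.

```elixir
# In the up part of that first migration (sketch only; assumes an id primary key).
execute """
CREATE FUNCTION soft_delete_record() RETURNS trigger AS $$
BEGIN
  -- Put the deleted row back into the table it was just deleted from...
  EXECUTE format('INSERT INTO %I.%I SELECT ($1).*', TG_TABLE_SCHEMA, TG_TABLE_NAME)
    USING OLD;
  -- ...and mark it as soft deleted with the current timestamp.
  EXECUTE format('UPDATE %I.%I SET deleted_at = now() WHERE id = ($1).id',
                 TG_TABLE_SCHEMA, TG_TABLE_NAME)
    USING OLD;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
"""
```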
Then, the next chunk of SQL, on lines 23-34, gives us the prepare_table_for_soft_delete function that we called in the users migration. This function accepts a single argument, which is supposed to be the name of a table that you’ve created, and then does four things (sketched below):
1) Add a
deleted_at field to the given table
2) Create an index on that
deleted_at field, because it’s going to be used a
lot for queries
3) Set the trigger that we created on lines 11-21 to be active on that table
4) Create a Postgres view in the
filtered Postgres schema with the same name
as the given table, but that only returns records that haven’t been soft deleted.
Once that function has been called, a newly created table has everything it needs to be used in our application and have soft deleted records!
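Here’s a hedged sketch of what prepare_table_for_soft_delete could look like following those four steps - the index and trigger names and the exact SQL are my assumptions, not the app’s actual code:

```elixir
# Also in the up part of that first migration (sketch only).
execute """
CREATE FUNCTION prepare_table_for_soft_delete(table_name text) RETURNS void AS $$
BEGIN
  -- 1) Add the deleted_at column
  EXECUTE format('ALTER TABLE public.%I ADD COLUMN deleted_at timestamptz', table_name);

  -- 2) Index it, since nearly every query will filter on it
  EXECUTE format('CREATE INDEX %I ON public.%I (deleted_at)',
                 table_name || '_deleted_at_index', table_name);

  -- 3) Attach the soft delete trigger to the table
  EXECUTE format('CREATE TRIGGER %I AFTER DELETE ON public.%I
                  FOR EACH ROW EXECUTE PROCEDURE soft_delete_record()',
                 table_name || '_soft_delete', table_name);

  -- 4) Create the view in the filtered schema that hides soft deleted rows
  EXECUTE format('CREATE VIEW filtered.%I AS
                  SELECT * FROM public.%I WHERE deleted_at IS NULL',
                 table_name, table_name);
END;
$$ LANGUAGE plpgsql;
"""
```

Because that view selects straight from a single table, Postgres treats it as automatically updatable, which is why ordinary Ecto inserts, updates and deletes against filtered.users pass straight through to public.users.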
Finally, the last chunk of SQL gives us the reverse_table_soft_delete function that just undoes the four things listed above.
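A matching sketch of the reverse function, with the same caveats as above:

```elixir
execute """
CREATE FUNCTION reverse_table_soft_delete(table_name text) RETURNS void AS $$
BEGIN
  EXECUTE format('DROP VIEW filtered.%I', table_name);
  EXECUTE format('DROP TRIGGER %I ON public.%I', table_name || '_soft_delete', table_name);
  EXECUTE format('DROP INDEX %I', table_name || '_deleted_at_index');
  EXECUTE format('ALTER TABLE public.%I DROP COLUMN deleted_at', table_name);
END;
$$ LANGUAGE plpgsql;
"""
```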
By now we’ve covered just about everything in the app, but the one thing that I
didn’t show is how we actually handle hard deletes for when we really want a
record to be removed from the database for good. That’s defined in a new function that we added to the Repo.
The great thing about this method of handling soft delete is that it minimally affects how Ecto works, and you can see that by looking at that Repo. It’s totally normal with the exception of that one additional function, and that function itself is fairly normal as well.
What’s going on in hard_delete/1 is that we have an Ecto.Multi operation, and the first thing we do is disable the Postgres trigger that we set up to do soft deletes for us. Once that’s disabled, we actually delete the record as we
normally would, and once that’s done we re-enable the trigger again so we go
back to defaulting all delete operations to be soft deletes. And yes, this
disabling of the trigger is local to the transaction that the
Multi is run in,
so you don’t need to worry about messing with a form of global state and accidentally deleting records you meant to soft delete because of a race condition.
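Put together, hard_delete/1 probably looks something along these lines - a sketch based on the description above, where the trigger name, the way the table name is derived, and the exact SQL used to toggle the trigger are all assumptions on my part:

```elixir
# Inside SoftDelete.Repo (sketch only).
def hard_delete(struct) do
  table = struct.__struct__.__schema__(:source)
  trigger = "#{table}_soft_delete"  # assumed trigger naming convention

  Ecto.Multi.new()
  |> Ecto.Multi.run(:disable_trigger, fn repo, _changes ->
    repo.query("ALTER TABLE #{table} DISABLE TRIGGER #{trigger}")
  end)
  |> Ecto.Multi.delete(:hard_delete, struct)
  |> Ecto.Multi.run(:enable_trigger, fn repo, _changes ->
    repo.query("ALTER TABLE #{table} ENABLE TRIGGER #{trigger}")
  end)
  |> transaction()
end
```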
So, that’s it! Not too scary, right? It’s a great soft delete implementation that you barely see any evidence of in your actual application, works 100% with all normal Ecto behavior, and doesn’t require much in the way of hacky SQL to make work.
Yes, there are performance tradeoffs here, but that’s just the nature of the beast with soft deletes. If performance becomes an issue for you there are many ways you can refine this to make it faster, but this shouldn’t add more than a millisecond or two to most operations.