Safeguarding changes using the plan-execute pattern
At the company I’m currently working at, a.k.a. re:cap, we have some manual data imports that are business-critical, e.g., a customer-provided CSV file that we want to import into our database and that affects business operations. A simple pattern I came to love to make this process fast & safe is using the plan-execute pattern.
What is the plan-execute pattern?
The plan-execute pattern (or plan-apply pattern) is a way to preview changes before they are applied.
If you know Terraform, you already know the plan-execute pattern. Terraform is a tool that is used to manage infrastructure using a declarative programming language. Changing infrastructure is extremely delicate, remove one line of code, and the production server could be deleted, increasing the avg. heart rate of the team by 50 BPM. To safeguard our servers and heart rates, an infrastructure update is typically done with the following steps:
- The developer changes the infrastructure code as they desire, e.g., increasing the memory of a server from 8 to 16 GB.
- The developer executes the command "terraform plan". This will compare the wanted changes to what is currently active. The result is a plan on “how to move the current state to the desired state”.
- The developer verifies that the changes look sound.
- The developer executes "terraform apply" to actually execute the changes.
The key safeguard is that Terraform doesn’t directly execute the changes but first creates a plan to transition the current state to the desired one, allowing the user to review and approve the plan. This ensures that the developer can verify the changes are as intended, catching any undesired effects early.
Some more examples:
- Checkout pages use the plan-execute pattern - they show you what you’ll buy, the amount of money to be deducted, where to send it, etc. One can read through all this before hitting the "buy" (or rather "execute") button.
- A git diff is a plan-execute pattern - it shows you the changes that you are about to commit.
- A file deletion dialog is a plan-execute pattern - it often shows how many files will be deleted and asks you to confirm.
Plan-execute pattern for database changes
Not every update is as important as updating your infrastructure and needs an explicit plan & execute step. But the pattern to split important changes into a plan & execute step is useful not only in infrastructure management but in all cases where one can preview what the final changes will look like. Database updates fit this very well, and I will show you how this can look like.
An example: say you are working on the system for a supermarket chain and the prices for each product are shown on a tiny screen that is controlled by a centralized server, which gets the prices from a database. The prices are updated quite often, but sadly still in a manual way as the supermarkets’ fast paced environment never allowed them to address tech debt. To update the prices, someone has to upload a CSV file with two columns: product_id, price. Let’s model this with a plan-execute pattern to make sure that the price for frozen pizza is always correct. I’ll use Go in the following code examples.
First, we define our product data model: a simple ID & price combination. Before people scream that one should never use a float for a monetary value, we will use an int that represents "money" in cents.
type Product struct {
ID string
Price int
}
Then we define how our ProductPlan is supposed to look. Our update plan has:
- Added - products that are only in the new dataset and thus will be added.
- NoChanges - products that are both in the old and new datasets, but the price did not change.
- Updated - products that are both in the old and new datasets, but the price changed.
- Removed - products that are only in the old dataset and thus will be removed.
type ProductPair struct {
OldProduct Product
NewProduct Product
}
type ProductUpdatePlan struct {
Added []Product
NoChanges []Product
Updated []ProductPair
Deleted []Product
}
The CreatePlan function then can look as follows. The cool thing here is that it’s a pure function; it has no side effects and doesn’t even return an error. It’s very easy to test this.
func CreatePlan(
oldProducts []Product, newProducts []Product,
) ProductUpdatePlan {
plan := ProductUpdatePlan{}
oldProductMap := make(map[string]Product)
for _, product := range oldProducts {
oldProductMap[product.ID] = product
}
newProductMap := make(map[string]Product)
for _, product := range newProducts {
oldProduct, ok := oldProductMap[product.ID]
if !ok {
plan.Added = append(plan.Added, product)
} else if oldProduct.Price != product.Price {
plan.Updated = append(plan.Updated, ProductPair{
OldProduct: oldProduct,
NewProduct: product,
})
} else {
plan.NoChanges = append(plan.NoChanges, product)
}
newProductMap[product.ID] = product
}
for _, product := range oldProducts {
_, ok := newProductMap[product.ID]
if !ok {
plan.Deleted = append(plan.Deleted, product)
}
}
return plan
}
To actually execute these changes, we define an interface that will represent the database. We keep it simple and do not use any batch updates/inserts here.
type Repo interface {
InsertProduct(product Product) error
UpdateProduct(product Product) error
DeleteProduct(product Product) error
GetProducts() ([]Product, error)
Transaction(func(repo Repo) error) error
}
func (pup ProductUpdatePlan) Execute(repo Repo) error {
for _, product := range pup.Added {
err := repo.InsertProduct(product)
if err != nil {
return err
}
}
for _, productPair := range pup.Updated {
err := repo.UpdateProduct(productPair.NewProduct)
if err != nil {
return err
}
}
for _, product := range pup.Deleted {
err := repo.DeleteProduct(product)
if err != nil {
return err
}
}
return nil
}
The overall flow could look like this - this could be the function an endpoint would call; the preview flag controls if it’s just a preview or the actual import.
func UpdateProducts(
repo repo, csv string, preview bool
) (ProductUpdatePlan, error) {
// Implementing parseCSV is left as an exercise for the reader.
products, err := parseCSV(csv)
if err != nil {
return ProductUpdatePlan{}, err
}
var plan ProductUpdatePlan
err = repo.Transaction(func(repo Repo) error {
oldProducts, err := repo.GetProducts()
if err != nil {
return ProductUpdatePlan{}, err
}
plan = CreatePlan(oldProducts, products)
if preview {
nil
}
return plan.Execute(repo)
})
return plan, err
}
With this, we can offer a user experience that feels safe and predictable. The user can clearly see how their changes will interact with the system.
Interactive example
To have an excuse for building this blog with React.js, here is an interactive toy example of how this could look. You can change the provided CSV data, click "Preview changes" and see the update plan. You can then execute the update plan to the database by clicking "Execute changes". The products and CSV data are seeded with some sensible data. There is no error handling, you have to rely on the plan-execute pattern for the safeguarding ;).
State changes between preview & execute
A fundamental problem with the plan-execute pattern is that the state can change between preview and execute. This is especially true for systems that have a lot of concurrent users. E.g., if you preview a price change for a product, and in the meantime, someone else changes the price, the preview is not valid anymore. This is a problem that is hard to solve in a general way and is often solved by locking the resource, which can lead to other problems. But directly executing the changes without a preview is also not a good solution, as it can lead to unexpected results.
Most plan-execute patterns do not require that the state is exactly the same between preview and execute e.g. a file delete dialog will not stop deleting files because someone already deleted one of them. Terraform will not stop applying changes because someone else applied a change in the meantime (it does have its own locking mechanism though). It's more about giving the user a way to see what will happen and then execute that plan, making sure that the updates make sense, e.g. preventing the user from accidentally deleting their production database.
When should one use the plan-execute pattern?
This pattern does not make sense for most data updates; as this entails more work and complexity. It makes sense when:
- You’re updating something that is hard to grasp, but doing the update creates immediate negative effects.
- The updates matter, and it’s not trivial to revert them.
- It would help the user be more confident and be faster in checking that what they are about to do is correct, e.g., wouldn’t it be nice if email clients would tell you how many people you are about to send an email to if you are sending it to a group?
- Either locking the resource is ok, state changes between preview and execute are acceptable or it's ok to deny execution if the state changed between plan & execute (E.g. the delete file dialog would refuse to execute if one of the affected file got deleted in the meantime).
Snapshots & undo
The plan-execute pattern does not prevent a faulty update from being applied, so it is important as well that there is an undo option. Sometimes this is not completely possible - e.g., when Terraform deletes your production database, you can’t just "undo" that.But in many cases, it is possible. E.g. snapshot the database before the update and then revert to that snapshot if the update was faulty.
Some update plans can actually be used to implement this feature, e.g., git commits are exactly that. Git commits are update plans that you can freely undo and redo.This works only so long as all changes to the system are made through the update plans, so for most systems, this is not feasible.
Why is this not a database feature already?
Personally I'm using this feature mostly with database updates and would really like this to be a first class citizen from them. I asked /r/Database if anybody knows of some database features that could enable something like this. Here is the thread if you are interested, my conclusion:
- DoltDB basically supports this OOTB with its ability to create branches and native diff support. I find this project extremely intriguing personally, effortlessly creating snapshots is a game changer (I'm not affiliated with them).
- Traditional databases do not have good support for this use case, CDC was suggested multiple times, but that only helps you after the commit. One could do scd2 (append only, you can query any version of the data) or something similar (append only, but clean up afterwards). This moves the complexity into the data model, which I'm personally not a fan of.
For me the application logic approach I showed above is the best for me personally as it is agnostic to whatever technology you use to actually save the data to and only entails moving around application logic.