Criteria-Based Deletion

Phil Wilkins
Mar 9, 2022

Originally published at https://manningbooks.medium.com on March 9, 2022

Although batch operations provide the ability to delete several resources with a single API call, there’s an underlying requirement that we know in advance the unique identifiers of the resources we want to delete. In many scenarios, however, we aren’t interested in deleting a specific list of resources; we want to delete any resources that happen to match a specific set of criteria. This design pattern provides a mechanism by which we can safely and atomically remove all resources matching certain criteria rather than a list of identifiers.

Motivation

It’s not all that uncommon to want to operate on more than one resource at a time. More specifically, you certainly might want to “clear out” a set of specific resources. Compared to the other batch operations, deletion is by far the most straightforward, requiring no other information to perform an action: given an ID, remove the resource.

This is a wonderful piece of functionality, but it requires that we already know exactly which resources we want to delete. This means that if we don’t yet know this information, we first have to investigate to determine which should be deleted. For example, imagine we want to delete all ChatRoom resources that are flagged as "archived". To make this happen, we need to discover which resources have this particular setting, and then use those identifiers to remove the resources with the batch Delete method, as shown in Listing 1.

Listing 1. Criteria-based deletion using standard List and batch Delete methods.

function deleteArchivedChatRooms(): void {
  const archivedRooms = ListChatRooms({ filter: "archived: true" }); // #A
  BatchDelete({ // #B
    ids: archivedRooms.map((room) => room.id)
  });
}

#A First, we must find all resources which are “archived”.

#B Once we have the identifiers, we can delete all of them.

Unfortunately, there are several problems with this design. First, and most obviously, it requires at least two separate API calls. To make things worse, listing resources is unlikely to be a single request; it’s more likely to involve a long series of repeated requests to find all of the matching resources. Second, and most importantly, stitching these two methods together leads to a non-atomic result. By the time we collect all of the IDs of archived resources, it’s possible that some of them have been unarchived. This means that when we delete these resources, we might be deleting resources that aren’t archived anymore!
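To make the non-atomic behavior concrete, here is a minimal sketch using in-memory stand-ins. The names listArchivedRoomIds and batchDeleteRooms are hypothetical substitutes for the standard List and batch Delete methods, and the "other client" is simulated by a direct mutation between the two calls.

```typescript
// Hypothetical in-memory stand-ins for the standard List and batch Delete methods.
interface ChatRoom {
  id: string;
  archived: boolean;
}

let chatRooms: ChatRoom[] = [
  { id: "1", archived: true },
  { id: "2", archived: true },
  { id: "3", archived: false },
];

function listArchivedRoomIds(): string[] {
  return chatRooms.filter((room) => room.archived).map((room) => room.id);
}

function batchDeleteRooms(ids: string[]): void {
  chatRooms = chatRooms.filter((room) => ids.indexOf(room.id) === -1);
}

// Step 1: collect the identifiers of the rooms that are archived right now.
const idsToDelete = listArchivedRoomIds();

// Between the two calls, another client unarchives room "2".
chatRooms[1].archived = false;

// Step 2: the batch delete works from the stale list, so room "2" is removed
// even though it no longer matches the criteria.
batchDeleteRooms(idsToDelete);

const survivingIds = chatRooms.map((room) => room.id);
```

In this sketch only room "3" survives: room "2" was deleted based on what was true when the list was taken, not what was true when the delete ran.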

Because of these major issues, it’s important that we have an alternative which provides a single method allowing us to delete resources based on some set of criteria rather than exclusively based on a list of identifiers.

Overview

This pattern introduces the idea of a new custom method: Purge. The purpose of the Purge method is to accept a simple filter, execute it, and delete any results matching the filter criteria. In essence, it’s a combination of the standard List method with the batch Delete method, but rather than piping the output of one method into the input of another (as seen in Listing 1), we can use a single API call to accomplish our goal.

Although the method and its purpose are straightforward, we also have to consider the obvious concern: this method is dangerous. Users aren’t immune from mistakes, and we’re often worried about users deleting data that they later regret having deleted. In this case, rather than handing users a tool to delete a single resource (standard Delete) or even a tool to delete lots of resources (batch Delete), we’re now handing them the biggest tool of all. The Purge method allows users to delete resources without even being aware of the full extent of what they’re deleting. Under the right conditions (e.g., a filter that matches all resources), we’re able to wipe out all the data stored in the system entirely!

To avoid this potentially catastrophic result, we’ll provide two specific levers that users can rely on as guard rails. First, we’ll require that an explicit Boolean flag (force) be set on the request before deleting anything. Second, in the case that this flag isn’t set (and therefore the request won’t be executed), we’ll provide a way to preview what would have happened if the request had been executed. This preview includes both a count of the number of items that would be deleted (purgeCount) and a sample of the items that match the filter (purgeSample). In some ways, this is a bit like having a validateOnly field on the request, but with the opposite default: a request is always for validation only unless explicitly requested otherwise.

Implementation

To see how this method works, let’s start by looking at the typical flow of the Purge method. As shown in Figure 1, the process begins with a request providing the filter to be applied, but leaving the force flag set to false (or left unset). Because this is the equivalent of a validation request, no resources are deleted, and instead the purgeSample field is populated with a list of identifiers of resources which would have been deleted. Additionally, the purgeCount field provides the number of resources that match the filter and therefore would be deleted by the request.

With this new information from the validation request, we can double check that the resources returned in the purgeSample field are indeed ones that match the intent expressed in the filter. If we decide to follow through with deleting all the resources included, we can send the same request again, this time setting the force flag to true. In the response, only the purgeCount field should be populated with the number of resources deleted by executing the request.

Figure 1. Interaction pattern for the Purge method.
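The two-step flow can be sketched against an in-memory store. Everything here is an assumption made for illustration: the InMemoryMessageStore class, the matchesFilter helper (which understands only the literal filter "archived: true"), and the sample cap of 100 are not part of the pattern's definition.

```typescript
interface Message {
  id: string;
  archived: boolean;
}

interface PurgeRequest {
  filter: string;
  force?: boolean;
}

interface PurgeResponse {
  purgeCount: number;
  purgeSample: string[];
}

// Hypothetical filter evaluation: only the literal "archived: true" is understood.
function matchesFilter(message: Message, filter: string): boolean {
  if (filter === "archived: true") return message.archived;
  throw new Error("unsupported filter: " + filter);
}

class InMemoryMessageStore {
  constructor(private messages: Message[]) {}

  purge(req: PurgeRequest): PurgeResponse {
    const matching = this.messages.filter((m) => matchesFilter(m, req.filter));
    if (!req.force) {
      // Validation only: report what would be deleted, but delete nothing.
      return {
        purgeCount: matching.length,
        purgeSample: matching.slice(0, 100).map((m) => m.id),
      };
    }
    // Live request: delete the matches and report only the count.
    this.messages = this.messages.filter((m) => !matchesFilter(m, req.filter));
    return { purgeCount: matching.length, purgeSample: [] };
  }

  size(): number {
    return this.messages.length;
  }
}

const store = new InMemoryMessageStore([
  { id: "m1", archived: true },
  { id: "m2", archived: false },
  { id: "m3", archived: true },
]);

const preview = store.purge({ filter: "archived: true" });
const sizeAfterPreview = store.size();
const executed = store.purge({ filter: "archived: true", force: true });
const sizeAfterPurge = store.size();
```

The first call leaves all three messages in place and returns a sample of the two matches; only the second call, with force set to true, removes them.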

Now that we have an idea of the overall flow, let’s get into the tricky details of each of these fields and explore how they work together, starting with the filter field.

Filtering results

As you might expect, the filter field should work exactly as it does on the standard List method. The whole point of the Purge method is that it provides almost identical functionality to the standard List method combined with the batch Delete method. This means that a filter specified and executed by the Purge method should behave identically to that same filter having been provided to a standard List method.

One unusual, and scary, consequence of this is that because an empty or unset filter on the standard List method returns all resources hosted by the API, the same behavior is expected of the Purge method. This puts us in a tricky situation: if a user forgets to specify a filter (or has a coding error that results in the filter being set to undefined or an empty string), the method matches all existing resources and therefore, if forced, deletes all of them (an example of this type of behavior is shown in Listing 2). This is certainly dangerous, and it is why the other failsafes are built into this design.

Listing 2. Minor typos resulting in disastrous consequences.

function deleteMatchingMessages(filter: string): number { // #A
  const result = PurgeMessages({
    parent: "chatRooms/1",
    filter: fliter, // #B
    force: true,
  });
  return result.purgeCount;
}

#A This method accepts a filter string and deletes all Message resources matching the filter.

#B Unfortunately, a typo here (fliter instead of filter) results in the filter being undefined, and therefore always matches all resources.

Although it might be safer to prevent requests like these, the unfortunate reality is that users have a need to perform this type of action. Further, consistency with the standard List method is critical; otherwise users may start to think that filters work differently for different methods. As a result, we can’t reject requests with a missing filter, for example. Scenarios like these are the primary drivers behind the idea that, by default, the Purge method acts as though we’re asking for a “preview” only. In the next section, we’ll explore this in more detail.
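One way to keep the two methods from drifting apart is to route both through a single matching function. The sketch below assumes hypothetical helpers (matches, listMessageIds, previewPurgeCount) and a toy filter grammar that understands only "archived: true"; the point is simply that an unset or empty filter matches everything in both places.

```typescript
interface Msg {
  id: string;
  archived: boolean;
}

const allMessages: Msg[] = [
  { id: "m1", archived: true },
  { id: "m2", archived: false },
  { id: "m3", archived: true },
];

// Hypothetical shared matcher, used by both List and Purge so they never disagree.
function matches(message: Msg, filter: string | undefined): boolean {
  // An unset or empty filter matches every resource, exactly as it does for List.
  if (filter === undefined || filter.trim() === "") return true;
  if (filter === "archived: true") return message.archived;
  throw new Error("unsupported filter: " + filter);
}

function listMessageIds(filter?: string): string[] {
  return allMessages.filter((m) => matches(m, filter)).map((m) => m.id);
}

function previewPurgeCount(filter?: string): number {
  return allMessages.filter((m) => matches(m, filter)).length;
}

const listedWithNoFilter = listMessageIds(undefined);
const purgeCountWithNoFilter = previewPurgeCount(undefined);
const purgeCountArchived = previewPurgeCount("archived: true");
```

With no filter, both the listing and the purge preview cover all three messages, which is why the preview-by-default behavior matters so much.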

Validation only by default

We can rely on a special validateOnly flag to make an API method validate the incoming request only and not execute the request. Purposefully, we chose the name of this field to push for a default value such that the method behaves normally unless explicitly asked to do otherwise (see Section 5.2 for more discussion on this topic).

Although this default is fine for those cases, as we learned earlier, this default is exceptionally dangerous for the Purge method as it allows a tiny mistake to result in deleting a potentially large amount of data from the API. As we can see in Listing 3, if we relied exclusively on request validation, forgetting to specify that the request was for validation only results in deleting data rather than providing some sort of preview as we might hope.

Listing 3. An omission with the wrong default leading to disastrous results.

#A If we rely on a validateOnly flag, omitting it entirely can result in accidentally deleting lots of data!
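As a rough illustration of the omission that Listing 3 describes, the following sketch assumes a hypothetical purgeWithValidateOnly function that executes unless validateOnly is explicitly true. The identifier-prefix filter is a simplification invented for this example.

```typescript
interface ValidateOnlyPurgeRequest {
  filter: string; // hypothetical: treated as an identifier prefix
  validateOnly?: boolean;
}

let storedIds: string[] = ["m1", "m2", "m3"];

// Hypothetical purge that deletes unless validateOnly is explicitly set to true.
function purgeWithValidateOnly(req: ValidateOnlyPurgeRequest): number {
  const matching = storedIds.filter((id) => id.indexOf(req.filter) === 0);
  if (req.validateOnly === true) {
    return matching.length; // preview only
  }
  storedIds = storedIds.filter((id) => id.indexOf(req.filter) !== 0);
  return matching.length;
}

// The caller meant to preview, but forgot to add `validateOnly: true`.
const reportedCount = purgeWithValidateOnly({ filter: "m" });
const remainingAfterOmission = storedIds.length;
```

Here the forgotten field silently turns an intended preview into a live deletion of all three records.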

As a result, this happens to be one of the few scenarios where we want to have a method crippled by default rather than the other way around. If a field is forgotten (perhaps a user didn’t properly read the documentation), the default behavior should be safe for the user and not lead to catastrophic consequences.

To make this happen, we rely on a field called force, which serves the same purpose as the validateOnly field but is named so that it leads to the opposite default behavior. Thanks to this difference, forgetting to set this field (or leaving it set to false) leads to a completely safe result: no data is deleted. In addition to no data being deleted, we get a useful preview of the results had the request been executed. This preview is made up of two key pieces of information: a count of the number of matching resources as well as a sample set of those matching resources. In the next section, we’ll start by looking at how this count of results works.
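The effect of the safer default can be sketched as follows. The purgeWithForce function and its identifier-prefix filter are hypothetical simplifications, not part of the pattern itself.

```typescript
interface ForcePurgeRequest {
  filter: string; // hypothetical: treated as an identifier prefix
  force?: boolean;
}

let recordIds: string[] = ["m1", "m2", "m3"];

// Hypothetical purge that deletes only when force is explicitly set to true.
function purgeWithForce(req: ForcePurgeRequest): number {
  const matching = recordIds.filter((id) => id.indexOf(req.filter) === 0);
  if (req.force === true) {
    recordIds = recordIds.filter((id) => id.indexOf(req.filter) !== 0);
  }
  return matching.length;
}

// Omitting the flag is harmless: the call only reports what would be deleted.
const previewCount = purgeWithForce({ filter: "m" });
const remainingAfterPreview = recordIds.length;

// Deleting requires an explicit opt-in.
const deletedCount = purgeWithForce({ filter: "m", force: true });
const remainingAfterForce = recordIds.length;
```

The forgetful caller now gets a count back and loses nothing; data is removed only by the second, deliberate call.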

Result count

Regardless of whether a Purge request is to be executed or for validation only, one quite useful piece of information to have in mind is a count of how many resources happen to match the provided filter. To do this, Purge responses should include a purgeCount field that provides this information.

One catch exists though: although the value should be an exact count of items deleted in a live request (force: true), when the request is for validation only, this value may be a reasonable estimate rather than an exact count. The reason for this is that in some cases it might be computationally intensive to find and count all the possible matches. Because we’re not going through the process of deleting them all and would like to avoid wasting computing power, it’s not a big deal to rely on an estimate over a perfectly accurate count. That said, the goal is to be as realistic as possible, and it’s important that the estimate be at least somewhat reflective of reality.

One thing to consider when relying on an estimated value for this field is that underestimates can be devilishly misleading and should be avoided as much as possible. To see why, consider the scenario where a response indicates that an estimated one hundred resources match a given filter. This might give a user a false sense of confidence that the Purge method won’t remove many resources. If, in truth, the number of matching resources is closer to one thousand, the user will first be shocked when the true results come in (showing one thousand resources having been deleted compared to the estimate of one hundred), and then exceptionally frustrated when they realize that they would have revised their filter expression had the estimate been more reflective of reality (say, 750 matching resources).
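One possible way to produce such an estimate, offered purely as an assumption for illustration, is to count exactly when the data set is small and otherwise extrapolate from a bounded scan, rounding up so that rounding never pushes the estimate downward. The estimatePurgeCount function and its scanBudget parameter are hypothetical.

```typescript
// Hypothetical estimator: exact for small data sets, extrapolated for large ones.
function estimatePurgeCount<T>(
  items: T[],
  isMatch: (item: T) => boolean,
  scanBudget: number
): number {
  if (items.length <= scanBudget) {
    // Cheap enough to count exactly.
    return items.filter(isMatch).length;
  }
  // Scan only the first `scanBudget` items and scale the match rate up.
  let seen = 0;
  for (let i = 0; i < scanBudget; i++) {
    if (isMatch(items[i])) seen++;
  }
  // Round up: rounding down would only add to the risk of underestimating.
  return Math.ceil((seen * items.length) / scanBudget);
}

const sampleItems: number[] = [];
for (let n = 0; n < 1000; n++) sampleItems.push(n);

const isMultipleOfFour = (n: number): boolean => n % 4 === 0;

const largeEstimate = estimatePurgeCount(sampleItems, isMultipleOfFour, 100);
const smallExact = estimatePurgeCount(sampleItems.slice(0, 10), isMultipleOfFour, 100);
```

A prefix scan like this can still underestimate badly when matches cluster late in the data, so a real implementation would need a sampling strategy suited to its storage system.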

For this and a variety of other reasons, seeing the number of matching resources for a given filter is certainly useful information (both for live requests and validation requests), but there’s an even more useful bit of data for validation-only requests that we could provide as well: a sample set of the resources that match and therefore will be deleted. In the next section, we explore how best to do this.

Result sample set

As we’ve seen, a preview of the number of resources matching a provided filter is useful but imperfect. In fact, even when the count is an exact number rather than an estimate, we need to accept that this is a single metric that counts all matching resources, and in many cases this type of aggregate can be misleading. For example, consider if fifty of one hundred items match a filter. How do we know we’re about to delete the right fifty items? Perhaps we meant to delete all fifty archived resources and are instead about to delete all fifty unarchived resources! Counts of matching items are often helpful for noticing glaring problems, but fail when the issues are more subtle.

To address this problem, in addition to the count of matching resources, we can rely on a validation response providing a sample subset of items that would be deleted in a field called purgeSample. This field should contain a list of identifiers of matching resources which can then be spot-checked for accuracy. For example, we could check a few of the resources returned and verify that they are marked as archived and not the other way around.

Despite the fact that this requires some extra work, it’s useful. For example, in a user interface we might retrieve a few of these items using the batch Get method and display them to a user for verification to be sure that the resources listed in the preview look like ones they intend to delete. If the user decides that none of these resources look out of place, they can proceed with executing the request by resending it with force set to true.

This leads to an obvious question: how many items should be in this “preview sample”? In general, because the goal is to help catch any mistakes in the filter expression, it’s important that the sample size be large enough for a user to notice if something looks out of place (e.g., if a resource that shouldn’t match happens to appear in the sample set). As a result, a good guideline is to provide at least one hundred items for larger data sets and exact matches for smaller data sets which are relatively inexpensive to query.
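This guideline can be sketched as a small helper. The buildPurgeSample name and the choice to take the first matches in order (rather than, say, a random selection) are assumptions made for this example.

```typescript
// Guideline from the text: at least one hundred items for larger data sets.
const SAMPLE_LIMIT = 100;

// Hypothetical helper: returns every match for small result sets and the
// first `limit` matches for larger ones.
function buildPurgeSample(matchingIds: string[], limit: number = SAMPLE_LIMIT): string[] {
  return matchingIds.slice(0, limit);
}

const manyIds: string[] = [];
for (let i = 0; i < 250; i++) manyIds.push("messages/" + i);

const largeSample = buildPurgeSample(manyIds);
const smallSample = buildPurgeSample(["messages/a", "messages/b", "messages/c"]);
```

For the 250 matches the sample is capped at 100 identifiers, while the three-item result set is returned in full.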

Consistency

The final issue to worry about is quite a bit trickier and one that deserves at least a mention here: consistency. What happens if we send a validation-only Purge request which indicates that only a few resources would be deleted, but by the time we send the request for execution the data has changed such that many more resources match the filter? Is there a way to have any guarantee that the data returned during validation matches the data which is deleted during execution? Unfortunately, the short answer is, “no.”

Even if we have the ability to perform queries over snapshots of data at a specific point in time, executing the Purge request over data as it appeared in the past is unlikely to lead to the desired result. As a matter of fact, if this behavior is the intent, then it’s already supported by the combination of a standard List request and a batch Delete request.

Additionally, although it’s technically possible that we could require the Purge method to fail in the case where data has changed between the time of a validation request and a live request, the method becomes practically useless on any sufficiently large, concurrent, volatile data set. In the world of APIs, this isn’t an uncommon occurrence.

Final API definition

Now that we have a full grasp of how the Purge method works, let’s look at a short example. Listing 4 shows a method to remove all Message resources matching a set of criteria specified in a filter string.

Listing 4. Final API definition

abstract class ChatRoomApi {
  @post("/{parent=chatRooms/*}/messages:purge")
  PurgeMessages(req: PurgeMessagesRequest): PurgeMessagesResponse;
}

interface PurgeMessagesRequest {
  parent: string;
  filter: string;
  force?: boolean;
}

interface PurgeMessagesResponse {
  purgeCount: number;
  purgeSample: string[];
}
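From the caller's side, the request and response shapes above suggest a preview-then-confirm helper along these lines. The purgeAfterReview function, the approve callback, and the fakePurgeMessages backend are all hypothetical; a real client would call the API over the network.

```typescript
interface PurgeMessagesRequest {
  parent: string;
  filter: string;
  force?: boolean;
}

interface PurgeMessagesResponse {
  purgeCount: number;
  purgeSample: string[];
}

type PurgeMessagesFn = (req: PurgeMessagesRequest) => PurgeMessagesResponse;

// Hypothetical helper: always previews first, and executes only if `approve`
// accepts the preview. Returns the number of resources deleted.
function purgeAfterReview(
  purgeMessages: PurgeMessagesFn,
  parent: string,
  filter: string,
  approve: (preview: PurgeMessagesResponse) => boolean
): number {
  const preview = purgeMessages({ parent, filter });
  if (!approve(preview)) return 0;
  return purgeMessages({ parent, filter, force: true }).purgeCount;
}

// Fake backend that records whether each call was forced.
const forcedCalls: boolean[] = [];
const fakePurgeMessages: PurgeMessagesFn = (req) => {
  forcedCalls.push(req.force === true);
  return req.force
    ? { purgeCount: 2, purgeSample: [] }
    : {
        purgeCount: 2,
        purgeSample: ["chatRooms/1/messages/1", "chatRooms/1/messages/2"],
      };
};

// The first reviewer rejects anything touching two or more messages.
const declinedCount = purgeAfterReview(
  fakePurgeMessages, "chatRooms/1", "archived: true", (p) => p.purgeCount < 2
);
// The second reviewer accepts the preview, so the forced request is sent.
const acceptedCount = purgeAfterReview(
  fakePurgeMessages, "chatRooms/1", "archived: true", (p) => p.purgeCount <= 2
);
```

Keeping the forced request behind an explicit approval step mirrors the flow in Figure 1 and keeps the spot-check of purgeSample in the loop.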

Trade-offs

In case you haven’t noticed, this is a dangerous method. It’s a bit like handing a bazooka to users of an API, giving them the ability to destroy a lot of data easily and quickly. Although the method as designed attempts to provide as many safety checks as possible, it still opens the door to the possibility that users mistakenly destroy a large amount, if not all, of their data. As a result, unless this is an absolute necessity, it’s generally a good idea to avoid supporting this functionality in an API.

Exercises

  1. Why should the custom Purge method be limited to only those cases where it’s absolutely necessary?
  2. Why does the Purge method default to executing validation only?
  3. What happens if the filter criteria is left empty?
  4. If a resource supports soft-deletion, what’s the expected behavior of the purge method on that resource? Should it soft-delete the resources? Or expunge them?
  5. What’s the purpose of returning the count of affected resources? What about the sample set of matching resources?

Summary

  • The custom Purge method should be used to delete multiple resources matching a specific set of filter criteria, but should only be supported if absolutely necessary.
  • By default, Purge requests should be exclusively for validation rather than deleting resources.
  • All Purge responses should include a count of the number of resources affected (and a sample set of matching results), though this may be estimated for a validation request.
  • The Purge method should adhere to the same consistency guidelines as the standard List method.

That’s all for now. If you want to learn more about the book, check it out on Manning’s liveBook platform here.
