The Data61 Cloud Undo Tool lets you roll back inadvertent changes you made to resources on Amazon Web Services (AWS), such as EC2 and Auto Scaling. It comes as an extension to the AWS command line tools and can be downloaded for free!
Simply type “aws-checkpoint” before making changes, do what you need to do, and then decide: if the state of your cloud resources is as you wanted and everything works fine, type “aws-commit”; if not, type “aws-rollback” to go back to the checkpointed state.
Based on fundamental research, we analyzed what can be undone and devised a formal proof to that end. Using these research results, we implemented the Undoability Checker, a tool that can automatically analyze which aspects of your cloud resources can be undone and which cannot.
This tool allows you to “undo” changes you made to certain resources on Amazon Web Services (AWS) Elastic Compute Cloud (EC2). For this purpose, you can record a “checkpoint,” usually a state in which everything is working fine. Then you make the changes you want to make (the “do” part). If the results are what you wanted, you can “commit” your changes; if anything went wrong or the results aren’t what you hoped for, you can “roll back” (or “undo”) the changes, which brings your system back to the checkpoint.
The tool is for AWS EC2 users who control their EC2 resources primarily through the command line tools offered by Amazon.
Once you have installed the Data61 Cloud Undo Tool, all EC2 commands should behave as usual (provided you use only the command line tools). In addition, the undo tool gives you four new commands: aws-checkpoint, aws-commit, aws-rollback, and aws-undelete.
One important thing to note: if the Data61 Cloud Undo Tool is installed and a checkpoint has been set, “delete” commands are turned into “pseudo-deletes”. That is, the undo tool deletes resources only logically when you issue the delete command; otherwise, rollback wouldn’t be possible. Of course, this means you will continue to be billed for these resources until you issue an “aws-commit” command, at which point the resources are actually deleted. You should therefore consider how much time you take before issuing a commit or rollback command, since the meter is ticking, so to speak.
Pseudo-delete also means that a logically deleted resource may no longer be visible from the command line but will still be visible in the EC2 web console. After all, the undo tool can only affect things under its control, i.e., commands on the command line.
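The pseudo-delete behaviour described above can be illustrated with a small in-memory model. This is only a sketch of the semantics, not the tool's actual implementation: the class name PseudoDeleteLedger, its methods, and the resource IDs are all hypothetical, and no AWS calls are made.

```python
from dataclasses import dataclass, field


@dataclass
class PseudoDeleteLedger:
    """Toy model of logical deletion (hypothetical; not the tool's code)."""

    live: set = field(default_factory=set)    # resources that really exist (and are billed)
    hidden: set = field(default_factory=set)  # resources that are only logically deleted
    checkpoint_active: bool = False

    def checkpoint(self):
        self.checkpoint_active = True

    def delete(self, resource_id):
        if resource_id not in self.live:
            raise KeyError(resource_id)
        if self.checkpoint_active:
            # Pseudo-delete: hide the resource but keep it alive for rollback.
            self.hidden.add(resource_id)
        else:
            # No checkpoint set: the delete takes effect immediately.
            self.live.discard(resource_id)

    def undelete(self, resource_id):
        # Only possible while the resource is merely hidden.
        self.hidden.discard(resource_id)

    def visible(self):
        """What the command line would show: live minus logically deleted."""
        return self.live - self.hidden

    def commit(self):
        # Commit turns every logical delete into a real one.
        self.live -= self.hidden
        self.hidden.clear()
        self.checkpoint_active = False


ledger = PseudoDeleteLedger(live={"i-1", "vol-1"})
ledger.checkpoint()
ledger.delete("vol-1")
print(sorted(ledger.visible()))  # ['i-1']
print(sorted(ledger.live))       # ['i-1', 'vol-1'] (still billed until commit)
ledger.commit()
print(sorted(ledger.live))       # ['i-1']
```

In this model, the gap between visible() and live is exactly the set of resources you keep paying for until you commit.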
The implementation covers the following resource types:
The tool may still contain bugs, so be careful when handling important resources.
The undo tool is the result of research efforts by Data61. Several research publications describe the workings of the tool in detail, and are available from the downloads page.
In short, rollback in the tool works as follows. The tool captures the state of your cloud resources whenever it is given a checkpoint or a rollback command, and records these states locally. The checkpointed state and the current state are translated into a formal language from AI Planning, the Planning Domain Definition Language (PDDL). These state representations are handed to an AI Planner (namely the FastForward planner by Joerg Hoffmann), together with a PDDL model of the commands available on EC2. The planner then generates an undo plan, i.e., a sequence of actions which, when executed, brings the system back to the checkpoint.
The undo tool also records which resources are logically deleted. When tasked to “commit”, the tool deletes these resources on EC2. Until you commit, the aws-undelete command can therefore still restore them.
Part of our research examined the undoability of all commands in the formal model, i.e., the degree to which commands may have side effects that cannot be undone by any other command. This is described in the respective publications available from the downloads page. If you are interested in the formal guarantees (and their limitations), please refer to these publications.
In essence, most things can be undone, provided one abstracts from things like public DNS names, internal IP addresses, and startup timestamps. These attributes invariably change when instances (virtual machines) on EC2 are stopped. This brief summary cannot, however, do justice to the details of our analysis; if the details are important to you, please have a look at the publications.
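To make the translation step more concrete, here is a minimal sketch of how two recorded states might be turned into a PDDL planning problem, with the current state as the initial state and the checkpoint as the goal. It is not taken from the tool: the function names, the domain name ec2, the predicate state, and the resource ID are assumptions for illustration, and attribute values are assumed to be constants declared in the domain model. The planner itself is not invoked.

```python
def to_facts(state):
    """Flatten {resource_id: {attribute: value}} into sorted PDDL-style facts."""
    facts = []
    for resource_id, attrs in state.items():
        for attr, value in attrs.items():
            facts.append(f"({attr} {resource_id} {value})")
    return sorted(facts)


def undo_problem(checkpoint, current, name="undo"):
    """Build a PDDL problem: start from `current`, aim for `checkpoint`."""
    objects = sorted(set(checkpoint) | set(current))
    lines = [
        f"(define (problem {name})",
        "  (:domain ec2)",  # hypothetical name for the PDDL model of EC2 commands
        "  (:objects " + " ".join(objects) + ")",
        "  (:init " + " ".join(to_facts(current)) + ")",
        "  (:goal (and " + " ".join(to_facts(checkpoint)) + ")))",
    ]
    return "\n".join(lines)


# Hypothetical example: an instance was running at the checkpoint and is now stopped.
checkpoint_state = {"i-1": {"state": "running"}}
current_state = {"i-1": {"state": "stopped"}}
print(undo_problem(checkpoint_state, current_state))
```

A planner given this problem and a suitable domain model would search for a command sequence (here, presumably a single start action) that turns the initial facts into the goal facts.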
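The idea of abstracting from volatile details can be sketched as a state comparison that ignores the attributes named above. This is an illustration only, not the formal model from the publications; the attribute keys, function names, and sample values are assumptions.

```python
# Attributes that change when an instance is stopped (key names are hypothetical).
VOLATILE = frozenset({"public_dns_name", "private_ip_address", "launch_time"})


def abstract(state, volatile=VOLATILE):
    """Drop volatile attributes from {resource_id: {attribute: value}}."""
    return {
        resource_id: {k: v for k, v in attrs.items() if k not in volatile}
        for resource_id, attrs in state.items()
    }


def equivalent(a, b, volatile=VOLATILE):
    """True if two states match once volatile attributes are ignored."""
    return abstract(a, volatile) == abstract(b, volatile)


before = {"i-1": {"state": "running", "type": "t2.micro",
                  "private_ip_address": "10.0.0.5"}}
after = {"i-1": {"state": "running", "type": "t2.micro",
                 "private_ip_address": "10.0.0.9"}}
print(equivalent(before, after))  # True
```

Under this kind of comparison, a rollback counts as successful even though the restored instance has a different internal IP address.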