Workspace 6.21.5
Understanding how a Workflow Executes

Understanding Operations and Connections explained the basic concepts behind Workspace, namely operations and connections, and gave a simplified view of how a workflow executes connected operations. Workspace is, however, much more powerful than that, and it is important to understand this area in more detail in order to get the expected behaviour.

The Up To Date Concept

Understanding Operations and Connections presented workspace execution as stepping through a sequence of operations in some predictable order. The reality is actually a little different. Rather than looking at a set of operations and connections and deciding in what order to execute things, Workspace uses the concept of whether or not something is up to date. Every operation, connection, input and output has an "up to date" state associated with it. A workflow does nothing until it is asked to bring one of these things up to date. When this happens, it works out what it depends on and brings those things up to date first. Then and only then will that particular thing be brought up to date by whatever means is appropriate. A simple example will illustrate this.

Workflow - Not up-to-date

Let us assume something asked the workflow to bring an output of operation B up to date (we'll discuss later how this can happen). The steps the workflow would go through would be something like this:

  • The output requires operation B to be brought up to date.
  • Operation B requires its inputs to be brought up to date.
  • One of the inputs has a connection to operation A, so that connection has to be brought up to date.
  • The connection depends on an output of operation A, so that output has to be brought up to date.
  • That output requires operation A to be up to date.
  • Operation A requires its inputs to be brought up to date.

At this point, there are no further dependencies, since nothing is connected to any of the inputs to operation A. From here, the process essentially unwinds and performs execution as it does so:

  • Execute operation A. This also makes the outputs of A up to date.
  • Update the connection between A and B. This also makes the connected input of B up to date.
  • Execute operation B. This also makes the outputs of B up to date.

Workflow - Up to date
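
The walk down the dependency chain and the unwinding that follows can be sketched in a few lines of Python. This is only an illustrative model, not the Workspace API: the `Operation` class, its `upstream` list and the `bring_up_to_date` function are hypothetical names, and connections, inputs and outputs are collapsed into a single "depends on" link for brevity.

```python
# Hypothetical model, not the Workspace API: each operation simply lists
# the operations that its inputs are connected to.
class Operation:
    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)  # operations feeding this one's inputs
        self.up_to_date = False


def bring_up_to_date(op, log):
    """Bring everything op depends on up to date first, then execute op."""
    if op.up_to_date:
        return
    log.append(f"request {op.name}")
    for dep in op.upstream:
        bring_up_to_date(dep, log)
    # Executing the operation also makes its outputs up to date.
    log.append(f"execute {op.name}")
    op.up_to_date = True


a = Operation("A")
b = Operation("B", upstream=[a])

log = []
bring_up_to_date(b, log)
print(log)  # ['request B', 'request A', 'execute A', 'execute B']
```

The requests travel from B back to A, and the executions happen in the opposite order as the recursion unwinds.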

At this point, everything is up to date and Workspace passes control back to whatever asked for the original output. This may seem like a lot of work, but consider what happens if we change the value of another one of the inputs of B. This input may be a "hard-coded" value, such as the text for a label on a graph. The workflow will look something like this:

Workflow - Partly up to date

When we ask for the workflow to be executed again, the process is much quicker:

  • Operation B needs to be brought up to date.
  • Operation B requires only those inputs that are not up to date to be brought up to date.

The input from operation A to operation B, however, is already up to date from the previous run. Workspace therefore does not have to re-execute A; it only has to re-execute B, because that was the only part of the workflow affected when we changed the value of one of its other inputs. If executing A is a time-consuming activity, this has just saved us a potentially substantial amount of time. Now consider a much less trivial workflow with possibly hundreds of operations and connections, where the user is frequently changing the value of a few inputs to see the effect of each change. Workspace remains efficient because it executes only the things that need updating after each change. Understanding this is critical to using Workspace.
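
The saving from re-executing only B can be illustrated with a small sketch. It assumes a simple "dirty flag" scheme in which changing an input marks the operation and everything downstream of it as not up to date. The `Operation` class, `set_input`, `invalidate` and `update` are hypothetical names chosen for the illustration and are not taken from Workspace itself.

```python
# Hypothetical model, not the Workspace API.
class Operation:
    def __init__(self, name, func, **inputs):
        self.name = name
        self.func = func
        self.inputs = dict(inputs)  # plain value, or an Operation (a connection)
        self.dependents = []        # operations connected downstream of this one
        self.up_to_date = False
        self.output = None
        self.run_count = 0
        for source in self.inputs.values():
            if isinstance(source, Operation):
                source.dependents.append(self)

    def set_input(self, key, value):
        """Change a hard-coded input value (plain values only in this sketch)."""
        self.inputs[key] = value
        self.invalidate()

    def invalidate(self):
        """Mark this operation and everything downstream as not up to date."""
        self.up_to_date = False
        for dependent in self.dependents:
            dependent.invalidate()

    def update(self):
        """Bring this operation up to date, executing only what is needed."""
        if self.up_to_date:
            return self.output
        args = {
            key: source.update() if isinstance(source, Operation) else source
            for key, source in self.inputs.items()
        }
        self.output = self.func(**args)
        self.run_count += 1
        self.up_to_date = True
        return self.output


a = Operation("A", lambda n: sum(range(n)), n=1000)
b = Operation("B", lambda data, label: f"{label}: {data}", data=a, label="Total")

first = b.update()          # executes A, then B
b.set_input("label", "Sum") # only B becomes not up to date
second = b.update()         # re-executes B only

print(first, "|", second)        # Total: 499500 | Sum: 499500
print(a.run_count, b.run_count)  # 1 2
```

Operation A runs once across both requests, while B runs twice, which mirrors the behaviour described above.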

Workspace only executes those things that are needed and that are not already up to date.

Workspace Inputs, Outputs and Nesting

Summarising the above, Workspace can be thought of as being demand driven rather than following sequential execution. The natural question that follows is how to ask Workspace to bring something up to date. After all, without that, Workspace won't execute anything. The most straightforward way to do this is to add a special operation called a WorkspaceOutput to the workflow and connect other operations to it. It obeys the same rules as other operations as far as connections and dependencies go, but it has special meaning to the workflow containing it.

Workspace Dependency output

Here, the WorkspaceOutput operation can be clearly identified to the right of operation B. A WorkspaceOutput operation has a different icon and background colour to other operations, and in this case, we've given it a label that clearly identifies it to the user.

All the preceding discussion has talked about asking Workspace to bring something up to date. However, Workspace itself can be put into an execution mode, and in this state it takes control of what needs to be updated. Specifically, Workspace keeps track of all the WorkspaceOutput operations it contains. When Workspace is in execution mode, it brings all of its WorkspaceOutput operations up to date and then goes dormant, consuming no CPU while in this dormant state. Whenever any part of the workflow is changed by some outside action, it wakes up and brings all the WorkspaceOutput operations up to date again, executing only those parts of the workflow that are required and that are not already up to date. Again, when all of its WorkspaceOutput operations are up to date, Workspace goes dormant. This process continues until Workspace is taken out of execution mode. The graphical Workspace editor provides a menu entry and a toolbar button for turning execution on and off. The editor also automatically turns off execution if an error is encountered.

When a workflow contains no WorkspaceOutput operation, putting it into execution mode may appear to have no effect. This is because Workspace hasn't been given anything to bring up to date. As a general guide, a workflow should have at least one WorkspaceOutput operation.
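
The role of WorkspaceOutput operations in execution mode can be sketched as follows. The model is hypothetical and much simpler than the real thing: `Workflow.run_pass` stands in for a single "wake up, update every WorkspaceOutput, go dormant" cycle, and none of the class or method names are taken from the Workspace API.

```python
# Hypothetical model, not the Workspace API.
class Operation:
    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)  # operations feeding this one's inputs
        self.up_to_date = False


class WorkspaceOutput(Operation):
    """Stand-in for the WorkspaceOutput operation described above."""


class Workflow:
    def __init__(self, operations):
        self.operations = list(operations)

    def run_pass(self):
        """One wake-up in execution mode: update every WorkspaceOutput."""
        executed = []
        for op in self.operations:
            if isinstance(op, WorkspaceOutput):
                self._update(op, executed)
        return executed  # after this the workflow would go dormant

    def _update(self, op, executed):
        if op.up_to_date:
            return
        for dep in op.upstream:
            self._update(dep, executed)
        executed.append(op.name)
        op.up_to_date = True


# A workflow with no WorkspaceOutput: nothing is asked for, so nothing runs.
no_outputs = Workflow([Operation("A"), Operation("B")])
print(no_outputs.run_pass())  # []

# A workflow where B feeds a WorkspaceOutput and C is not connected to one.
a = Operation("A")
b = Operation("B", [a])
c = Operation("C")
result = WorkspaceOutput("Result", [b])
with_output = Workflow([a, b, c, result])

first_pass = with_output.run_pass()
second_pass = with_output.run_pass()
print(first_pass)   # ['A', 'B', 'Result']
print(second_pass)  # []
```

In this sketch, operation C never executes because no WorkspaceOutput depends on it, and the second pass does nothing because everything requested is already up to date.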

As can probably be guessed by the name, the WorkspaceOutput operation can do a bit more than just tell Workspace what it needs to bring up to date. It is possible to create nested workflows, allowing you to collect together a set of operations and connections and to treat them as a simple operation in a parent workflow. When doing this, you need a way to pass data into and out of the nested workflow. The WorkspaceOutput operation is the way to pass data back out to the parent workflow. As you might expect, there is also a WorkspaceInput operation which allows you to receive data from the parent workflow. An example child workflow might look something like this:

Inputs and outputs for a workflow

In the parent workflow, the above would then be just like any other operation, only with a slightly different appearance to identify it as a nested workflow:

A nested workflow in the parent workflow

Nesting makes working with complex workflow arrangements significantly easier.

Nested workflows can be set to one of two modes: atomic and non-atomic.

  • Atomic nested workflows mimic the behaviour of other typical operations, whereby any change to the nested workflow's inputs will set all operations of the nested workflow to not up-to-date. When that nested workflow is executed again, every connected operation in that nested workflow will be brought up-to-date.
  • A non-atomic nested workflow's up-to-date behaviour is no different to that of the equivalent un-nested workflow. This type of nested workflow is used only to simplify complex workflows into smaller, more manageable components.

Atomic workflows have a different operation icon
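
The difference between the two modes can be made concrete with a small sketch. It assumes a nested workflow with two inputs, `x` and `y`, and two inner operations, `P` (which depends only on `x`) and `Q` (which depends only on `y`). All names here are hypothetical and the model ignores connections between the inner operations; it is meant only to show which operations are re-executed after one input changes.

```python
# Hypothetical model, not the Workspace API.
class NestedWorkflow:
    """Stand-in for a nested workflow with inputs x and y."""

    def __init__(self, atomic):
        self.atomic = atomic
        # Inner operation name -> nested-workflow inputs it depends on.
        self.depends_on = {"P": {"x"}, "Q": {"y"}}
        self.up_to_date = {name: False for name in self.depends_on}

    def change_input(self, input_name):
        """Mark inner operations as not up to date after an input changes."""
        for op, inputs in self.depends_on.items():
            # Atomic: every inner operation is invalidated.
            # Non-atomic: only operations that depend on the changed input.
            if self.atomic or input_name in inputs:
                self.up_to_date[op] = False

    def execute(self):
        """Execute whatever is not up to date and report what ran."""
        ran = sorted(op for op, ok in self.up_to_date.items() if not ok)
        for op in ran:
            self.up_to_date[op] = True
        return ran


results = {}
for atomic in (True, False):
    nested = NestedWorkflow(atomic)
    nested.execute()          # first run executes everything
    nested.change_input("y")  # then one input changes
    results[atomic] = nested.execute()

print(results)  # {True: ['P', 'Q'], False: ['Q']}
```

Under these assumptions, the atomic nested workflow re-executes both inner operations, while the non-atomic one re-executes only the operation affected by the changed input.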

There are two main ways to create a nested workflow. The first is to drag and drop a Workspace from the operation catalogue onto the canvas. You can then double-click it and it will open in a new tab. The second is to take some of the contents of an existing workflow and convert them into a nested Workspace. Simply select the operations you want to put into a nested Workspace, right-click on an empty part of the canvas and, in the context menu that pops up, select "Nested workspace from selection". This second approach is particularly useful because it preserves all connections between the selected and non-selected operations by adding inputs and outputs to the nested workflow as necessary. You can also take a nested workflow and pull its contents up into the parent workflow by right-clicking on it and selecting "Explode contents" from the context menu.

For more details on things like setting the names and data types of the inputs and outputs of a nested workspace, see Understanding Inputs and Outputs.