Loops and result gathering
Promises allows us to parallelize step execution using simple loops. Consider the following example, which computes the sum of squares for a given list of integers:
from freyja import Step, Automation, Input, Output
class Square(Step):
num = Input(int)
squared = Output(int)
def execute(self):
self.squared = self.num * self.num
class SumOfSquares(Step):
def execute(self):
squares = []
for i in range(1, 5):
square = Square(f"Square-{i}", num=i)
squares.append(square.squared)
print(f"Result is {sum(squares)}")
if __name__ == "__main__":
Automation(SumOfSquares).run()
This example iterates over numbers one to five, computes the square for each, and stores all squares in a list. The final result is the sum of this list.
The use of promises (step output 'squared' in this example) is not limited to establish dependencies between steps via input/output assignments. Because promises are regular Python objects, they can be assigned to variables, stored in lists or dictionaries, passed to functions, used inside regular Python classes, or used indirectly as inputs to some other steps.
In this example, we stash output promises in a list first, which we then pass into Python's sum()
function to compute the total.
When executed, you should see something like this:
$ python sum_of_squares.py run
2019-03-19 15:14:38,792 INFO [ freyja.log: 68]: (MainThread ) Logging configured
2019-03-19 15:14:38,792 INFO [freyja.graph: 362]: (MainThread ) Process: 5959
2019-03-19 15:14:38,792 INFO [freyja.graph: 489]: (MainThread ) Instantiating Step <SumOfSquares "main">
2019-03-19 15:14:38,793 INFO [freyja.graph: 217]: (MainThread ) Step <SumOfSquares ("main")> queued for execution
2019-03-19 15:14:38,793 INFO [freyja.graph: 648]: (main ) Initiating execution for for: Step <SumOfSquares ("main")>
2019-03-19 15:14:38,793 INFO [freyja.graph: 658]: (main ) Execution started for: Step <SumOfSquares ("main")>
2019-03-19 15:14:38,793 INFO [freyja.graph: 664]: (main ) RUNNING: main
2019-03-19 15:14:38,794 INFO [freyja.graph: 489]: (main ) Instantiating Step <Square "Square-1">
2019-03-19 15:14:38,794 INFO [freyja.graph: 217]: (main ) Step <Square ("main.Square-1")> queued for execution
2019-03-19 15:14:38,794 INFO [freyja.graph: 489]: (main ) Instantiating Step <Square "Square-2">
2019-03-19 15:14:38,794 INFO [freyja.graph: 217]: (main ) Step <Square ("main.Square-2")> queued for execution
2019-03-19 15:14:38,794 INFO [freyja.graph: 489]: (main ) Instantiating Step <Square "Square-3">
2019-03-19 15:14:38,795 INFO [freyja.graph: 648]: (main.Square-2) Initiating execution for for: Step <Square ("main.Square-2")>
2019-03-19 15:14:38,795 INFO [freyja.graph: 658]: (main.Square-2) Execution started for: Step <Square ("main.Square-2")>
2019-03-19 15:14:38,795 INFO [freyja.graph: 648]: (main.Square-1) Initiating execution for for: Step <Square ("main.Square-1")>
2019-03-19 15:14:38,795 INFO [freyja.graph: 217]: (main ) Step <Square ("main.Square-3")> queued for execution
2019-03-19 15:14:38,796 INFO [freyja.graph: 658]: (main.Square-1) Execution started for: Step <Square ("main.Square-1")>
2019-03-19 15:14:38,796 INFO [freyja.graph: 489]: (main ) Instantiating Step <Square "Square-4">
2019-03-19 15:14:38,796 INFO [freyja.graph: 664]: (main.Square-2) RUNNING: Square-2
2019-03-19 15:14:38,796 INFO [freyja.graph: 648]: (main.Square-3) Initiating execution for for: Step <Square ("main.Square-3")>
2019-03-19 15:14:38,797 INFO [freyja.graph: 690]: (main.Square-2) Execution finished for: Step <Square ("main.Square-2")>
2019-03-19 15:14:38,797 INFO [freyja.graph: 658]: (main.Square-3) Execution started for: Step <Square ("main.Square-3")>
2019-03-19 15:14:38,798 INFO [freyja.graph: 664]: (main.Square-1) RUNNING: Square-1
2019-03-19 15:14:38,798 INFO [freyja.graph: 217]: (main ) Step <Square ("main.Square-4")> queued for execution
2019-03-19 15:14:38,798 INFO [freyja.graph: 690]: (main.Square-1) Execution finished for: Step <Square ("main.Square-1")>
2019-03-19 15:14:38,799 INFO [freyja.graph: 664]: (main.Square-3) RUNNING: Square-3
2019-03-19 15:14:38,799 INFO [freyja.graph: 648]: (main.Square-4) Initiating execution for for: Step <Square ("main.Square-4")>
2019-03-19 15:14:38,800 INFO [freyja.graph: 690]: (main.Square-3) Execution finished for: Step <Square ("main.Square-3")>
2019-03-19 15:14:38,800 INFO [freyja.graph: 658]: (main.Square-4) Execution started for: Step <Square ("main.Square-4")>
2019-03-19 15:14:38,801 INFO [freyja.graph: 664]: (main.Square-4) RUNNING: Square-4
Result is 30
2019-03-19 15:14:38,801 INFO [freyja.graph: 690]: (main.Square-4) Execution finished for: Step <Square ("main.Square-4")>
2019-03-19 15:14:38,801 INFO [freyja.graph: 690]: (main ) Execution finished for: Step <SumOfSquares ("main")>
2019-03-19 15:14:38,802 INFO [freyja.graph: 114]: (Executor-main) Executor done
2019-03-19 15:14:38,802 INFO [freyja.graph: 406]: (MainThread )
-----------------------------------------------------------------------
Execution summary:
Steps instantiated: 5
Steps incomplete: 0
Steps executed: 5
Steps failed: 0
-----------------------------------------------------------------------
Notice that all five steps execute in parallel, although they were instantiated inside a regular loop, one after the other.
By the way, in Python a more elegant shorthand syntax for the loop above uses list comprehension, which achieves the same result:
squares = [
Square(f"Square-{i}", num=i).squares
for i in range(1, 5)
]
print(f"Result is {sum(squares)}")
Scatter-gather
The above example demonstrates how you would implement a typical scatter-gather mechanism with the ADK: you instantiate a step inside a loop and collect results in a list (scatter). After the loop, you pass this list to some aggregate function or to another step as input to compute final results (gather).
Before you parallelize the execution of apps on the Seven Bridges Platform inside a loop, make sure that you have read and understood Automation loops vs. CWL scatter part of this tutorial.
Next: Conditionals
Updated almost 4 years ago