Type checking
Inputs and outputs of the automation steps have defined types. Let's have a look at what the ADK does with this information.
Consider the following example:
from freyja import Step, Automation, Input, Optional


class A(Step):
    optional_str = Input(Optional[str])

    def execute(self):
        B("B", mandatory_str=self.optional_str)


class B(Step):
    mandatory_str = Input(str)

    def execute(self):
        print(self.mandatory_str)


if __name__ == "__main__":
    Automation(A).run()
Step A defines an optional string input (i.e. None values are allowed) while step B's string input is mandatory (i.e. None is not allowed).
Inside step A we call step B and simply pass on the string input.
When this script is executed with a value for optional_str, the automation executes just fine:
$ python types.py run --optional_str foo
2019-03-28 20:49:09,214 INFO [ freyja.log: 68]: (MainThread ) Logging configured
2019-03-28 20:49:09,215 INFO [freyja.graph: 362]: (MainThread ) Process: 23410
2019-03-28 20:49:09,215 INFO [freyja.graph: 489]: (MainThread ) Instantiating Step <A "main">
2019-03-28 20:49:09,215 INFO [freyja.graph: 217]: (MainThread ) Step <A ("main")> queued for execution
2019-03-28 20:49:09,216 INFO [freyja.graph: 648]: (main ) Initiating execution for for: Step <A ("main")>
2019-03-28 20:49:09,216 INFO [freyja.graph: 658]: (main ) Execution started for: Step <A ("main")>
2019-03-28 20:49:09,216 INFO [freyja.graph: 664]: (main ) RUNNING: main
2019-03-28 20:49:09,216 INFO [freyja.graph: 489]: (main ) Instantiating Step <B "B">
2019-03-28 20:49:09,217 INFO [freyja.graph: 217]: (main ) Step <B ("main.B")> queued for execution
2019-03-28 20:49:09,217 INFO [freyja.graph: 690]: (main ) Execution finished for: Step <A ("main")>
2019-03-28 20:49:09,217 INFO [freyja.graph: 648]: (main.B ) Initiating execution for for: Step <B ("main.B")>
2019-03-28 20:49:09,217 INFO [freyja.graph: 658]: (main.B ) Execution started for: Step <B ("main.B")>
2019-03-28 20:49:09,218 INFO [freyja.graph: 664]: (main.B ) RUNNING: B
foo
2019-03-28 20:49:09,218 INFO [freyja.graph: 690]: (main.B ) Execution finished for: Step <B ("main.B")>
2019-03-28 20:49:09,218 INFO [freyja.graph: 114]: (Executor-main) Executor done
2019-03-28 20:49:09,218 INFO [freyja.graph: 406]: (MainThread )
-----------------------------------------------------------------------
Execution summary:
Steps instantiated: 2
Steps incomplete: 0
Steps executed: 2
Steps failed: 0
-----------------------------------------------------------------------
However, when we execute the same script without providing the optional_str argument, we see the following error:
$ python types.py run
[...]
freyja.error.TypeValidationError: Expected type <class 'str'> got <class 'NoneType'>.
freyja.error.PortValidationError: Validation for port B.mandatory_str failed.
[...]
This time the ADK complains (rightfully) that for input B.mandatory_str, the provided type NoneType is incompatible with the expected type str.
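To make the failing check concrete, here is a minimal sketch of a value-level type check in plain Python. It is not the ADK's implementation: the helper names accepts and validate are hypothetical, it raises the built-in TypeError rather than the ADK's own error classes, and it only handles plain classes and Optional/Union declarations.

```python
from typing import Any, Optional, Union, get_args, get_origin


def accepts(expected: Any, value: Any) -> bool:
    """Return True if `value` is allowed for the declared type `expected`.

    Hypothetical helper: supports plain classes and Optional/Union only.
    """
    if get_origin(expected) is Union:
        # Optional[str] is Union[str, None]; any member may match.
        return any(accepts(member, value) for member in get_args(expected))
    if expected is type(None):
        return value is None
    return isinstance(value, expected)


def validate(expected: Any, value: Any) -> None:
    """Raise TypeError when `value` does not fit `expected`."""
    if not accepts(expected, value):
        raise TypeError(f"Expected type {expected} got {type(value)}.")
```

With this sketch, validate(Optional[str], None) passes, while validate(str, None) raises, which mirrors the difference between the inputs of step A and step B.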
The ADK performs late type checking for promises
The above examples demonstrate how type checking works with the ADK:
- When you assign the output of an upstream step to an input of a downstream step, the ADK checks for type compatibility when the downstream step is executed (late) and not when it is initialized (early).
- When you assign a real value (i.e. not a promise) to a step input, then type checking is performed immediately at step initialization (early) because the assigned value is known.
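The two cases above can be sketched with a toy model in plain Python. Promise and ToyInput are hypothetical names introduced for illustration only; they are not part of the ADK, and the real implementation may differ considerably.

```python
class Promise:
    """Placeholder for a value that an upstream step has not produced yet."""

    def __init__(self):
        self._resolved = False
        self._value = None

    def resolve(self, value):
        self._value = value
        self._resolved = True

    def get(self):
        if not self._resolved:
            raise RuntimeError("promise not resolved yet")
        return self._value


class ToyInput:
    """Hypothetical input port that accepts a real value or a Promise."""

    def __init__(self, expected_type, assigned):
        self.expected_type = expected_type
        self.assigned = assigned
        if not isinstance(assigned, Promise):
            # Real value: it is known now, so check at initialization (early).
            self._check(assigned)

    def _check(self, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"Expected type {self.expected_type} got {type(value)}."
            )

    def value_at_execution(self):
        """Called when the owning step runs; promises are checked here (late)."""
        if isinstance(self.assigned, Promise):
            value = self.assigned.get()
        else:
            value = self.assigned
        self._check(value)
        return value
```

In this model, ToyInput(str, None) fails immediately, whereas ToyInput(str, Promise()) is accepted at initialization and only fails in value_at_execution() if the promise was resolved to None.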
This behavior allows us to connect inputs and outputs whose defined types are potentially incompatible (as in this example, where an Optional[str] is passed to a str input).
Instead of rejecting such an assignment up front, the ADK proceeds with execution until it knows for sure that the values actually assigned are incompatible with the defined types (in this example, a None value is provided for an input that does not accept None).
This increases flexibility for automation developers, who often know that types will match at runtime even though on paper they do not.
While this is the behavior we would expect from a dynamically typed language like Python, it carries the risk that type mismatches go undetected until late into the automation run, although they could have been spotted immediately during development based on type definitions.
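For comparison, a check based purely on type definitions could look roughly like the following sketch. The functions members and is_assignable are hypothetical and not part of the ADK; the comparison is deliberately simplified and ignores subclass relations and generic containers.

```python
from typing import Any, Optional, Union, get_args, get_origin


def members(declared: Any) -> set:
    """Expand Optional[...]/Union[...] into the set of its member types."""
    if get_origin(declared) is Union:
        return set(get_args(declared))
    return {declared}


def is_assignable(source: Any, target: Any) -> bool:
    """True if every value allowed by `source` is also allowed by `target`.

    Simplification: member types are compared for equality only.
    """
    return members(source) <= members(target)
```

Under these assumptions, is_assignable(Optional[str], str) is False because NoneType is not covered by str, so the mismatch in the example would be flagged before any step runs, while is_assignable(str, Optional[str]) is True.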
This is, of course, the old debate over whether dynamic or static type checking is better. There is no right or wrong answer: both dynamic and static type checking have advantages and disadvantages.
For future releases we will consider offering a choice, i.e. whether the ADK should perform early or late type checking when starting an automation.