Type checking

Inputs and outputs of the automation steps have defined types. Let's have a look at what the ADK does with this information.

Consider the following example:

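from freyja import Step, Automation, Input, Optional

class A(Step):
    # Optional input: None is an allowed value.
    optional_str = Input(Optional[str])
    def execute(self):
        # Instantiate step B and pass the optional input straight through.
        B("B", mandatory_str=self.optional_str)

class B(Step):
    # Mandatory input: None is not an allowed value.
    mandatory_str = Input(str)
    def execute(self):
        print(self.mandatory_str)

if __name__ == "__main__":
    # Run the automation with step A as the entry point.
    Automation(A).run()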

Step A defines an optional string input (i.e. None values are allowed) while step B's string input is mandatory (i.e. None is not allowed).

Inside step A we call step B and simply pass on the string input.

When this script is executed with a value for optional_str, the automation executes just fine:

$ python types.py run --optional_str foo
2019-03-28 20:49:09,214    INFO [  freyja.log:  68]: (MainThread  ) Logging configured
2019-03-28 20:49:09,215    INFO [freyja.graph: 362]: (MainThread  ) Process: 23410
2019-03-28 20:49:09,215    INFO [freyja.graph: 489]: (MainThread  ) Instantiating Step <A "main">
2019-03-28 20:49:09,215    INFO [freyja.graph: 217]: (MainThread  ) Step <A ("main")> queued for execution
2019-03-28 20:49:09,216    INFO [freyja.graph: 648]: (main        ) Initiating execution for for: Step <A ("main")>
2019-03-28 20:49:09,216    INFO [freyja.graph: 658]: (main        ) Execution started for: Step <A ("main")>
2019-03-28 20:49:09,216    INFO [freyja.graph: 664]: (main        ) RUNNING: main
2019-03-28 20:49:09,216    INFO [freyja.graph: 489]: (main        ) Instantiating Step <B "B">
2019-03-28 20:49:09,217    INFO [freyja.graph: 217]: (main        ) Step <B ("main.B")> queued for execution
2019-03-28 20:49:09,217    INFO [freyja.graph: 690]: (main        ) Execution finished for: Step <A ("main")>
2019-03-28 20:49:09,217    INFO [freyja.graph: 648]: (main.B      ) Initiating execution for for: Step <B ("main.B")>
2019-03-28 20:49:09,217    INFO [freyja.graph: 658]: (main.B      ) Execution started for: Step <B ("main.B")>
2019-03-28 20:49:09,218    INFO [freyja.graph: 664]: (main.B      ) RUNNING: B
foo
2019-03-28 20:49:09,218    INFO [freyja.graph: 690]: (main.B      ) Execution finished for: Step <B ("main.B")>
2019-03-28 20:49:09,218    INFO [freyja.graph: 114]: (Executor-main) Executor done
2019-03-28 20:49:09,218    INFO [freyja.graph: 406]: (MainThread  )
-----------------------------------------------------------------------
Execution summary:
    Steps instantiated: 2
    Steps incomplete:   0
    Steps executed:     2
    Steps failed:       0
-----------------------------------------------------------------------

However, when we execute the same script without providing the optional_str argument, we see the following error:

$ python types.py run
[...]
freyja.error.TypeValidationError: Expected type <class 'str'> got <class 'NoneType'>.
freyja.error.PortValidationError: Validation for port B.mandatory_str failed.
[...]

This time the ADK complains (rightfully) that for input B.mandatory_str, the provided type NoneType is incompatible with the expected type str.

The ADK performs late type checking for promises

The above examples demonstrate how type checking works with the ADK:

  • When you assign the output of an upstream step to an input of a downstream step, the ADK checks for type compatibility when the downstream step is executed (late) and not when it is initialized (early).
  • When you assign a real value (i.e. not a promise) to a step input, then type checking is performed immediately at step initialization (early) because the assigned value is known.
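
The two cases above can be illustrated with a small standalone sketch. It does not use the ADK and does not show how the ADK is implemented: Promise, InputPort, validate and the TypeValidationError class defined here are hypothetical stand-ins written only for this illustration, and the check handles plain classes and Optional[...] only.

```python
from typing import Any, Optional, Union, get_args, get_origin


class TypeValidationError(TypeError):
    """Hypothetical stand-in for a validation error (not the ADK's class)."""


class Promise:
    """Hypothetical placeholder for a value that is only known at run time."""

    def __init__(self, resolver):
        self._resolver = resolver

    def resolve(self) -> Any:
        return self._resolver()


def validate(value: Any, expected: Any) -> Any:
    """Return value if it matches the expected type, otherwise raise."""
    if get_origin(expected) is Union:  # Optional[X] is Union[X, None]
        allowed = get_args(expected)
    else:
        allowed = (expected,)
    if not isinstance(value, allowed):
        raise TypeValidationError(f"Expected type {expected} got {type(value)}.")
    return value


class InputPort:
    """Hypothetical input: early check for real values, late check for promises."""

    def __init__(self, expected: Any, assigned: Any):
        self.expected = expected
        self.assigned = assigned
        if not isinstance(assigned, Promise):
            # Early: the value is already known at initialization.
            validate(assigned, expected)

    def value(self) -> Any:
        if isinstance(self.assigned, Promise):
            # Late: the promise is resolved and checked only when it is needed.
            return validate(self.assigned.resolve(), self.expected)
        return self.assigned


if __name__ == "__main__":
    late = InputPort(str, Promise(lambda: None))  # accepted at initialization
    try:
        late.value()  # fails only now, when the value is requested
    except TypeValidationError as exc:
        print("late check failed:", exc)

    try:
        InputPort(str, None)  # real value: fails immediately
    except TypeValidationError as exc:
        print("early check failed:", exc)
```

In this sketch, assigning a promise never fails at initialization, whatever it later resolves to. Only a real value can be rejected up front.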

This behavior allows us to connect inputs and outputs whose defined types are only potentially compatible (as in this example, where an Optional[str] is assigned to a str input).

The ADK proceeds with execution until it knows for certain that an actually assigned value is incompatible with the defined type (as in this example, where a None value is provided for an input that does not accept None).

This increases flexibility for automation developers, who often know that types will match at runtime even though they do not match on paper.

While this is the behavior we would expect from a dynamically typed language like Python, it carries the risk that type mismatches go undetected until late into the automation run, although they could have been spotted immediately during development based on type definitions.
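
To make concrete what "spotted immediately based on type definitions" could mean, here is a minimal sketch of a check that compares declared types only, without looking at any values. It is not part of the ADK; members and is_statically_compatible are hypothetical helpers, and the sketch deliberately ignores subclassing and generic containers.

```python
from typing import Any, Optional, Union, get_args, get_origin


def members(tp: Any) -> frozenset:
    """Return the set of plain classes that a declared type admits."""
    if get_origin(tp) is Union:  # Optional[X] is Union[X, None]
        return frozenset(get_args(tp))
    return frozenset({type(None) if tp is None else tp})


def is_statically_compatible(source: Any, target: Any) -> bool:
    """True if every class allowed by source is also allowed by target."""
    return members(source) <= members(target)


if __name__ == "__main__":
    # Optional[str] may carry None, which a plain str input does not accept.
    print(is_statically_compatible(Optional[str], str))
    # The reverse direction is always safe.
    print(is_statically_compatible(str, Optional[str]))
```

A check like this would reject the example at the top of this page before anything runs, even for invocations where optional_str is provided and the automation would have succeeded. That is the price of early checking.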

This is, of course, the old debate over whether dynamic or static type checking is better. Neither is simply right or wrong: both have advantages and disadvantages.

For future releases we will consider offering a choice, i.e. whether the ADK should perform early or late type checking when starting an automation.