Dagster allows for code versioning and memoization of previous outputs based upon that versioning. Listed here are APIs related to versioning and memoization.
Abstract class for defining a strategy to version ops and resources.
When subclassing, get_op_version must be implemented, and get_resource_version can be optionally implemented.
get_op_version should accept an OpVersionContext, and get_resource_version should accept a ResourceVersionContext. From that context, each method should synthesize a unique string called a version, which will be tagged to the outputs of that op in the job. Providing a VersionStrategy instance to a job enables memoization on that job, so that only steps whose outputs do not have an up-to-date version will run.
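The memoization rule described above can be illustrated without Dagster itself. The following is a minimal sketch in plain Python, not Dagster's implementation: `run_memoized`, `steps`, and `stored_versions` are hypothetical names, and each step's version is supplied directly instead of being computed by a VersionStrategy.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in: each step is (name, version, compute function).
Step = Tuple[str, str, Callable[[], object]]


def run_memoized(steps: List[Step], stored_versions: Dict[str, str]) -> List[str]:
    """Run only the steps whose stored output version is missing or stale.

    Returns the names of the steps that actually executed.
    """
    executed = []
    for name, version, compute in steps:
        if stored_versions.get(name) == version:
            # An output tagged with this exact version already exists: skip.
            continue
        compute()
        stored_versions[name] = version  # tag the new output with its version
        executed.append(name)
    return executed


steps = [("load", "v1", lambda: 1), ("transform", "v1", lambda: 2)]
store: Dict[str, str] = {}

first = run_memoized(steps, store)   # nothing stored yet, so both steps run
second = run_memoized(steps, store)  # versions unchanged, so nothing runs

# Bump the version of one step, as a code change would.
steps[1] = ("transform", "v2", lambda: 3)
third = run_memoized(steps, store)   # only the changed step runs
```

The sketch treats each step independently; it does not model how a change in an upstream step affects downstream versions.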
VersionStrategy that checks for changes to the source code of ops and resources.
Only checks for changes within the immediate body of the op/resource’s decorated function (or compute function, if the op/resource was constructed directly from a definition).
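The "immediate body only" limitation can be seen in a small stand-alone sketch. This is not how Dagster computes its hash: as a stand-in for hashing source text, the hypothetical helper `body_version` hashes a function's compiled bytecode, referenced names, and constants. It shows that changing a helper function leaves the version of the function that calls it unchanged.

```python
import hashlib


def body_version(fn) -> str:
    """Hash only the compiled body of `fn` (hypothetical helper)."""
    code = fn.__code__
    payload = (
        code.co_code
        + repr(code.co_names).encode()
        + repr(code.co_consts).encode()
    )
    return hashlib.sha1(payload).hexdigest()


def helper(x):
    return x * 2


def my_op(x):
    return helper(x) + 1


before = body_version(my_op)


def helper(x):  # redefine the helper with different behavior
    return x * 3


# my_op's own body did not change, so its version is the same as before,
# even though calling my_op now produces different results.
after = body_version(my_op)


def other_op(x):
    return helper(x) + 2


# A function with a different body gets a different version.
different = body_version(other_op)
```

A strategy with this property will not trigger a re-run when only a called helper changes, so such changes need a manual version bump or a broader hashing scheme.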
Provides execution-time information for computing the version for an op.
op_def (OpDefinition) – The definition of the op to compute a version for.
op_config (Any) – The parsed config to be passed to the op during execution.
Provides execution-time information for computing the version for a resource.
resource_def (ResourceDefinition) – The definition of the resource whose version will be computed.
resource_config (Any) – The parsed config to be passed to the resource during execution.
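Because a version context carries both a definition and the parsed config, a version can take the config into account, so that a config change invalidates memoized outputs. The sketch below is plain Python under stated assumptions: `FakeOpVersionContext`, its field names, and `config_aware_version` are hypothetical stand-ins, not Dagster classes, and the config is assumed to be JSON-serializable.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeOpVersionContext:
    """Hypothetical stand-in for a version context."""

    op_name: str    # stands in for the op definition (only its name is used)
    op_config: Any  # stands in for the parsed config


def config_aware_version(context: FakeOpVersionContext, code_version: str) -> str:
    # sort_keys makes the result independent of dict key ordering.
    config_blob = json.dumps(context.op_config, sort_keys=True)
    raw = f"{context.op_name}:{code_version}:{config_blob}"
    return hashlib.sha256(raw.encode()).hexdigest()


v_a = config_aware_version(FakeOpVersionContext("my_op", {"a": 1, "b": 2}), "1")
v_b = config_aware_version(FakeOpVersionContext("my_op", {"b": 2, "a": 1}), "1")
v_c = config_aware_version(FakeOpVersionContext("my_op", {"a": 2, "b": 2}), "1")
# v_a and v_b match (same config, different key order); v_c differs.
```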
Base class for an IO manager enabled to work with memoized execution. Users should implement the load_input and handle_output methods described in the IOManager API, and the has_output method, which returns a boolean representing whether a data object can be found.
The user-defined method that returns whether data exists given the metadata.
Parameters: context (OutputContext) – The context of the step performing this check.
Returns: True if there is data present that matches the provided context. False otherwise.
Return type: bool
See also: dagster.IOManager.
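The relationship between the three methods can be sketched with an in-memory store. This is a plain-Python illustration, not a Dagster IO manager: `FakeOutputContext` and `InMemoryMemoizableStore` are hypothetical, the context fields (`step_key`, `name`, `version`) are assumptions made for the example, and `load_input` here takes the upstream output's context directly for simplicity.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class FakeOutputContext:
    """Hypothetical stand-in for an output context; fields are assumptions."""

    step_key: str
    name: str
    version: str


class InMemoryMemoizableStore:
    """Mimics the handle_output / load_input / has_output trio."""

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, str, str], Any] = {}

    @staticmethod
    def _key(context: FakeOutputContext) -> Tuple[str, str, str]:
        # Including the version in the key means stale data is never matched.
        return (context.step_key, context.name, context.version)

    def handle_output(self, context: FakeOutputContext, obj: Any) -> None:
        self._values[self._key(context)] = obj

    def load_input(self, context: FakeOutputContext) -> Any:
        return self._values[self._key(context)]

    def has_output(self, context: FakeOutputContext) -> bool:
        # True only if data for this exact step, output, and version exists.
        return self._key(context) in self._values


store = InMemoryMemoizableStore()
ctx_v1 = FakeOutputContext("my_op", "result", "v1")
ctx_v2 = FakeOutputContext("my_op", "result", "v2")

missing_before = store.has_output(ctx_v1)  # nothing stored yet
store.handle_output(ctx_v1, 42)
present_after = store.has_output(ctx_v1)   # data for v1 now exists
stale_check = store.has_output(ctx_v2)     # no data for the new version
```

Keying storage on the version is one design choice that lets has_output answer the "is an up-to-date output present" question with a simple lookup.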
Provide this tag to a run to toggle memoization on or off. {MEMOIZED_RUN_TAG: "true"} toggles memoization on, while {MEMOIZED_RUN_TAG: "false"} toggles memoization off.