Changelog History
-
v0.10.1 Changes
Community Contributions
- Reduced image size of k8s-example by 25% (104 MB) (thanks @alex-treebeard and @mrdavidlaing!)
- [dagster-snowflake] snowflake_resource can now be configured to use the SQLAlchemy connector (thanks @basilvetas!)
New
- When setting userDeployments.deployments in the Helm chart, replicaCount now defaults to 1 if not specified.
Bugfixes
- Fixed an issue where the Dagster daemon process couldn't launch runs in repository locations containing more than one repository.
- Fixed an issue where the Helm chart was not correctly templating env, envConfigMaps, and envSecrets.
Documentation
- Added new troubleshooting guide for problems encountered while using the QueuedRunCoordinator to limit run concurrency.
- Added documentation for the sensor command-line interface.
-
v0.10.0 Changes
Major Changes
- A native scheduler with support for exactly-once, fault-tolerant, timezone-aware scheduling. A new Dagster daemon process has been added to manage your schedules and sensors with a reconciliation loop, ensuring that all runs are executed exactly once, even if the Dagster daemon experiences occasional failure. See the Migration Guide for instructions on moving from SystemCronScheduler or K8sScheduler to the new scheduler.
- First-class sensors, built on the new Dagster daemon, allow you to instigate runs based on changes in external state - for example, files on S3 or assets materialized by other Dagster pipelines. See the Sensors Overview for more information.
- Dagster now supports pipeline run queueing. You can apply instance-level run concurrency limits and prioritization rules by adding the QueuedRunCoordinator to your Dagster instance. See the Run Concurrency Overview for more information.
- The IOManager abstraction provides a new, streamlined primitive for granular control over where and how solid outputs are stored and loaded. This is intended to replace the (deprecated) intermediate/system storage abstractions. See the IO Manager Overview for more information.
- A new Partitions page in Dagit lets you view your pipeline runs organized by partition. You can also launch backfills from Dagit and monitor them from this page.
- A new Instance Status page in Dagit lets you monitor the health of your Dagster instance, with repository location information, daemon statuses, instance-level schedule and sensor information, and linkable instance configuration.
- Resources can now declare their dependencies on other resources via the required_resource_keys parameter on @resource.
- Our support for deploying on Kubernetes is now mature and battle-tested. Our Helm chart is now easier to configure and deploy, and we've made big investments in observability and reliability. You can view Kubernetes interactions in the structured event log and use Dagit to help you understand what's happening in your deployment. The defaults in the Helm chart will give you graceful degradation and failure recovery right out of the box.
- Experimental support for dynamic orchestration with the new DynamicOutputDefinition API. Dagster can now map the downstream dependencies over a dynamic output at runtime.
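To make the exactly-once claim for the new scheduler concrete, the following is a minimal sketch of a reconciliation pass in plain Python. It is an illustration only and not Dagster's implementation: the function names, the (schedule name, tick time) key, and the dict standing in for durable run storage are all assumptions made for this example.

```python
from datetime import datetime, timedelta


def due_ticks(last_tick, now, interval):
    """Return every scheduled time after last_tick, up to and including now."""
    ticks = []
    next_tick = last_tick + interval
    while next_tick <= now:
        ticks.append(next_tick)
        next_tick += interval
    return ticks


def reconcile(schedule_name, last_tick, now, interval, launched_runs):
    """Launch one run per due tick, skipping ticks that already have a run.

    launched_runs is a dict keyed by (schedule_name, tick). It stands in for
    durable storage (an assumption of this sketch), so repeating a pass after
    a crash does not launch a second run for the same tick.
    """
    newly_launched = []
    for tick in due_ticks(last_tick, now, interval):
        key = (schedule_name, tick)
        if key not in launched_runs:
            launched_runs[key] = "run for {} at {}".format(schedule_name, tick.isoformat())
            newly_launched.append(tick)
    return newly_launched


start = datetime(2021, 1, 1, 0, 0)
hour = timedelta(hours=1)
storage = {}

first = reconcile("hourly", start, start + 3 * hour, hour, storage)
# Simulate a daemon restart that replays the same window, then moves forward.
replay = reconcile("hourly", start, start + 3 * hour, hour, storage)
later = reconcile("hourly", start, start + 4 * hour, hour, storage)
```

Because each tick is identified by a stable key, the replayed pass launches nothing and the later pass launches only the one new tick.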
Breaking Changes
Dropping Python 2 support
- We've dropped support for Python 2.7, based on community usage and enthusiasm for Python 3-native public APIs.
Removal of deprecated APIs
These APIs were marked for deprecation with warnings in the 0.9.0 release, and have been removed in the 0.10.0 release.
- The decorator input_hydration_config has been removed. Use the dagster_type_loader decorator instead.
- The decorator output_materialization_config has been removed. Use dagster_type_materializer instead.
- The system storage subsystem has been removed. This includes SystemStorageDefinition, @system_storage, and default_system_storage_defs. Use the new IOManagers API instead. See the IO Manager Overview for more information.
- The config_field argument on decorators and definitions classes has been removed and replaced with config_schema. This is a drop-in rename.
- The argument step_keys_to_execute to the functions reexecute_pipeline and reexecute_pipeline_iterator has been removed. Use the step_selection argument to select subsets for execution instead.
- Repositories can no longer be loaded using the legacy repository key in your workspace.yaml; use load_from instead. See the Workspaces Overview for documentation about how to define a workspace.
Breaking API Changes
- SolidExecutionResult.compute_output_event_dict has been renamed to SolidExecutionResult.compute_output_events_dict. A solid execution result is returned from methods such as result_for_solid. Any call sites will need to be updated.
- The .compute suffix is no longer applied to step keys. Step keys that were previously named my_solid.compute will now be named my_solid. If you are using any API method that takes a step_selection argument, you will need to update the step keys accordingly.
- The pipeline_def property has been removed from the InitResourceContext passed to functions decorated with @resource.
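If you keep lists of step keys in scripts or config, the removal of the .compute suffix can be handled mechanically. The helper below is a small sketch using only the standard library; the function names are hypothetical and are not part of Dagster.

```python
COMPUTE_SUFFIX = ".compute"


def migrate_step_key(step_key):
    """Drop a trailing '.compute' from a pre-0.10.0 step key, if present."""
    if step_key.endswith(COMPUTE_SUFFIX):
        return step_key[: -len(COMPUTE_SUFFIX)]
    return step_key


def migrate_step_selection(step_keys):
    """Apply migrate_step_key to every key in a step selection list."""
    return [migrate_step_key(key) for key in step_keys]


old_selection = ["my_solid.compute", "other_solid.compute", "already_new"]
new_selection = migrate_step_selection(old_selection)
print(new_selection)
```

Keys that already lack the suffix pass through unchanged, so the helper can be run more than once on the same list.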
Dagstermill
- If you are using define_dagstermill_solid with the output_notebook parameter set to True, you will now need to provide a file manager resource (subclass of dagster.core.storage.FileManager) on your pipeline mode under the resource key "file_manager", e.g.:
    from dagster import ModeDefinition, local_file_manager, pipeline
    from dagstermill import define_dagstermill_solid

    my_dagstermill_solid = define_dagstermill_solid("my_dagstermill_solid", output_notebook=True, ...)

    @pipeline(mode_defs=[ModeDefinition(resource_defs={"file_manager": local_file_manager})])
    def my_dagstermill_pipeline():
        my_dagstermill_solid(...)
Helm Chart
- The schema for the scheduler values in the helm chart has changed. Instead of a simple toggle on/off, we now require an explicit scheduler.type to specify usage of the DagsterDaemonScheduler, K8sScheduler, or otherwise. If your specified scheduler.type has required config, these fields must be specified under scheduler.config.
- snake_case fields have been changed to camelCase. Please update your values.yaml as follows:
  - pipeline_run → pipelineRun
  - dagster_home → dagsterHome
  - env_secrets → envSecrets
  - env_config_maps → envConfigMaps
- The Helm values celery and k8sRunLauncher have now been consolidated under the Helm value runLauncher for simplicity. Use the field runLauncher.type to specify usage of the K8sRunLauncher, CeleryK8sRunLauncher, or otherwise. By default, the K8sRunLauncher is enabled.
- All Celery message brokers (i.e. RabbitMQ and Redis) are disabled by default. If you are using the CeleryK8sRunLauncher, you should explicitly enable your message broker of choice.
- userDeployments are now enabled by default.
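The four snake_case to camelCase renames listed above can be applied to an already-parsed values file with a short recursive helper. This is a sketch under stated assumptions: it works on a plain dict (for example, one loaded from values.yaml by a YAML parser of your choice), it only renames the four keys named in these notes, and the sample structure below is hypothetical rather than the chart's actual layout.

```python
# Only the renames listed in the 0.10.0 notes; no other keys are touched.
RENAMES = {
    "pipeline_run": "pipelineRun",
    "dagster_home": "dagsterHome",
    "env_secrets": "envSecrets",
    "env_config_maps": "envConfigMaps",
}


def rename_keys(node):
    """Return a copy of node with the listed keys renamed at every depth."""
    if isinstance(node, dict):
        return {RENAMES.get(key, key): rename_keys(value) for key, value in node.items()}
    if isinstance(node, list):
        return [rename_keys(item) for item in node]
    return node


# Hypothetical values structure, used only to exercise the helper.
old_values = {
    "dagster_home": "/opt/dagster/dagster_home",
    "dagit": {"env_secrets": [{"name": "my-secret"}], "env_config_maps": []},
}
new_values = rename_keys(old_values)
```

The helper returns a new structure and leaves the input untouched, which makes it easy to compare the two before writing anything back to disk.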
Core
- Event log messages streamed to stdout and stderr have been streamlined to be a single line per event.
- Experimental support for memoization and versioning lets you execute pipelines incrementally, selecting which solids need to be rerun based on runtime criteria and versioning their outputs with configurable identifiers that capture their upstream dependencies.
  To set up memoized step selection, users can provide a MemoizableIOManager, whose has_output function decides whether a given solid output needs to be computed or already exists. To execute a pipeline with memoized step selection, users can supply the dagster/is_memoized_run run tag to execute_pipeline.
  To set the version on a solid or resource, users can supply the version field on the definition. To access the derived version for a step output, users can access the version field on the OutputContext passed to the handle_output and load_input methods of IOManager and the has_output method of MemoizableIOManager.
- Schedules that are executed using the new DagsterDaemonScheduler can now execute in any timezone by adding an execution_timezone parameter to the schedule. Daylight Savings Time transitions are also supported. See the Schedules Overview for more information and examples.
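The idea behind memoized step selection described above can be illustrated without Dagster itself. The sketch below is a toy model in plain Python: InMemoryMemoStore and steps_to_execute are hypothetical names, the store only mimics the has_output / handle_output roles, and versions are set by hand rather than derived from upstream dependencies as the notes describe.

```python
class InMemoryMemoStore:
    """Toy stand-in for a memoizable IO manager (not the Dagster class)."""

    def __init__(self):
        self._values = {}

    def has_output(self, step_key, version):
        # An output counts as present only for the exact version requested.
        return (step_key, version) in self._values

    def handle_output(self, step_key, version, value):
        self._values[(step_key, version)] = value


def steps_to_execute(step_versions, store):
    """Select the steps whose current version has no stored output yet."""
    return [key for key, version in step_versions.items() if not store.has_output(key, version)]


store = InMemoryMemoStore()
versions = {"load": "v1", "transform": "v1"}

first_plan = steps_to_execute(versions, store)
for key in first_plan:
    store.handle_output(key, versions[key], "result of " + key)

# Bumping one version invalidates only that step's stored output.
versions["transform"] = "v2"
second_plan = steps_to_execute(versions, store)
```

On the first pass nothing is stored, so both steps are selected; after the version bump only the changed step is selected again.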
Dagit
- Countdown and refresh buttons have been added for pages with regular polling queries (e.g. Runs, Schedules).
- Confirmation and progress dialogs are now presented when performing run terminations and deletions. Additionally, hanging/orphaned runs can now be forced to terminate by selecting "Force termination immediately" in the run termination dialog.
- The Runs page now shows counts for "Queued" and "In progress" tabs, and individual run pages show timing, tags, and configuration metadata.
- The backfill experience has been improved with means to view progress and terminate the entire backfill via the partition set page. Additionally, errors related to backfills are now surfaced more clearly.
- Shortcut hints are no longer displayed when attempting to use the screen capture command.
- The asset page has been revamped to include a table of events and enable organizing events by partition. Asset key escaping issues in other views have been fixed as well.
- Miscellaneous bug fixes, frontend performance tweaks, and other improvements are also included.
Kubernetes/Helm
- The Dagster Kubernetes documentation has been refreshed.
Helm
- We've added schema validation to our Helm chart. You can now check that your values YAML file is correct by running:
helm lint helm/dagster -f helm/dagster/values.yaml
- Added support for resource annotations throughout our Helm chart.
- Added Helm deployment of the dagster daemon & daemon scheduler.
- Added Helm support for configuring a compute log manager in your dagster instance.
- User code deployments now include a user ConfigMap by default.
- Changed the default liveness probe for Dagit to use httpGet "/dagit_info" instead of tcpSocket:80
Dagster-K8s [Kubernetes]
- Added support for user code deployments on Kubernetes.
- Added support for tagging pipeline executions.
- Fixes to support version 12.0.0 of the Python Kubernetes client.
- Improved implementation of Kubernetes+Dagster retries.
- Many logging improvements to surface debugging information and failures in the structured event log.
Dagster-Celery-K8s
- Improved interrupt/termination handling in Celery workers.
Integrations & Libraries
- Added a new dagster-docker library with a DockerRunLauncher that launches each run in its own Docker container. (See Deploying with Docker docs for an example.)
- Added support for AWS Athena. (Thanks @jmsanders!)
- Added mocks for AWS S3, Athena, and Cloudwatch in tests. (Thanks @jmsanders!)
- Allow setting of S3 endpoint through env variables. (Thanks @marksteve!)
- Various bug fixes and new features for the Azure, Databricks, and Dask integrations.
- Added a create_databricks_job_solid for creating solids that launch Databricks jobs.
-
v0.9.22 Changes
New
- When using a solid selection in the Dagit Playground, non-matching solids are hidden in the RunPreview panel.
- The CLI command dagster pipeline launch now accepts --run-id
Bugfixes
- [Helm/K8s] Fixed whitespacing bug in ingress.yaml Helm template.
-
v0.9.22.post0 Changes
Bugfixes
- [Dask] Pin dask[dataframe] to <=2.30.0 and distributed to <=2.30.1
-
v0.9.21 Changes
Community Contributions
- Fixed helm chart to only add flower to the K8s ingress when enabled (thanks @PenguinToast!)
- Updated helm chart to use more lenient timeouts for liveness probes on user code deployments (thanks @PenguinToast!)
Bugfixes
- [Helm/K8s] Due to Flower being incompatible with Celery 5.0, the Helm chart for Dagster now uses a specific image mher/flower:0.9.5 for the Flower pod.
-
v0.9.20 Changes
New
- [Dagit] Show recent runs on individual schedule pages
- [Dagit] It's no longer required to run dagster schedule up or press the Reconcile button before turning on a new schedule for the first time
- [Dagit] Various improvements to the asset view. Expanded the Last Materialization Event view. Expansions to the materializations over time view, allowing for both a list view and a graphical view of materialization data.
Community Contributions
- Updated many dagster-aws tests to use mocked resources instead of depending on real cloud resources, making it possible to run these tests locally. (thanks @jmsanders!)
Bugfixes
- Fixed an issue with retries in step launchers
- [Dagit] Bugfixes and improvements
- Fixed an issue where dagit sometimes left hanging processes behind after exiting
Experimental
- [K8s] The dagster daemon is now optionally deployed by the helm chart. This enables run-level queuing with the QueuedRunCoordinator.
-
v0.9.19 Changes
New
- Improved error handling when the intermediate storage stores and retrieves objects.
- New URL scheme in Dagit, with repository details included on all paths for pipelines, solids, and schedules
- Relaxed constraints for the AssetKey constructor, to enable arbitrary strings as part of the key path.
- When executing a subset of a pipeline, configuration that does not apply to the current subset but would be valid in the original pipeline is now allowed and ignored.
- GCSComputeLogManager was added, allowing for compute logs to be persisted to Google cloud storage
- The step-partition matrix in Dagit now auto-reloads runs
Bugfixes
- Dagit bugfixes and improvements
- When specifying a namespace during helm install, the same namespace will now be used by the K8sScheduler or K8sRunLauncher, unless overridden.
- @pipeline decorated functions with -> None typing no longer cause unexpected problems.
- Fixed an issue where compute logs might not always be complete on Windows.
-
v0.9.18 Changes
Breaking Changes
- CliApiRunLauncher and GrpcRunLauncher have been combined into DefaultRunLauncher. If you had one of these run launchers in your dagster.yaml, replace it with DefaultRunLauncher or remove the run_launcher: section entirely.
New
- Added a type loader for typed dictionaries: can now load typed dictionaries from config.
Bugfixes
- Dagit bugfixes and improvements
- Added error handling for repository errors on startup and reload
- Repaired timezone offsets
- Fixed pipeline explorer state for empty pipelines
- Fixed Scheduler table
- User-defined k8s config in the pipeline run tags (with key dagster-k8s/config) will now be passed to the k8s jobs when using the dagster-k8s and dagster-celery-k8s run launchers. Previously, only user-defined k8s config in the pipeline definition's tag was passed down.
Experimental
- Run queuing: the new QueuedRunCoordinator enables limiting the number of concurrent runs. The DefaultRunCoordinator launches jobs directly from Dagit, preserving existing behavior.
-
v0.9.17 Changes
New
- [dagster-dask] Allow connecting to an existing scheduler via its address
- [dagster-aws] Importing dagster_aws.emr no longer transitively imports dagster_spark
- [dagster-dbt] CLI solids now emit materializations
Community contributions
- Docs fix (Thanks @kaplanbora!)
Bug fixes
- PipelineDefinitions that do not meet the resource requirements of their types will now fail at definition time
- Dagit bugfixes and improvements
- Fixed an issue where a run could be left hanging if there was a failure during launch
Deprecated
- We now warn if you return anything from a function decorated with @pipeline. This return value actually had no impact at all and was ignored, but we are making changes that will use that value in the future. By changing your code to not return anything now you will avoid any breaking changes with zero user-visible impact.
-
v0.9.16 Changes
Breaking Changes
- Removed DagsterKubernetesPodOperator in dagster-airflow.
- Removed the execute_plan mutation from dagster-graphql.
- ModeDefinition, PartitionSetDefinition, PresetDefinition, @repository, @pipeline, and ScheduleDefinition names must pass the regular expression r"^[A-Za-z0-9_]+$" and not be python keywords or disallowed names. See DISALLOWED_NAMES in dagster.core.definitions.utils for an exhaustive list of illegal names.
- dagster-slack is now upgraded to use slackclient 2.x - this means that this resource will only support Python 3.6 and above.
- [K8s] Added a health check to the helm chart for user deployments, which relies on a new dagster api grpc-health-check cli command present in Dagster 0.9.16 and later.
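The naming rule above can be checked ahead of an upgrade with a few lines of standard-library Python. This is a sketch rather than Dagster's own validation code: is_valid_definition_name is a hypothetical helper, and the contents of DISALLOWED_NAMES are not reproduced here, so the caller passes in whatever set of disallowed names applies.

```python
import keyword
import re

# The regular expression quoted in the 0.9.16 notes.
VALID_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def is_valid_definition_name(name, disallowed_names=frozenset()):
    """Check a name against the regex, Python keywords, and a caller-supplied set.

    disallowed_names is a placeholder for Dagster's DISALLOWED_NAMES, whose
    contents are not assumed in this sketch.
    """
    return (
        VALID_NAME.match(name) is not None
        and not keyword.iskeyword(name)
        and name not in disallowed_names
    )


checks = {
    name: is_valid_definition_name(name, disallowed_names={"reserved_example"})
    for name in ["my_pipeline", "my-pipeline", "class", "", "reserved_example"]
}
```

Only the first name passes: the others fail on the hyphen, the keyword check, the empty string, and the placeholder disallowed set, respectively.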
New
- Add helm chart configurations to allow users to configure a K8sRunLauncher, in place of the CeleryK8sRunLauncher.
- "Copy URL" button to preserve filter state on Run page in dagit
Community Contributions
- Dagster CLI options can now be passed in via environment variables (Thanks @xinbinhuang!)
- New --limit flag on the dagster run list command (Thanks @haydarai!)
Bugfixes
- Addressed performance issues loading the /assets table in dagit. Requires a data migration to create a secondary index by running dagster instance reindex.
- Dagit bugfixes and improvements