databricks-cli

Commit Graph

Author	SHA1	Message	Date
Andrew Nester	56ed9bebf3	Added support for creating all-purpose clusters (#1698 ) ## Changes Added support for creating all-purpose clusters Example of configuration ``` bundle: name: clusters resources: clusters: test_cluster: cluster_name: "Test Cluster" num_workers: 2 node_type_id: "i3.xlarge" autoscale: min_workers: 2 max_workers: 7 spark_version: "13.3.x-scala2.12" spark_conf: "spark.executor.memory": "2g" jobs: test_job: name: "Test Job" tasks: - task_key: test_task existing_cluster_id: ${resources.clusters.test_cluster.id} notebook_task: notebook_path: "./src/test.py" targets: development: mode: development compute_id: ${resources.clusters.test_cluster.id} ``` ## Tests Added unit, config and E2E tests	2024-09-23 10:42:34 +00:00
Andrew Nester	bcab6ca37b	Fixed detecting full syntax variable override which includes type field (#1775 ) ## Changes Fixes #1773 ## Tests Confirmed manually	2024-09-18 10:23:07 +00:00
Pieter Noordhuis	b451905b6e	Expand library globs relative to the sync root (#1756 ) ## Changes Library glob expansion happens during deployment. Before that, all entries that refer to local paths in resource definitions are made relative to the _sync root_. Before #1694, they were made relative to the _bundle root_. This PR didn't update the library glob expansion code to use the sync root path. If you were using the sync paths setting with library globs, the CLI would fail to expand the globs because the code was using the wrong path to anchor those globs. This change fixes the issue. ## Tests Manually confirmed that this fixes the issue reported in #1755.	2024-09-09 09:56:16 +00:00
Andrew Nester	02e83877f4	Added listing cluster filtering for cluster lookups (#1754 ) ## Changes We added a custom resolver for the cluster to add filtering for the cluster source when we list all clusters. Without this filtering, listing could take a very long time (5-10 mins), which leads to lookup timeouts. ## Tests Existing unit tests passing	2024-09-06 11:34:57 +00:00
Andrew Nester	72030844c5	Fixed variable override in target with full variable syntax (#1749 ) ## Changes This PR makes sure that both of these override syntaxes for variables work correctly ``` targets: dev: variables: cluster1: spark_version: "14.2.x-scala2.11" node_type_id: "Standard_DS3_v2" num_workers: 4 spark_conf: spark.speculation: false spark.databricks.delta.retentionDurationCheck.enabled: false cluster2: default: spark_version: "14.2.x-scala2.11" node_type_id: "Standard_DS3_v2" num_workers: 4 spark_conf: spark.speculation: false spark.databricks.delta.retentionDurationCheck.enabled: false ``` ## Tests Added regression test --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-09-04 17:16:40 +00:00
Andrew Nester	ca6332a5a4	Fixed complex variables are not being correctly merged from include files (#1746 ) ## Changes Fixes an `Error: no value assigned to required variable <variable>.` when the main complex variable definition is defined in one file but target override is defined in separate file which is included in the main one. ## Tests Added regression test	2024-09-04 11:24:55 +00:00
Pieter Noordhuis	6e8cd835a3	Add paths field to bundle sync configuration (#1694 ) ## Changes This field allows a user to configure paths to synchronize to the workspace. Allowed values are relative paths to files and directories anchored at the directory where the field is set. If one or more values traverse up the directory tree (to an ancestor of the bundle root directory), the CLI will dynamically determine the root path to use to ensure that the file tree structure remains intact. For example, given a `databricks.yml` in `my_bundle` that includes: ```yaml sync: paths: - ../common - . ``` Then upon synchronization, the workspace will look like: ``` . ├── common │ └── lib.py └── my_bundle ├── databricks.yml └── notebook.py ``` If not set behavior remains identical. ## Tests * Newly added unit tests for the mutators and under `bundle/tests`. * Manually confirmed a bundle without this configuration works the same. * Manually confirmed a bundle with this configuration works.	2024-08-21 15:33:25 +00:00
Pieter Noordhuis	af5048e73e	Share test initializer in common helper function (#1695 ) ## Changes These tests inadvertently re-ran mutators, the first time through `loadTarget` and the second time by running `phases.Initialize()` themselves. Some of the mutators that are executed in `phases.Initialize()` are also run as part of `loadTarget`. This is overdue a refactor to make it unambiguous what runs when. Until then, this removes the duplicated execution. ## Tests Unit tests pass.	2024-08-20 12:54:56 +00:00
shreyas-goenka	242d4b51ed	Report all empty resources present in error diagnostic (#1685 ) ## Changes This PR addressed post-merge feedback from https://github.com/databricks/cli/pull/1673. ## Tests Unit tests, and manually. ``` Error: experiment undefined-experiment is not defined at resources.experiments.undefined-experiment in databricks.yml:11:26 Error: job undefined-job is not defined at resources.jobs.undefined-job in databricks.yml:6:19 Error: pipeline undefined-pipeline is not defined at resources.pipelines.undefined-pipeline in databricks.yml:14:24 Name: undefined-job Target: default Found 3 errors ```	2024-08-20 00:22:00 +00:00
Lennart Kats (databricks)	78d0ac5c6a	Add configurable presets for name prefixes, tags, etc. (#1490 ) ## Changes This adds configurable transformations based on the transformations currently seen in `mode: development`. Example databricks.yml showcasing some of these transformations: ``` bundle: name: my_bundle targets: dev: presets: prefix: "myprefix_" # prefix all resource names with myprefix_ pipelines_development: true # set development to true by default for pipelines trigger_pause_status: PAUSED # set pause_status to PAUSED by default for all triggers and schedules jobs_max_concurrent_runs: 10 # set max_concurrent runs to 10 by default for all jobs tags: dev: true ``` ## Tests * Existing process_target_mode tests that were adapted to use this new code * Unit tests specific for the new mutator * Unit tests for config loading and merging * Manual e2e testing	2024-08-19 18:18:50 +00:00
Andrew Nester	48ff18e5fc	Upload local libraries even if they don't have artifact defined (#1664 ) ## Changes Previously, for all libraries referenced in the configuration, DABs made sure that there was a corresponding artifact section. This is neither necessary nor flexible, because local libraries might be built outside of the DABs context. It also created difficult-to-follow logic in the code, where we back-referenced libraries to artifacts. This PR does 3 things: 1. Allows all local libraries referenced in DABs config to be uploaded to remote 2. Simplifies the upload and glob reference expansion logic by doing this in a single place 3. Speeds things up by uploading each library only once and doing this in parallel ## Tests Added unit + integration tests + made sure that change is backward compatible (no changes in existing tests) --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-08-14 09:03:44 +00:00
shreyas-goenka	7ae80de351	Stop tracking file path locations in bundle resources (#1673 ) ## Changes Since locations are already tracked in the dynamic value tree, we no longer need to track it at the resource/artifact level. This PR: 1. Removes use of `paths.Paths`. Uses dyn.Location instead. 2. Refactors the validation of resources not being empty valued to be generic across all resource types. ## Tests Existing unit tests.	2024-08-13 12:50:15 +00:00
Pieter Noordhuis	f3ffded3bf	Merge job parameters based on their name (#1659 ) ## Changes This change enables overriding the default value of job parameters in target overrides. This is the same approach we already take for job clusters and job tasks. Closes #1620. ## Tests Mutator unit tests and lightweight end-to-end tests.	2024-08-06 16:12:18 +00:00
Andrew Nester	d26f3f4863	Fixed incorrectly cleaning up python wheel dist folder (#1656 ) ## Changes In https://github.com/databricks/cli/pull/1618 we introduced a prepare step in which the Python wheel folder was cleaned. As a result, it was cleaned every time instead of only when there is a build command, as it used to work. This PR fixes it by only cleaning up the dist folder when there is a build command for wheels. Fixes #1638 ## Tests Added regression test	2024-08-06 09:54:58 +00:00
Andrew Nester	809c67b675	Expand and upload local wheel libraries for all task types (#1649 ) ## Changes Fixes #1553 ## Tests Added regression test	2024-08-05 14:44:23 +00:00
Andrew Nester	1fb8e324d5	Added test for negation pattern in sync include exclude section (#1637 ) ## Changes Added test for negation pattern in sync include exclude section	2024-07-31 13:42:23 +00:00
shreyas-goenka	a52b188e99	Use dynamic walking to validate unique resource keys (#1614 ) ## Changes This PR: 1. Uses dynamic walking (via the `dyn.MapByPattern` func) to validate no two resources have the same resource key. This allows us to remove this validation at merge time. 2. Modifies `dyn.Mapping` to always return a sorted slice of pairs. This makes traversal functions like `dyn.Walk` or `dyn.MapByPattern` deterministic. ## Tests Unit tests. Also manually.	2024-07-29 13:04:02 +00:00
shreyas-goenka	37b9df96e6	Support multiple paths for diagnostics (#1616 ) ## Changes Some diagnostics can have multiple paths associated with them. For instance, ensuring that unique resource keys are used across all resources. This PR extends `diag.Diagnostic` to accept multiple paths. This PR is symmetrical to https://github.com/databricks/cli/pull/1610/files ## Tests Unit tests	2024-07-25 15:16:27 +00:00
Andrew Nester	39fc86e83b	Split artifact cleanup into prepare step before build (#1618 ) ## Changes The prepare stage, which does cleanup, is now executed once before every build, so artifacts built into the same folder are correctly kept. Fixes workaround 2 from issue #1602 ## Tests Added unit test	2024-07-24 09:13:49 +00:00
shreyas-goenka	4bf88b4209	Support multiple locations for diagnostics (#1610 ) ## Changes This PR changes `diag.Diagnostics` to allow including multiple locations associated with the diagnostic message. The diagnostics that now return multiple locations with this PR are: 1. Warning for unknown keys in config. 2. Use of experimental.run_as 3. Accidental sync.exclude entries that exclude all files. ## Tests Existing unit tests pass. New unit test case to assert on error message when multiple locations are included. Example output: ``` ➜ bundle-playground-2 ~/cli2/cli/cli bundle validate Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC. at experimental.use_legacy_run_as in resources.yml:10:22 databricks.yml:13:22 Name: fix run_if Target: default Workspace: User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/fix run_if/default Found 1 warning ```	2024-07-23 17:20:11 +00:00
Andrew Nester	040b374430	Override complex variables with target overrides instead of merging (#1567 ) ## Changes At the moment we merge the values of complex variables, whereas the more expected behaviour is to override the value with the target one. ## Tests Added unit test	2024-07-04 11:57:29 +00:00
Andrew Nester	3d2f7622bc	Fixed bundle not loading when empty variable is defined (#1552 ) ## Changes Fixes #1544 ## Tests Added regression test	2024-07-02 12:40:39 +00:00
Pieter Noordhuis	a0df54ac41	Add extra tests for the sync block (#1548 ) ## Changes Issue #1545 describes how a nil entry in the sync block caused an error. The fix for this issue is in #1547. This change adds end-to-end test coverage. ## Tests New test passes on top of #1547.	2024-07-01 13:08:50 +00:00
Gleb Kanterov	e8b76a7f13	Improve `bundle validate` output (#1532 ) ## Changes This combination of changes allows pretty-printing errors happening during the "load" and "init" phases, including their locations. Move to render code into a separate module dedicated to rendering `diag.Diagnostics` in a human-readable format. This will be used for the `bundle deploy` command. Preserve the "bundle" value if an error occurs in mutators. Rewrite the go templates to handle the case when the bundle isn't yet loaded if an error occurs during loading, that is possible now. Improve rendering for errors and warnings: - don't render empty locations - render "details" for errors if they exist Add `root.ErrAlreadyPrinted` indicating that the error was already printed, and the CLI entry point shouldn't print it again. ## Tests Add tests for output, that are especially handy to detect extra newlines	2024-07-01 09:01:10 +00:00
shreyas-goenka	4d8eba04cd	Compare `.Kind()` instead of direct equality checks on a `dyn.Value` (#1520 ) ## Changes This PR makes two changes: 1. In https://github.com/databricks/cli/pull/1510 we'll be adding multiple associated location metadata with a dyn.Value. The Go compiler does not allow comparing structs if they contain slice values (presumably due to multiple possible definitions for equality). In anticipation for adding a `[]dyn.Location` type field to `dyn.Value` this PR removes all direct comparisons of `dyn.Value` and instead relies on the kind. 2. Retain location metadata for values in convert.FromTyped. The change diff is exactly the same as https://github.com/databricks/cli/pull/1523. It's been combined with this PR because they both depend on each other to prevent test failures (forming a test failure deadlock). Go patch used: ``` @@ var x expression @@ -x == dyn.InvalidValue +x.Kind() == dyn.KindInvalid @@ var x expression @@ -x != dyn.InvalidValue +x.Kind() != dyn.KindInvalid @@ var x expression @@ -x == dyn.NilValue +x.Kind() == dyn.KindNil @@ var x expression @@ -x != dyn.NilValue +x.Kind() != dyn.KindNil ``` ## Tests Unit tests and integration tests pass.	2024-06-27 13:28:19 +00:00
Andrew Nester	5f42791609	Added support for complex variables (#1467 ) ## Changes Added support for complex variables Now it's possible to add and use complex variables as shown below ``` bundle: name: complex-variables resources: jobs: my_job: job_clusters: - job_cluster_key: key new_cluster: ${var.cluster} tasks: - task_key: test job_cluster_key: key variables: cluster: description: "A cluster definition" type: complex default: spark_version: "13.2.x-scala2.11" node_type_id: "Standard_DS3_v2" num_workers: 2 spark_conf: spark.speculation: true spark.databricks.delta.retentionDurationCheck.enabled: false ``` Fixes #1298 - [x] Support for complex variables - [x] Allow variable overrides (with shortcut) in targets - [x] Don't allow to provide complex variables via flag or env variable - [x] Fail validation if complex value is used but not `type: complex` provided - [x] Support using variables inside complex variables ## Tests Added unit tests --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-06-26 10:25:32 +00:00
Lennart Kats (databricks)	deb3e365cd	Pause quality monitors when "mode: development" is used (#1481 ) ## Changes Similar to scheduled jobs, quality monitors should be paused when in development mode (in line with the [behavior for scheduled jobs](https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html)). @aravind-segu @arpitjasa-db please take a look and verify this behavior. - [x] Followup: documentation changes. If we make this change we should update https://docs.databricks.com/dev-tools/bundles/deployment-modes.html. ## Tests Unit tests	2024-06-19 13:54:35 +00:00
Andrew Nester	663aa9ab8c	Override variables with lookup value even if values has default value set (#1504 ) ## Changes This PR fixes the behaviour when variables were not overridden with lookup value from targets if these variables had any default value set in the default target. Fixes #1449 ## Tests Added regression test	2024-06-19 08:03:06 +00:00
Aravind Segu	a33d0c8bf9	Add support for Lakehouse monitoring in bundles (#1307 ) ## Changes This change adds support for Lakehouse monitoring in bundles. The associated resource type name is "quality monitor". ## Testing Unit tests. --------- Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com> Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>	2024-05-31 09:42:25 +00:00
Andrew Nester	3f8036f2df	Fixed seg fault when specifying environment key for tasks (#1443 ) ## Changes Fixed seg fault when specifying environment key for tasks	2024-05-21 10:00:04 +00:00
Andrew Nester	a014d50a6a	Fixed panic when loading incorrectly defined jobs (#1402 ) ## Changes If only key was defined for a job in YAML config, validate previously failed with segfault. This PR validates that jobs are correctly defined and returns an error if not. ## Tests Added regression test	2024-05-17 10:10:17 +00:00
Miles Yucht	f7d4b272f4	Improve token refresh flow (#1434 ) ## Changes Currently, there are a number of issues with the non-happy-path flows for token refresh in the CLI. If the token refresh fails, the raw error message is presented to the user, as seen below. This message is very difficult for users to interpret and doesn't give any clear direction on how to resolve this issue. ``` Error: token refresh: Post "https://adb-<WSID>.azuredatabricks.net/oidc/v1/token": http 400: {"error":"invalid_request","error_description":"Refresh token is invalid"} ``` When logging in again, I've noticed that the timeout for logging in is very short, only 45 seconds. If a user is using a password manager and needs to log in to that first, or needs to do MFA, 45 seconds may not be enough time. Additionally, when logging in to an account-level profile, it is quite frustrating for users to need to re-enter account ID information when that information is already stored in the user's `.databrickscfg` file. This PR tackles these two issues. First, the presentation of error messages from `databricks auth token` is improved substantially by converting the `error` into a human-readable message. When the refresh token is invalid, it will present a command for the user to run to reauthenticate. If the token fetching failed for some other reason, that reason will be presented in a nice way, providing front-line debugging steps and ultimately redirecting users to file a ticket at this repo if they can't resolve the issue themselves. After this PR, the new error message is: ``` Error: a new access token could not be retrieved because the refresh token is invalid. To reauthenticate, run `.databricks/databricks auth login --host https://adb-<WSID>.azuredatabricks.net` ``` To improve the login flow, this PR modifies `databricks auth login` to auto-complete the account ID from the profile when present. Additionally, it increases the login timeout from 45 seconds to 1 hour to give the user sufficient time to log in as needed.
To test this change, I needed to refactor some components of the CLI around profile management, the token cache, and the API client used to fetch OAuth tokens. These are now settable in the context, and a demonstration of how they can be set and used is found in `auth_test.go`. Separately, this also demonstrates a sort-of integration test of the CLI by executing the Cobra command for `databricks auth token` from tests, which may be useful for testing other end-to-end functionality in the CLI. In particular, I believe this is necessary in order to set flag values (like the `--profile` flag in this case) for use in testing. ## Tests Unit tests cover the unhappy and happy paths using the mocked API client, token cache, and profiler. Manually tested --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-05-16 10:22:09 +00:00
shreyas-goenka	e652333103	Fix variable overrides in targets for non-string variables (#1397 ) Before variable overrides that were not string in a target would not work. This PR fixes that. Tested manually and via a unit test.	2024-04-25 11:21:10 +00:00
shreyas-goenka	1d9bf4b2c4	Add legacy option for `run_as` (#1384 ) ## Changes This PR partially reverts the changes in https://github.com/databricks/cli/pull/1233 and puts the old code under an "experimental.use_legacy_run_as" configuration. This gives customers who ran into the breaking change made in the PR a way out. ## Tests Both manually and via unit tests. Manually verified that run_as works for pipelines now. And if a user wants to use the feature they need to be both a Metastore and a workspace admin. --------- Error when the deploying user is a workspace admin but not a metastore admin: ``` Error: terraform apply: exit status 1 Error: cannot update permissions: User is not a metastore admin for Metastore 'deco-uc-prod-aws-us-east-1'. with databricks_permissions.pipeline_foo, on bundle.tf.json line 23, in resource.databricks_permissions.pipeline_foo: 23: } ``` -------- Output of bundle validate: ``` ➜ bundle-playground git:(master) ✗ cli bundle validate Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC. at experimental.use_legacy_run_as in databricks.yml:13:22 Name: bundle-playground Target: default Workspace: Host: https://dbc-a39a1eb1-ef95.cloud.databricks.com User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default Found 1 warning ```	2024-04-22 11:51:41 +00:00
Andrew Nester	1872aa12b3	Added support for job environments (#1379 ) ## Changes The main changes are: 1. Don't link artifacts to libraries anymore and instead just iterate over all jobs and tasks when uploading artifacts and update local path to remote 2. Iterating over `jobs.environments` to check if there are any local libraries and checking that they exist locally 3. Added tests to check environments are handled correctly End-to-end test will follow up ## Tests Added regression test, existing tests (including integration one) pass	2024-04-22 11:44:34 +00:00
shreyas-goenka	6ca57a7e68	Add docs URL for `run_as` in error message (#1381 )	2024-04-19 14:09:33 +00:00
Andrew Nester	27f51c760f	Added validate mutator to surface additional bundle warnings (#1352 ) ## Changes All these validators will return warnings as part of `bundle validate` run Added 2 mutators: 1. To check that if tasks use job_cluster_key it is actually defined 2. To check if there are any files to sync as part of deployment Also added `bundle.Parallel` to run them in parallel To make sure mutators under bundle.Parallel do not mutate config, introduced new `ReadOnlyMutator`, `ReadOnlyBundle` and `ReadOnlyConfig`. Example ``` databricks bundle validate -p deco-staging Warning: unknown field: new_cluster at resources.jobs.my_job in bundle.yml:24:7 Warning: job_cluster_key high_cpu_workload_job_cluster is not defined at resources.jobs.my_job.tasks[0].job_cluster_key in bundle.yml:35:28 Warning: There are no files to sync, please check your your .gitignore and sync.exclude configuration at sync.exclude in bundle.yml:18:5 Name: test Target: default Workspace: Host: https://acme.databricks.com User: andrew.nester@databricks.com Path: /Users/andrew.nester@databricks.com/.bundle/test/default Found 3 warnings ``` ## Tests Added unit tests	2024-04-18 15:13:16 +00:00
Andrew Nester	d914a1b1e2	Do not emit warning on YAML anchor blocks (#1354 ) ## Changes In 0.217.0 we started to emit warning on unknown fields in YAML configuration but wrongly considered YAML anchor blocks as unknown field. This PR fixes this by skipping normalising of YAML blocks. ## Tests Added regression tests	2024-04-10 09:55:02 +00:00
Pieter Noordhuis	a95b1c7dcf	Retain location information of variable reference (#1333 ) ## Changes Variable substitution works as if the variable reference is literally replaced with its contents. The following fields should be interpreted in the same way regardless of where the variable is defined: ```yaml foo: ${var.some_path} bar: "./${var.some_path}" ``` Before this change, `foo` would inherit the location information of the variable definition. After this change, it uses the location information of the variable reference, making the behavior for `foo` and `bar` identical. Fixes #1330. ## Tests The new test passes only with the fix.	2024-04-03 10:40:29 +00:00
shreyas-goenka	5df4c7e134	Add allow list for resources when bundle `run_as` is set (#1233 ) ## Changes This PR introduces an allow list for resource types that are allowed when the run_as for the bundle is not the same as the current deployment user. This PR also adds a test to ensure that any new resources added to DABs will have to add the resource to either the allow list or add an error to fail when run_as identity is not the same as deployment user. ## Tests Unit tests	2024-03-27 16:13:53 +00:00
Pieter Noordhuis	ca534d596b	Load bundle configuration from mutator (#1318 ) ## Changes Prior to this change, the bundle configuration entry point was loaded from the function `bundle.Load`. Other configuration files were only loaded once the caller applied the first set of mutators. This separation was unnecessary and not ideal in light of gathering diagnostics while loading _any_ configuration file, not just the ones from the includes. This change: * Updates `bundle.Load` to only verify that the specified path is a valid bundle root. * Moves mutators that perform loading to `bundle/config/loader`. * Adds a "load" phase that takes the place of applying `DefaultMutators`. Follow ups: * Rename `bundle.Load` -> `bundle.Find` (because it no longer performs loading) This change depends on #1316 and #1317. ## Tests Tests pass.	2024-03-27 10:49:05 +00:00
Pieter Noordhuis	00d76d5afa	Move path field to bundle type (#1316 ) ## Changes The bundle path was previously stored on the `config.Root` type under the assumption that the first configuration file being loaded would set it. This is slightly counterintuitive and we know what the path is upon construction of the bundle. The new location for this property reflects this. ## Tests Unit tests pass.	2024-03-27 09:03:24 +00:00
Pieter Noordhuis	ed194668db	Return `diag.Diagnostics` from mutators (#1305 ) ## Changes This diagnostics type allows us to capture multiple warnings as well as errors in the return value. This is a preparation for returning additional warnings from mutators in case we detect non-fatal problems. * All return statements that previously returned an error now return `diag.FromErr` * All return statements that previously returned `fmt.Errorf` now return `diag.Errorf` * All `err != nil` checks now use `diags.HasError()` or `diags.Error()` ## Tests * Existing tests pass. * I confirmed no call site under `./bundle` or `./cmd/bundle` uses `errors.Is` on the return value from mutators. This is relevant because we cannot wrap errors with `%w` when calling `diag.Errorf` (like `fmt.Errorf`; context in https://github.com/golang/go/issues/47641).	2024-03-25 14:18:47 +00:00
Pieter Noordhuis	f202596a6f	Move bundle tests into bundle/tests (#1299 ) ## Changes These tests were located in `bundle/tests/bundle` which meant they were unable to reuse the helper functions defined in the `bundle/tests` package. There is no need for these tests to live outside the package. ## Tests Existing tests pass.	2024-03-21 10:37:05 +00:00
Pieter Noordhuis	7c4b34945c	Rewrite relative paths using `dyn.Location` of the underlying value (#1273 ) ## Changes This change addresses the path resolution behavior in resource definitions. Previously, all paths were resolved relative to where the resource was first defined, which could lead to confusion and errors when paths were specified in different directories. The new behavior is to resolve paths relative to where they are defined, making it more intuitive. However, to avoid breaking existing configurations, compatibility with the old behavior is maintained. ## Tests * Existing unit tests for path translation pass. * Additional test to cover both the nominal and the fallback behavior.	2024-03-18 16:23:39 +00:00
Pieter Noordhuis	87dd46a3f8	Use dynamic configuration model in bundles (#1098 ) ## Changes This is a fundamental change to how we load and process bundle configuration. We now depend on the configuration being represented as a `dyn.Value`. This representation is functionally equivalent to Go's `any` (it is variadic) and allows us to capture metadata associated with a value, such as where it was defined (e.g. file, line, and column). It also allows us to represent Go's zero values properly (e.g. empty string, integer equal to 0, or boolean false). Using this representation allows us to let the configuration model deviate from the typed structure we have been relying on so far (`config.Root`). We need to deviate from these types when using variables for fields that are not a string themselves. For example, using `${var.num_workers}` for an integer `workers` field was impossible until now (though not implemented in this change). The loader for a `dyn.Value` includes functionality to capture any and all type mismatches between the user-defined configuration and the expected types. These mismatches can be surfaced as validation errors in future PRs. Given that many mutators expect the typed struct to be the source of truth, this change converts between the dynamic representation and the typed representation on mutator entry and exit. Existing mutators can continue to modify the typed representation and these modifications are reflected in the dynamic representation (see `MarkMutatorEntry` and `MarkMutatorExit` in `bundle/config/root.go`). Required changes included in this change: * The existing interpolation package is removed in favor of `libs/dyn/dynvar`. * Functionality to merge job clusters, job tasks, and pipeline clusters are now all broken out into their own mutators. To be implemented later: * Allow variable references for non-string types. * Surface diagnostics about the configuration provided by the user in the validation output. 
* Some mutators use a resource's configuration file path to resolve related relative paths. These depend on `bundle/config/paths.Path` being set and populated through `ConfigureConfigFilePath`. Instead, they should interact with the dynamically typed configuration directly. Doing this also unlocks being able to differentiate different base paths used within a job (e.g. a task override with a relative path defined in a directory other than the base job). ## Tests * Existing unit tests pass (some have been modified to accommodate) * Integration tests pass	2024-02-16 19:41:58 +00:00
Pieter Noordhuis	33c446dadd	Refactor library to artifact matching to not use pointers (#1172 ) ## Changes The approach to do this was: 1. Iterate over all libraries in all job tasks 2. Find references to local libraries 3. Store pointer to `compute.Library` in the matching artifact file to signal it should be uploaded This breaks down when introducing #1098 because we can no longer track unexported state across mutators. The approach in this PR performs the path matching twice; once in the matching mutator where we check if each referenced file has an artifacts section, and once during artifact upload to rewrite the library path from a local file reference to an absolute Databricks path. ## Tests Integration tests pass.	2024-02-05 15:29:45 +00:00
Andrew Nester	4b01fff03d	Fixed instance pool resolving by name (#1102 ) ## Changes Fixed instance pool resolving by name ## Tests Added regression test	2024-01-05 10:50:53 +00:00
Andrew Nester	5fb40f9d07	Allow referencing bundle resources by name (#872 ) ## Changes Now we can define variables with values which reference different Databricks resources by name. When referenced like this, DABs automatically looks up the resource by this name and replaces the reference with the ID of the resource referenced. Thus, when the variable is used in the configuration, it will contain the correct resolved ID of the resource. The resolvers are code generated and thus DABs supports referencing all resources which have `GetByName`-like methods in the Go SDK. ### Example ``` variables: my_cluster_id: description: An existing cluster. lookup: cluster: "12.2 shared" resources: jobs: my_job: name: "My Job" tasks: - task_key: TestTask existing_cluster_id: ${var.my_cluster_id} targets: dev: variables: my_cluster_id: lookup: cluster: "dev-cluster" ``` ## Tests Added unit test + manual testing --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-01-04 21:04:42 +00:00
Pieter Noordhuis	cee70a53c8	Test existing behavior when loading non-string spark conf values (#1071 ) ## Changes This test is expected to fail when we enable the custom YAML loader.	2023-12-18 11:22:22 +00:00
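
Commit 4d8eba04cd replaces direct `==` comparisons of `dyn.Value` with comparisons of `.Kind()`, because Go does not allow `==` on structs that contain slice fields. A minimal sketch of that idea follows. The `Kind`, `Location`, and `Value` types and the `IsValid` helper here are hypothetical stand-ins written for illustration only; they are not the real definitions from the CLI's `libs/dyn` package.

```go
package main

import "fmt"

// Kind, Location, and Value are hypothetical stand-ins for illustration.
// They are NOT the real types from the CLI's libs/dyn package.
type Kind int

const (
	KindInvalid Kind = iota // zero value, so an empty Value is invalid
	KindNil
	KindString
)

// Location records where a value was defined.
type Location struct {
	File   string
	Line   int
	Column int
}

// Value carries a kind, a payload, and every location associated with it.
type Value struct {
	kind      Kind
	payload   any
	locations []Location // a slice field makes Value non-comparable with ==
}

// Kind returns the kind of the value.
func (v Value) Kind() Kind { return v.kind }

// IsValid (a hypothetical helper) reports whether v is anything other than
// the invalid value. Writing `v != Value{}` here would not compile, because
// a struct containing a slice cannot be compared with == or !=.
func IsValid(v Value) bool {
	return v.Kind() != KindInvalid
}

func main() {
	var zero Value
	s := Value{
		kind:      KindString,
		payload:   "x",
		locations: []Location{{File: "databricks.yml", Line: 1, Column: 1}},
	}
	fmt.Println(IsValid(zero), IsValid(s)) // prints "false true"
}
```

Comparing on the kind sidesteps the question of what equality should mean for the attached locations, which is presumably why the commit's patch rewrites `x == dyn.InvalidValue` as `x.Kind() == dyn.KindInvalid`.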


93 Commits