databricks-cli

Commit Graph

Author	SHA1	Message	Date
Andrew Nester	663aa9ab8c	Override variables with lookup value even if values has default value set (#1504 ) ## Changes This PR fixes the behaviour when variables were not overridden with lookup value from targets if these variables had any default value set in the default target. Fixes #1449 ## Tests Added regression test	2024-06-19 08:03:06 +00:00
shreyas-goenka	553fdd1e81	Serialize dynamic value for `bundle validate` output (#1499 ) ## Changes Using dynamic values allows us to retain references like `${resources.jobs...}` even when the type of field is not integer, eg: `run_job_task`, or in general values that do not map to the Go types for a field. ## Tests Integration test	2024-06-18 15:04:20 +00:00
shreyas-goenka	274688d8a2	Clean up unused code (#1502 ) ## Changes 1. Removes `DefaultMutatorsForTarget` which is no longer used anywhere 2. Makes SnapshotPath a private field. It's no longer needed by data structures outside its package. FYI, I also tried finding other instances of dead code but I could not find anything else that was safe to remove. I used https://go.dev/blog/deadcode to search for them, and the other instances either implemented an interface, increased test coverage for some of our other code paths or there was some other reason I could not remove them (like autogenerated functions or used in tests). Good sign our codebase is mostly clean (at least superficially).	2024-06-18 14:14:27 +00:00
Pieter Noordhuis	c9b4f11947	Update error checks that use the `os` package to use `errors.Is` (#1461 ) ## Changes From the [documentation](https://pkg.go.dev/os#IsNotExist) on the functions in the `os` package: > This function predates errors.Is. It only supports errors returned by the os package. > New code should use errors.Is(err, fs.ErrNotExist). This issue surfaced while working on using a different `vfs.Path` implementation that uses errors from the `fs` package. Calls to `os.IsNotExist` didn't return true for errors that wrap `fs.ErrNotExist`. ## Tests n/a	2024-06-03 12:39:36 +00:00
Aravind Segu	a33d0c8bf9	Add support for Lakehouse monitoring in bundles (#1307 ) ## Changes This change adds support for Lakehouse monitoring in bundles. The associated resource type name is "quality monitor". ## Testing Unit tests. --------- Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com> Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>	2024-05-31 09:42:25 +00:00
Pieter Noordhuis	424499ec1d	Abstract over filesystem interaction with libs/vfs (#1452 ) ## Changes Introduce `libs/vfs` for an implementation of `fs.FS` and friends that _includes_ the absolute path it is anchored to. This is needed for: 1. Intercepting file operations to inject custom logic (e.g., logging, access control). 2. Traversing directories to find specific leaf directories (e.g., `.git`). 3. Converting virtual paths to OS-native paths. Options 2 and 3 are not possible with the standard `fs.FS` interface. They are needed such that we can provide an instance to the sync package and still detect the containing `.git` directory and convert paths to native paths. This change focuses on making the following packages use `vfs.Path`: * libs/fileset * libs/git * libs/sync All entries returned by `fileset.All` are now slash-separated. This has 2 consequences: * The sync snapshot now always uses slash-separated paths * We don't need to call `filepath.FromSlash` as much as we did ## Tests * All unit tests pass * All integration tests pass * Manually confirmed that a deployment made on Windows by a previous version of the CLI can be deployed by a new version of the CLI while retaining the validity of the local sync snapshot as well as the remote deployment state.	2024-05-30 07:41:50 +00:00
Andrew Nester	a014d50a6a	Fixed panic when loading incorrectly defined jobs (#1402 ) ## Changes If only key was defined for a job in YAML config, validate previously failed with segfault. This PR validates that jobs are correctly defined and returns an error if not. ## Tests Added regression test	2024-05-17 10:10:17 +00:00
Pieter Noordhuis	dd94107853	Remove dependency on `ConfigFilePath` from path translation mutator (#1437 ) ## Changes This is one step toward removing the `path.Paths` struct embedding from resource types. Going forward, we'll exclusively use the `dyn.Value` tree for location information. ## Tests Existing unit tests that cover path resolution with fallback behavior pass.	2024-05-17 09:26:09 +00:00
shreyas-goenka	63617253bd	Assert customer marshalling is implemented for resources (#1425 ) ## Changes This PR ensures every resource implements a custom marshaller / unmarshaller. This is required because we directly embed Go SDK structs. which implement custom marshalling overrides. Since the struct is embedded, the [customer marshalling overrides](https://pkg.go.dev/encoding/json#example-package-CustomMarshalJSON) are promoted to the top level. If the embedded struct itself is nil, then JSON marshal / unmarshal will panic because it tries to call `MarshalJSON` / `UnmarshalJSON` on a nil object. Fixing this issue at the Go SDK level does not seem possible. Discussed with @hectorcast-db.	2024-05-14 10:30:48 +00:00
shreyas-goenka	d949f2b4f2	Fix bundle schema for variables (#1396 ) ## Changes This PR fixes the variable schema to: 1. Allow non-string values in the "default" value of a variable. 2. Allow non-string overrides in a target for a variable. ## Tests Manually. There are no longer squiggly lines. Before: <img width="329" alt="Screenshot 2024-04-24 at 3 26 43 PM" src="https://github.com/databricks/cli/assets/88374338/43be02c2-80a4-4f80-bd79-0f3e1e93ee17"> After: <img width="361" alt="Screenshot 2024-04-24 at 3 26 10 PM" src="https://github.com/databricks/cli/assets/88374338/2c1fb892-a2a2-478b-8d2e-9bda6d844b54">	2024-04-25 11:23:50 +00:00
shreyas-goenka	e652333103	Fix variable overrides in targets for non-string variables (#1397 ) Before variable overrides that were not string in a target would not work. This PR fixes that. Tested manually and via a unit test.	2024-04-25 11:21:10 +00:00
shreyas-goenka	1d9bf4b2c4	Add legacy option for `run_as` (#1384 ) ## Changes This PR partially reverts the changes in https://github.com/databricks/cli/pull/1233 and puts the old code under an "experimental.use_legacy_run_as" configuration. This gives customers who ran into the breaking change made in the PR a way out. ## Tests Both manually and via unit tests. Manually verified that run_as works for pipelines now. And if a user wants to use the feature they need to be both a Metastore and a workspace admin. --------- Error when the deploying user is a workspace admin but not a metastore admin: ``` Error: terraform apply: exit status 1 Error: cannot update permissions: User is not a metastore admin for Metastore 'deco-uc-prod-aws-us-east-1'. with databricks_permissions.pipeline_foo, on bundle.tf.json line 23, in resource.databricks_permissions.pipeline_foo: 23: } ``` -------- Output of bundle validate: ``` ➜ bundle-playground git:(master) ✗ cli bundle validate Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC. at experimental.use_legacy_run_as in databricks.yml:13:22 Name: bundle-playground Target: default Workspace: Host: https://dbc-a39a1eb1-ef95.cloud.databricks.com User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default Found 1 warning ```	2024-04-22 11:51:41 +00:00
Andrew Nester	1872aa12b3	Added support for job environments (#1379 ) ## Changes The main changes are: 1. Don't link artifacts to libraries anymore and instead just iterate over all jobs and tasks when uploading artifacts and update local path to remote 2. Iterating over `jobs.environments` to check if there are any local libraries and checking that they exist locally 3. Added tests to check environments are handled correctly End-to-end test will follow up ## Tests Added regression test, existing tests (including integration one) pass	2024-04-22 11:44:34 +00:00
Lennart Kats (databricks)	000a7fef8c	Enable job queueing by default (#1385 ) ## Changes This enable queueing for jobs by default, following the behavior from API 2.2+. Queing is a best practice and will be the default in API 2.2. Since we're still using API 2.1 which has queueing disabled by default, this PR enables queuing using a mutator. Customers can manually turn off queueing for any job by adding the following to their job spec: ``` queue: enabled: false ``` ## Tests Unit tests, manual confirmation of property after deployment. --------- Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>	2024-04-22 10:36:39 +00:00
shreyas-goenka	6ca57a7e68	Add docs URL for `run_as` in error message (#1381 )	2024-04-19 14:09:33 +00:00
Andrew Nester	27f51c760f	Added validate mutator to surface additional bundle warnings (#1352 ) ## Changes All these validators will return warnings as part of `bundle validate` run Added 2 mutators: 1. To check that if tasks use job_cluster_key it is actually defined 2. To check if there are any files to sync as part of deployment Also added `bundle.Parallel` to run them in parallel To make sure mutators under bundle.Parallel do not mutate config, introduced new `ReadOnlyMutator`, `ReadOnlyBundle` and `ReadOnlyConfig`. Example ``` databricks bundle validate -p deco-staging Warning: unknown field: new_cluster at resources.jobs.my_job in bundle.yml:24:7 Warning: job_cluster_key high_cpu_workload_job_cluster is not defined at resources.jobs.my_job.tasks[0].job_cluster_key in bundle.yml:35:28 Warning: There are no files to sync, please check your your .gitignore and sync.exclude configuration at sync.exclude in bundle.yml:18:5 Name: test Target: default Workspace: Host: https://acme.databricks.com User: andrew.nester@databricks.com Path: /Users/andrew.nester@databricks.com/.bundle/test/default Found 3 warnings ``` ## Tests Added unit tests	2024-04-18 15:13:16 +00:00
Andrew Nester	542156c30b	Resolve variable references inside variable lookup fields (#1368 ) ## Changes Allows for the syntax below ``` variables: service_principal_app_id: description: 'The app id of the service principal for running workflows as.' lookup: service_principal: "sp-${bundle.environment}" ``` Fixes #1259 ## Tests Added regression test	2024-04-18 09:56:16 +00:00
Lennart Kats (databricks)	c3a7d17d1d	Disable locking for development mode (#1302 ) ## Changes This changes `databricks bundle deploy` so that it skips the lock acquisition/release step for a `mode: development` target: * This saves about 2 seconds (measured over 100 runs on a quiet/busy workspace). * This helps avoid the `deploy lock acquired by lennart@company.com at 2024-02-28 15:48:38.40603 +0100 CET. Use --force-lock to override` error * Risk: this may cause deployment conflicts, but since dev mode deployments are always scoped to a user, that risk should be minimal Update after discussion: * This behavior can now be disabled via a setting. * Docs PR: https://github.com/databricks/docs/pull/15873 ## Measurements ### 100 deployments of the "python_default" project to an empty workspace _Before this branch:_ p50 time: 11.479 seconds p90 time: 11.757 seconds _After this branch:_ p50 time: 9.386 seconds p90 time: 9.599 seconds ### 100 deployments of the "python_default" project to a busy (staging) workspace _Before this branch:_ * p50 time: 13.335 seconds * p90 time: 15.295 seconds _After this branch:_ * p50 time: 11.397 seconds * p90 time: 11.743 seconds ### Typical duration of deployment steps * Acquiring Deployment Lock: 1.096 seconds * Deployment Preparations and Operations: 1.477 seconds * Uploading Artifacts: 1.26 seconds * Finalizing Deployment: 9.699 seconds * Releasing Deployment Lock: 1.198 seconds --------- Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com> Co-authored-by: Andrew Nester <andrew.nester.dev@gmail.com>	2024-04-18 01:59:39 +00:00
dependabot[bot]	c949655f9f	Bump github.com/databricks/databricks-sdk-go from 0.37.0 to 0.38.0 (#1361 ) [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/databricks/databricks-sdk-go&package-manager=go_modules&previous-version=0.37.0&new-version=0.38.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Nester <andrew.nester@databricks.com>	2024-04-16 12:03:21 +00:00
Gleb Kanterov	e42156411b	Fix compute override for foreach tasks (#1357 ) ## Changes Fix compute override for foreach tasks. ``` $ databricks bundle deploy --compute-id=xxx ``` ## Tests I added unit tests	2024-04-12 09:53:29 +00:00
Andrew Nester	50d3bb4d56	Execute preinit after entry point to make sure scripts are loaded (#1351 ) ## Changes Execute preinit after entry point to make sure scripts are loaded	2024-04-08 14:32:21 +00:00
Andrew Nester	2f4c0c1b56	Fixed pre-init script order (#1348 ) ## Changes `preinit` script needs to be executed before processing configuration files to allow the script to modify the configuration or add own configuration files.	2024-04-08 13:28:38 +00:00
dependabot[bot]	f28a9d7107	Bump github.com/databricks/databricks-sdk-go from 0.36.0 to 0.37.0 (#1326 ) [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/databricks/databricks-sdk-go&package-manager=go_modules&previous-version=0.36.0&new-version=0.37.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Nester <andrew.nester@databricks.com>	2024-04-03 10:39:53 +00:00
Andrew Nester	8c144a2de4	Added `auth describe` command (#1244 ) ## Changes This command provide details on auth configuration user is using as well as authenticated user and auth mechanism used. Relies on https://github.com/databricks/databricks-sdk-go/pull/838 (tests will fail until merged) Examples of output ``` Workspace: https://test.com User: andrew.nester@databricks.com Authenticated with: pat ----- Configuration: ✓ auth_type: pat ✓ host: https://test.com (from bundle) ✓ profile: DEFAULT (from --profile flag) ✓ token: ****** (from /Users/andrew.nester/.databrickscfg config file) ``` ``` DATABRICKS_AUTH_TYPE=azure-msi databricks auth describe -p "Azure 2" Unable to authenticate: inner token: Post "https://foobar.com/oauth2/token": AADSTS900023: Specified tenant identifier foobar_aaaaaaa' is neither a valid DNS name, nor a valid external domain. See https://login.microsoftonline.com/error?code=900023 ----- Configuration: ✓ auth_type: azure-msi (from DATABRICKS_AUTH_TYPE environment variable) ✓ azure_client_id: 8470f3ba-aaaa-bbbb-cccc-xxxxyyyyzzzz (from /Users/andrew.nester/.databrickscfg config file) ~ azure_client_secret: ****** (from /Users/andrew.nester/.databrickscfg config file, not used for auth type azure-msi) ~ azure_tenant_id: foobar_aaaaaaa (from /Users/andrew.nester/.databrickscfg config file, not used for auth type azure-msi) ✓ azure_use_msi: true (from /Users/andrew.nester/.databrickscfg config file) ✓ host: https://foobar.com (from /Users/andrew.nester/.databrickscfg config file) ✓ profile: Azure 2 (from --profile flag) ``` For account ``` Unable to authenticate: default auth: databricks-cli: cannot get access token: Error: token refresh: Post "https://xxxxxxx.com/v1/token": http 400: {"error":"invalid_request","error_description":"Refresh token is invalid"} . Config: host=https://xxxxxxx.com, account_id=ed0ca3c5-fae5-4619-bb38-eebe04a4af4b, profile=ACCOUNT-ed0ca3c5-fae5-4619-bb38-eebe04a4af4b ----- Configuration: ✓ account_id: ed0ca3c5-fae5-4619-bb38-eebe04a4af4b (from /Users/andrew.nester/.databrickscfg config file) ✓ auth_type: databricks-cli (from /Users/andrew.nester/.databrickscfg config file) ✓ host: https://xxxxxxxxx.com (from /Users/andrew.nester/.databrickscfg config file) ✓ profile: ACCOUNT-ed0ca3c5-fae5-4619-bb38-eebe04a4af4b ``` ## Tests Added unit tests --------- Co-authored-by: Julia Crawford (Databricks) <julia.crawford@databricks.com>	2024-04-03 08:14:04 +00:00
Andrew Nester	56e393c743	Allow specifying CLI version constraints required to run the bundle (#1320 ) ## Changes Allow specifying CLI version constraints required to run the bundle Example of configuration: #### only allow specific version ``` bundle: name: my-bundle databricks_cli_version: "0.210.0" ``` #### allow all patch releases ``` bundle: name: my-bundle databricks_cli_version: "0.210.*" ``` #### constrain minimum version ``` bundle: name: my-bundle databricks_cli_version: ">= 0.210.0" ``` #### constrain range ``` bundle: name: my-bundle databricks_cli_version: ">= 0.210.0, <= 1.0.0" ``` For other examples see: https://github.com/Masterminds/semver?tab=readme-ov-file#checking-version-constraints Example error ``` sh-3.2$ databricks bundle validate Error: Databricks CLI version constraint not satisfied. Required: >= 1.0.0, current: 0.216.0 ``` ## Tests Added unit test cover all possible configuration permutations --------- Co-authored-by: Lennart Kats (databricks) <lennart.kats@databricks.com>	2024-04-02 12:55:21 +00:00
Pieter Noordhuis	eea34b2504	Return diagnostics from `config.Load` (#1324 ) ## Changes We no longer need to store load diagnostics on the `config.Root` type itself and instead can return them from the `config.Load` call directly. It is up to the caller of this function to append them to previous diagnostics, if any. Background: previous commits moved configuration loading of the entry point into a mutator, so now all diagnostics naturally flow from applying mutators. This PR depends on #1319. ## Tests Unit and manual validation of the debug statements in the validate command.	2024-03-28 10:59:03 +00:00
shreyas-goenka	5df4c7e134	Add allow list for resources when bundle `run_as` is set (#1233 ) ## Changes This PR introduces an allow list for resource types that are allowed when the run_as for the bundle is not the same as the current deployment user. This PR also adds a test to ensure that any new resources added to DABs will have to add the resource to either the allow list or add an error to fail when run_as identity is not the same as deployment user. ## Tests Unit tests	2024-03-27 16:13:53 +00:00
shreyas-goenka	704d069459	Make `bundle.deployment` optional in the bundle schema (#1321 ) ## Changes Makes the field optional by adding the `omitempty` tag. This gets rid of the red squiggly lines in the bundle schema.	2024-03-27 13:37:59 +00:00
Pieter Noordhuis	ca534d596b	Load bundle configuration from mutator (#1318 ) ## Changes Prior to this change, the bundle configuration entry point was loaded from the function `bundle.Load`. Other configuration files were only loaded once the caller applied the first set of mutators. This separation was unnecessary and not ideal in light of gathering diagnostics while loading _any_ configuration file, not just the ones from the includes. This change: * Updates `bundle.Load` to only verify that the specified path is a valid bundle root. * Moves mutators that perform loading to `bundle/config/loader`. * Adds a "load" phase that takes the place of applying `DefaultMutators`. Follow ups: * Rename `bundle.Load` -> `bundle.Find` (because it no longer performs loading) This change depends on #1316 and #1317. ## Tests Tests pass.	2024-03-27 10:49:05 +00:00
Pieter Noordhuis	f195b84475	Remove support for DATABRICKS_BUNDLE_INCLUDES (#1317 ) ## Changes PR #604 added functionality to load a bundle without a `databricks.yml` if both the `DATABRICKS_BUNDLE_ROOT` and `DATABRICKS_BUNDLE_INCLUDES` environment variables were set. We never ended up using this in downstream tools so this can be removed. ## Tests Unit tests pass.	2024-03-27 10:13:54 +00:00
Pieter Noordhuis	00d76d5afa	Move path field to bundle type (#1316 ) ## Changes The bundle path was previously stored on the `config.Root` type under the assumption that the first configuration file being loaded would set it. This is slightly counterintuitive and we know what the path is upon construction of the bundle. The new location for this property reflects this. ## Tests Unit tests pass.	2024-03-27 09:03:24 +00:00
Pieter Noordhuis	ed194668db	Return `diag.Diagnostics` from mutators (#1305 ) ## Changes This diagnostics type allows us to capture multiple warnings as well as errors in the return value. This is a preparation for returning additional warnings from mutators in case we detect non-fatal problems. * All return statements that previously returned an error now return `diag.FromErr` * All return statements that previously returned `fmt.Errorf` now return `diag.Errorf` * All `err != nil` checks now use `diags.HasError()` or `diags.Error()` ## Tests * Existing tests pass. * I confirmed no call site under `./bundle` or `./cmd/bundle` uses `errors.Is` on the return value from mutators. This is relevant because we cannot wrap errors with `%w` when calling `diag.Errorf` (like `fmt.Errorf`; context in https://github.com/golang/go/issues/47641).	2024-03-25 14:18:47 +00:00
Pieter Noordhuis	7c4b34945c	Rewrite relative paths using `dyn.Location` of the underlying value (#1273 ) ## Changes This change addresses the path resolution behavior in resource definitions. Previously, all paths were resolved relative to where the resource was first defined, which could lead to confusion and errors when paths were specified in different directories. The new behavior is to resolve paths relative to where they are defined, making it more intuitive. However, to avoid breaking existing configurations, compatibility with the old behavior is maintained. ## Tests * Existing unit tests for path translation pass. * Additional test to cover both the nominal and the fallback behavior.	2024-03-18 16:23:39 +00:00
Andrew Nester	1b0ac61093	Added deployment state for bundles (#1267 ) ## Changes This PR introduces new structure (and a file) being used locally and synced remotely to Databricks workspace to track bundle deployment related metadata. The state is pulled from remote, updated and pushed back remotely as part of `bundle deploy` command. This state can be used for deployment sequencing as it's `Version` field is monotonically increasing on each deployment. Currently, it only tracks files being synced as part of the deployment. This helps fix the issue with files not being removed during deployments on CI/CD as sync snapshot was never present there. Fixes #943 ## Tests Added E2E (regression) test for files removal on CI/CD --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-03-18 14:41:58 +00:00
Pieter Noordhuis	4a9a12af19	Retain location annotation when expanding globs for pipeline libraries (#1274 ) ## Changes We now keep location metadata associated with every configuration value. When expanding globs for pipeline libraries, this annotation was erased because of the conversion to/from the typed structure. This change modifies the expansion mutator to work with `dyn.Value` and retain the location of the value that holds the glob pattern. ## Tests Unit tests pass.	2024-03-11 21:59:36 +00:00
Pieter Noordhuis	c05c0cd941	Include `dyn.Path` as argument to the visit callback function (#1260 ) ## Changes This change means the callback supplied to `dyn.Foreach` can introspect the path of the value it is being called for. It also prepares for allowing visiting path patterns where the exact path is not known upfront. ## Tests Unit tests.	2024-03-07 13:56:50 +00:00
Andrew Nester	09d1846e13	Return `application_id` for service principal lookups (#1245 ) ## Changes Return ApplicationId for service principals lookups Fixes #1234 ## Tests Added (regression) tests	2024-03-04 16:12:10 +00:00
Andrew Nester	1588a14d07	Add correct tag value for models in dev mode (#1230 ) ## Changes Fixes #922 ## Tests Added regression test case	2024-02-22 14:52:49 +00:00
Pieter Noordhuis	a2a4948047	Allow use of variables references in primitive non-string fields (#1219 ) ## Changes This change enables the use of bundle variables for boolean, integer, and floating point fields. ## Tests * Unit tests. * I ran a manual test to confirm parameterizing the number of workers in a cluster definition works.	2024-02-19 10:44:51 +00:00
Pieter Noordhuis	87dd46a3f8	Use dynamic configuration model in bundles (#1098 ) ## Changes This is a fundamental change to how we load and process bundle configuration. We now depend on the configuration being represented as a `dyn.Value`. This representation is functionally equivalent to Go's `any` (it is variadic) and allows us to capture metadata associated with a value, such as where it was defined (e.g. file, line, and column). It also allows us to represent Go's zero values properly (e.g. empty string, integer equal to 0, or boolean false). Using this representation allows us to let the configuration model deviate from the typed structure we have been relying on so far (`config.Root`). We need to deviate from these types when using variables for fields that are not a string themselves. For example, using `${var.num_workers}` for an integer `workers` field was impossible until now (though not implemented in this change). The loader for a `dyn.Value` includes functionality to capture any and all type mismatches between the user-defined configuration and the expected types. These mismatches can be surfaced as validation errors in future PRs. Given that many mutators expect the typed struct to be the source of truth, this change converts between the dynamic representation and the typed representation on mutator entry and exit. Existing mutators can continue to modify the typed representation and these modifications are reflected in the dynamic representation (see `MarkMutatorEntry` and `MarkMutatorExit` in `bundle/config/root.go`). Required changes included in this change: * The existing interpolation package is removed in favor of `libs/dyn/dynvar`. * Functionality to merge job clusters, job tasks, and pipeline clusters are now all broken out into their own mutators. To be implemented later: * Allow variable references for non-string types. * Surface diagnostics about the configuration provided by the user in the validation output. * Some mutators use a resource's configuration file path to resolve related relative paths. These depend on `bundle/config/paths.Path` being set and populated through `ConfigureConfigFilePath`. Instead, they should interact with the dynamically typed configuration directly. Doing this also unlocks being able to differentiate different base paths used within a job (e.g. a task override with a relative path defined in a directory other than the base job). ## Tests * Existing unit tests pass (some have been modified to accommodate) * Integration tests pass	2024-02-16 19:41:58 +00:00
Andrew Nester	80670eceed	Added `bundle deployment bind` and `unbind` command (#1131 ) ## Changes Added `bundle deployment bind` and `unbind` command. This command allows to bind bundle-defined resources to existing resources in Databricks workspace so they become DABs-managed. ## Tests Manually + added E2E test	2024-02-14 18:04:45 +00:00
Andrew Nester	6edab93233	Added warning when trying to deploy bundle with `--fail-if-running` and running resources (#1163 ) ## Changes Deploying bundle when there are bundle resources running at the same time can be disruptive for jobs and pipelines in progress. With this change during deployment phase (before uploading any resources) if there is `--fail-if-running` specified DABs will check if there are any resources running and if so, will fail the deployment ## Tests Manual + add tests	2024-02-07 11:17:17 +00:00
Pieter Noordhuis	33c446dadd	Refactor library to artifact matching to not use pointers (#1172 ) ## Changes The approach to do this was: 1. Iterate over all libraries in all job tasks 2. Find references to local libraries 3. Store pointer to `compute.Library` in the matching artifact file to signal it should be uploaded This breaks down when introducing #1098 because we can no longer track unexported state across mutators. The approach in this PR performs the path matching twice; once in the matching mutator where we check if each referenced file has an artifacts section, and once during artifact upload to rewrite the library path from a local file reference to an absolute Databricks path. ## Tests Integration tests pass.	2024-02-05 15:29:45 +00:00
shreyas-goenka	cb3ad737f1	Add short_name helper function to bundle init templates (#1167 ) ## Changes Adds the short_name helper function. short_name is useful when templates do not want to print the full userName (typically email or service principal application-id) of the current user. ## Tests Integration test. Also adds integration tests for other helper functions that interact with the Databricks API.	2024-02-01 16:46:07 +00:00
Andrew Nester	0b3eeb8e54	Allow specifying executable in artifact section and skip bash from WSL (#1169 ) ## Changes Allow specifying executable in artifact section ``` artifacts: test: type: whl executable: bash ... ``` We also skip bash found on Windows if it's from WSL because it won't be correctly executed, see the issue above Fixes #1159	2024-02-01 14:10:04 +00:00
Andrew Nester	f269f8015d	Added `bundle generate pipeline` command (#1139 ) ## Changes Added `bundle generate pipeline` command Usage as the following ``` databricks bundle generate pipeline --existing-pipeline-id f3b8c580-0a88-4b55-xxxx-yyyyyyyyyy ``` ## Tests Manually + added E2E test	2024-01-25 11:35:14 +00:00
Ilia Babanov	9c3e4fda7c	Add "bundle summary" command (#1123 ) The plan is to use the new command in the Databricks VSCode extension to render "modified" UI state in the bundle resource tree elements, plus use resource IDs to generate links for the resources ### New revision - Renamed `remote-state` to `summary` - Added "modified statuses" to all resources. Currently we don't set "updated" status - it's either nothing, or created/deleted - Added tests for the `TerraformToBundle` command	2024-01-25 11:32:47 +00:00
Andrew Nester	1b6241746e	Use MockWorkspaceClient from SDK instead of WithImpl mocking (#1134 ) ## Changes Use MockWorkspaceClient from SDK instead of WithImpl mocking	2024-01-19 14:12:58 +00:00
Andrew Nester	70fe0e36ef	Added `databricks bundle generate job` command (#1043 ) ## Changes Now it's possible to generate bundle configuration for existing job. For now it only supports jobs with notebook tasks. It will download notebooks referenced in the job tasks and generate bundle YAML config for this job which can be included in larger bundle. ## Tests Running command manually Example of generated config ``` resources: jobs: job_128737545467921: name: Notebook job format: MULTI_TASK tasks: - task_key: as_notebook existing_cluster_id: 0704-xxxxxx-yyyyyyy notebook_task: base_parameters: bundle_root: /Users/andrew.nester@databricks.com/.bundle/job_with_module_imports/development/files notebook_path: ./entry_notebook.py source: WORKSPACE run_if: ALL_SUCCESS max_concurrent_runs: 1 ``` ## Tests Manual (on our last 100 jobs) + added end-to-end test ``` --- PASS: TestAccGenerateFromExistingJobAndDeploy (50.91s) PASS coverage: 61.5% of statements in ./... ok github.com/databricks/cli/internal/bundle 51.209s coverage: 61.5% of statements in ./... ```	2024-01-17 14:26:33 +00:00
Andrew Nester	4b01fff03d	Fixed instance pool resolving by name (#1102 ) ## Changes Fixed instance pool resolving by name ## Tests Added regression test	2024-01-05 10:50:53 +00:00
Andrew Nester	5fb40f9d07	Allow referencing bundle resources by name (#872 ) ## Changes Now we can define variables with values which reference different Databricks resources by name. When references like this, DABs automatically looks up the resource by this name and replaces the reference with ID of the resource referenced. Thus when the variable is used in the configuration it will contain the correct resolved ID of resource. The resolvers are code generated and thus DABs support referencing all resources which has `GetByName`-like methods in Go SDK. ### Example ``` variables: my_cluster_id: description: An existing cluster. lookup: cluster: "12.2 shared" resources: jobs: my_job: name: "My Job" tasks: - task_key: TestTask existing_cluster_id: ${var.my_cluster_id} targets: dev: variables: my_cluster_id: lookup: cluster: "dev-cluster" ``` ## Tests Added unit test + manual testing --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-01-04 21:04:42 +00:00
Lennart Kats (databricks)	167deec8c3	Change recommended production deployment path from /Shared to /Users (#1091 ) ## Changes This PR changes the default and `mode: production` recommendation to target `/Users` for deployment. Previously, we used `/Shared`, but because of a lack of POSIX-like permissions in WorkspaceFS this meant that files inside would be readable and writable by other users in the workspace. Detailed change: * `default-python` no longer uses a path that starts with `/Shared` * `mode: production` no longer requires a path that starts with `/Shared` ## Related PRs Docs: https://github.com/databricks/docs/pull/14585 Examples: https://github.com/databricks/bundle-examples/pull/17 ## Tests * Manual tests * Template unit tests (with an extra check to avoid /Shared)	2024-01-02 19:58:24 +00:00
Andrew Nester	ac37a592f1	Added exec.NewCommandExecutor to execute commands with correct interpreter (#1075 ) ## Changes Instead of handling command chaining ourselves, we execute passed commands as-is by storing them, in temp file and passing to correct interpreter (bash or cmd) based on OS. Fixes #1065 ## Tests Added unit tests	2023-12-21 15:45:23 +00:00
shreyas-goenka	6002f49c87	Move bundle schema update to an internal module (#1012 ) ## Changes This PR: 1. Move code to load bundle JSON Schema descriptions from the OpenAPI spec to an internal Go module 2. Remove command line flags from the `bundle schema` command. These flags were meant for internal processes and at no point were meant for customer use. 3. Regenerate `bundle_descriptions.json` 4. Add support for `bundle: "deprecated"`. The `environments` field is tagged as deprecated in this PR and consequently will no longer be a part of the bundle schema. ## Tests Tested by regenerating the CLI against its current OpenAPI spec (as defined in `__openapi_sha`). The `bundle_descriptions.json` in this PR was generated from the code generator. Manually checked that the autocompletion / descriptions from the new bundle schema are correct.	2023-12-06 10:45:18 +00:00
shreyas-goenka	677926b78b	Fix panic when bundle auth resolution fails (#1002 ) ## Changes CLI would panic if an invalid bundle auth is setup when running CLI commands. This PR removes the panic and shows the error message directly instead. ## Tests The CWD is a bundle with: ``` workspace: profile: DEFAULT ``` Before: ``` shreyas.goenka@THW32HFW6T bundle-playground % cli clusters list panic: resolve: /Users/shreyas.goenka/.databrickscfg has no DEFAULT profile configured. Config: profile=DEFAULT goroutine 1 [running]: ``` After: ``` shreyas.goenka@THW32HFW6T bundle-playground % cli clusters list Error: cannot resolve bundle auth configuration: resolve: /Users/shreyas.goenka/.databrickscfg has no DEFAULT profile configured. Config: profile=DEFAULT ``` ``` shreyas.goenka@THW32HFW6T bundle-playground % DATABRICKS_CONFIG_FILE=/dev/null cli bundle deploy Error: cannot resolve bundle auth configuration: resolve: /dev/null has no DEFAULT profile configured. Config: profile=DEFAULT, config_file=/dev/null. Env: DATABRICKS_CONFIG_FILE ```	2023-11-30 14:28:01 +00:00
Andrew Nester	4d8d825746	Fixed panic when job has trigger and in development mode (#1026 ) ## Changes Fixed panic when job has trigger and in development mode	2023-11-29 16:32:42 +00:00
Andrew Nester	833746cbdd	Do not replace pipeline libraries if there are no matches for pattern (#1021 ) ## Changes If there are no matches when doing Glob call for pipeline library defined, leave the entry as is. The next mutators in the chain will detect that file is missing and the error will be more user friendly. Before the change ``` Starting resource deployment Error: terraform apply: exit status 1 Error: cannot create pipeline: libraries must contain at least one element ``` After ``` Error: notebook ./non-existent not found ``` ## Tests Added regression unit tests	2023-11-29 13:20:13 +00:00
Andrew Nester	fa89db57e9	Enable `spark_jar_task` with local JAR libraries (#993 ) ## Changes Previously local JAR paths were transformed to remote path during initialisation and thus artifact building logic did not recognise such libraries as local to be handled and uploaded. Now it's possible to use spark_jar_tasks with local JAR libraries on 14.1+ DBR clusters Example configuration ``` bundle: name: spark-jar workspace: host: *** artifacts: my_java_code: path: ./sample-java build: "javac PrintArgs.java && jar cvfm PrintArgs.jar META-INF/MANIFEST.MF PrintArgs.class" files: - source: "/Users/andrew.nester/dabs/wheel/sample-java/PrintArgs.jar" resources: jobs: print_args: name: "Print Args" tasks: - task_key: Print new_cluster: num_workers: 0 spark_version: 14.2.x-scala2.12 node_type_id: i3.xlarge spark_conf: "spark.databricks.cluster.profile": "singleNode" "spark.master": "local[*]" custom_tags: ResourceClass: "SingleNode" spark_jar_task: main_class_name: PrintArgs libraries: - jar: ./sample-java/PrintArgs.jar ``` ## Tests Manually running `bundle deploy and bundle run`	2023-11-21 10:15:09 +00:00
Pieter Noordhuis	489d6fa1b8	Replace direct calls with `bundle.Apply` (#990 ) ## Changes Some test call sites called directly into the mutator's `Apply` function instead of `bundle.Apply`. Calling into `bundle.Apply` is preferred because that's where we can run pre/post logic common across all mutators. ## Tests Pass.	2023-11-15 14:19:18 +00:00
Pieter Noordhuis	d80c35f66a	Rename variable `bundle -> b` (#989 ) ## Changes All calls to apply a mutator must go through `bundle.Apply`. This conflicts with the existing use of the variable `bundle`. This change un-aliases the variable from the package name by renaming all variables to `b`. ## Tests Pass.	2023-11-15 14:03:36 +00:00
shreyas-goenka	0c837e5772	Make `file_path` and `artifact_path` fields consistent with json tag (#987 ) ## Changes This PR: 1. Renames `FilesPath` -> `FilePath` and `ArtifactsPath` -> `ArtifactPath` in the bundle and metadata configuration to make them consistant with the json tags. 2. Fixes development / production mode error messages to point to `file_path` and `artifact_path` ## Tests Existing unit tests. This is a strightforward renaming of the fields.	2023-11-15 13:37:26 +00:00
Lennart Kats (databricks)	0ab125c109	Allow jobs to be manually unpaused in development mode (#885 ) Partly mitigates #859. It's still not clear to me if there is an actual use case or if users are trying to use "development" mode jobs for production, but making this overridable is reasonable. Beyond this fix I think we could do something in the Jobs schedule UI, but it would help to better understand the use case (or actual reason of confusion). I expect we should hint customers to move away from dev mode rather than unpause.	2023-11-13 19:50:39 +00:00
Andrew Nester	f3db42e622	Added support for top-level permissions (#928 ) ## Changes Now it's possible to define top level `permissions` section in bundle configuration and permissions defined there will be applied to all resources defined in the bundle. Supported top-level permission levels: CAN_MANAGE, CAN_VIEW, CAN_RUN. Permissions are applied to: Jobs, DLT Pipelines, ML Models, ML Experiments and Model Service Endpoints ``` bundle: name: permissions workspace: host: *** permissions: - level: CAN_VIEW group_name: test-group - level: CAN_MANAGE user_name: user@company.com - level: CAN_RUN service_principal_name: 123456-abcdef ``` ## Tests Added corresponding unit tests + ran `bundle validate` and `bundle deploy` manually	2023-11-13 11:29:40 +00:00
Pieter Noordhuis	7847388f95	Initialize variable definitions that are defined without properties (#966 ) ## Changes We can debate whether or not variable definitions without properties are valid, but in no case should this panic the CLI. Fixes #934. ## Tests Unit.	2023-11-08 11:01:14 +00:00
Michał Szafrański	10291b0e13	Bundle path rewrites for dbt and SQL file tasks (#962 ) ## Changes Support path rewrites for Dbt and SQL file job taks. <!-- Summary of your changes that are easy to understand --> ## Tests * Added unit test <!-- How is this tested? -->	2023-11-07 20:00:09 +00:00
shreyas-goenka	5a8cd0c5bc	Persist deployment metadata in WSFS (#845 ) ## Changes This PR introduces a metadata struct that stores a subset of bundle configuration that we wish to expose to other Databricks services that wish to integrate with bundles. This metadata file is uploaded to a file `${bundle.workspace.state_path}/metadata.json` in the WSFS destination of the bundle deployment. Documentation for emitted metadata fields: * `version`: Version for the metadata file schema * `config.bundle.git.branch`: Name of the git branch the bundle was deployed from. * `config.bundle.git.origin_url`: URL for git remote "origin" * `config.bundle.git.bundle_root_path`: Relative path of the bundle root from the root of the git repository. Is set to "." if they are the same. * `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this bundle was deployed from. Note, the deployment might not exactly match this commit version if there are changes that have not been committed to git at deploy time, * `file_path`: Path in workspace where we sync bundle files to. * `resources.jobs.[job-ref].id`: Id of the job * `resources.jobs.[job-ref].relative_path`: Relative path of the yaml config file from the bundle root where this job was defined. Example metadata object when bundle root and git root are the same: ```json { "version": 1, "config": { "bundle": { "lock": {}, "git": { "branch": "master", "origin_url": "www.host.com", "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0", "bundle_root_path": "." } }, "workspace": { "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files" }, "resources": { "jobs": { "bar": { "id": "245921165354846", "relative_path": "databricks.yml" } } }, "sync": {} } } ``` Example metadata when the git root is one level above the bundle repo: ```json { "version": 1, "config": { "bundle": { "lock": {}, "git": { "branch": "dev-branch", "origin_url": "www.my-repo.com", "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b", "bundle_root_path": "pipeline-progress" } }, "workspace": { "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files" }, "resources": { "jobs": { "bar": { "id": "245921165354846", "relative_path": "databricks.yml" } } }, "sync": {} } } ``` This unblocks integration to the jobs break glass UI for bundles. ## Tests Unit tests and integration tests.	2023-10-27 12:55:43 +00:00
Andrew Nester	6f22ae8696	Use UserName instead of Id to check if identity used is a service principal (#924 ) ## Changes Use UserName instead of Id to check if identity used is a service principal	2023-10-26 14:58:16 +00:00
Pieter Noordhuis	6e21ced54a	Consolidate bundle configuration loader function (#918 ) ## Changes There were two functions related to loading a bundle configuration file; one as a package function and one as a member function on the configuration type. Loading the same configuration object twice doesn't make sense and therefore we can consolidate to only using the package function. The package function would scan for known file names if the specified path was a directory. This functionality was not in use because the top-level bundle loader figures out the filename itself as of #580. ## Tests Pass.	2023-10-25 12:55:56 +00:00
Pieter Noordhuis	486bf59627	Move bundle configuration filename code (#917 ) ## Changes This is unrelated to the config root so belongs in a separate file (this was added in #580). ## Tests n/a	2023-10-25 09:54:39 +00:00
Pieter Noordhuis	d4be40520c	Resolve configuration before performing verification (#890 ) ## Changes If a bundle configuration specifies a workspace host, and the user specifies a profile to use, we perform a check to confirm that the workspace host in the bundle configuration and the workspace host from the profile are identical. If they are not, we return an error. The check was introduced in #571. Previously, the code included an assumption that the client configuration was already loaded from the environment prior to performing the check. This was not the case, and as such if the user intended to use a non-default path to `.databrickscfg`, this path was not used when performing the check. The fix does the following: * Resolve the configuration prior to performing the check. * Don't treat the configuration file not existing as an error. * Add unit tests. Fixes #884. ## Tests Unit tests and manual confirmation.	2023-10-20 13:10:31 +00:00
Arpit Jasapara	24cc67563e	Support Unity Catalog Registered Models in bundles (#846 ) ## Changes <!-- Summary of your changes that are easy to understand --> Add UC Registered Models support to Databricks Asset Bundles as new resource `registered_model`. Also added UC Permission support via new resource `grant`. ## Tests <!-- How is this tested? --> Tested via unit tests and manual testing with [example PR](https://github.com/databricks/bundle-examples-internal/pull/80) and [custom Terraform provider](https://github.com/databricks/terraform-provider-databricks/pull/2771). <img width="698" alt="Screenshot 2023-10-08 at 4 57 23 PM" src="https://github.com/databricks/cli/assets/87999496/bcf605a9-7894-443b-865a-f7e240037815"> <img width="1109" alt="Screenshot 2023-10-08 at 4 56 47 PM" src="https://github.com/databricks/cli/assets/87999496/e4d6e424-cd70-4809-8843-6939ed2e172f"> <img width="1091" alt="Screenshot 2023-10-08 at 4 56 57 PM" src="https://github.com/databricks/cli/assets/87999496/88ebaabb-67db-4a11-88a5-df087e2e41c0"> --------- Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com> Co-authored-by: Andrew Nester <andrew.nester.dev@gmail.com> Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2023-10-16 15:32:49 +00:00
Andrew Nester	30c4d2e8a7	Fixed merging task libraries from targets (#868 ) ## Changes Previous we (erroneously) kept the reference and merged into the original tasks and not the copies which we later used to replace existing tasks. Thus the merging of slices and references was incorrect. Fixes #864 ## Tests Added a regression test	2023-10-16 08:48:32 +00:00
hectorcast-db	36f30c8b47	Update Go SDK to 0.23.0 and use custom marshaller (#772 ) ## Changes Update Go SDK to 0.23.0 and use custom marshaller. ## Tests * Run unit tests * Run nightly * Manual test: ``` ./cli jobs create --json @myjob.json ``` with ``` { "name": "my-job-marshal-test-go", "tasks": [{ "task_key": "testgomarshaltask", "new_cluster": { "num_workers": 0, "spark_version": "10.4.x-scala2.12", "node_type_id": "Standard_DS3_v2" }, "libraries": [ { "jar": "dbfs:/max/jars/exampleJarTask.jar" } ], "spark_jar_task": { "main_class_name": "com.databricks.quickstart.exampleTask" } }] } ``` Main branch: ``` Error: Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size ``` This branch: ``` { "job_id":<jobid> } ``` --------- Co-authored-by: Miles Yucht <miles@databricks.com>	2023-10-16 06:56:06 +00:00
Andrew Nester	943ea89728	Allow target overrides for sync section (#856 ) ## Changes Allow target overrides for sync section ## Tests Added tests	2023-10-10 15:18:18 +00:00
Andrew Nester	8d8de3f509	Fixed using repo files as pipeline libraries (#847 ) ## Changes Fixed using repo files as pipeline libraries ## Tests Added regression test	2023-10-09 10:10:28 +00:00
Andrew Nester	aa54a8665a	Added support for glob patterns in pipeline libraries section (#833 ) ## Changes Now it's possible to specify glob pattern in pipeline libraries section and DAB will add all matched files as libraries ``` pipelines: dummy: name: " DLT with Python files" target: "dlt_python_files" libraries: - file: path: ./*.py ``` ## Tests Added unit test	2023-10-04 13:23:13 +00:00
Andrew Nester	9b6a847178	Mark artifacts properties as optional (#834 ) ## Changes Mark artifacts properties as optional Fixes #816	2023-10-03 13:59:28 +00:00
Pieter Noordhuis	f1b068cefe	Use normalized short name for tag value in development mode (#821 ) ## Changes The jobs backend propagates job tags to the underlying cloud provider's resources. As such, they need to match the constraints a cloud provider places on tag values. The display name can contain anything. With this change, we modify the tag value to equal the short name as used in the name prefix. Additionally, we leverage tag normalization as introduced in #819 to make sure characters that aren't accepted are removed before using the value as a tag value. This is a new stab at #810 and should completely eliminate this class of problems. ## Tests Tests pass.	2023-10-02 06:58:51 +00:00
Pieter Noordhuis	30b4b8ce58	Allow digits in the generated short name (#820 ) ## Changes Digits were previously replaced by `_`. ## Tests Additional test cases with uncommon variations of email addresses.	2023-09-29 06:58:40 +00:00
Serge Smertin	7171874db0	Added `process.Background()` and `process.Forwarded()` (#804 ) ## Changes This PR adds higher-level wrappers for calling subprocesses. One of the steps to get https://github.com/databricks/cli/pull/637 in, as previously discussed. The reason to add `process.Forwarded()` is to proxy Python's `input()` calls from a child process seamlessly. Another use-case is plugging in `less` as a pager for the list results. ## Tests `make test`	2023-09-27 09:04:44 +00:00
Andrew Nester	0daa0022af	Make a notebook wrapper for Python wheel tasks optional (#797 ) ## Changes Instead of always using notebook wrapper for Python wheel tasks, let's make this an opt-in option. Now by default Python wheel tasks will be deployed as is to Databricks platform. If notebook wrapper required (DBR < 13.1 or other configuration differences), users can provide a following experimental setting ``` experimental: python_wheel_wrapper: true ``` Fixes #783, https://github.com/databricks/databricks-asset-bundles-dais2023/issues/8 ## Tests Added unit tests. Integration tests passed for both cases ``` helpers.go:163: [databricks stdout]: Hello from my func helpers.go:163: [databricks stdout]: Got arguments: helpers.go:163: [databricks stdout]: ['my_test_code', 'one', 'two'] ... Bundle remote directory is */.bundle/ac05d5e8-ed4b-4e34-b3f2-afa73f62b021 Deleted snapshot file at /var/folders/nt/xjv68qzs45319w4k36dhpylc0000gp/T/TestAccPythonWheelTaskDeployAndRunWithWrapper3733431114/001/.databricks/bundle/default/sync-snapshots/cac1e02f3941a97b.json Successfully deleted files! --- PASS: TestAccPythonWheelTaskDeployAndRunWithWrapper (214.18s) PASS coverage: 93.5% of statements in ./... ok github.com/databricks/cli/internal/bundle 214.495s coverage: 93.5% of statements in ./... ``` ``` helpers.go:163: [databricks stdout]: Hello from my func helpers.go:163: [databricks stdout]: Got arguments: helpers.go:163: [databricks stdout]: ['my_test_code', 'one', 'two'] ... Bundle remote directory is */.bundle/0ef67aaf-5960-4049-bf1d-dc9e29157421 Deleted snapshot file at /var/folders/nt/xjv68qzs45319w4k36dhpylc0000gp/T/TestAccPythonWheelTaskDeployAndRunWithoutWrapper2340216760/001/.databricks/bundle/default/sync-snapshots/edf0b322cee93b13.json Successfully deleted files! --- PASS: TestAccPythonWheelTaskDeployAndRunWithoutWrapper (192.36s) PASS coverage: 93.5% of statements in ./... ok github.com/databricks/cli/internal/bundle 195.130s coverage: 93.5% of statements in ./... ```	2023-09-26 14:32:20 +00:00
Pieter Noordhuis	ee30277119	Enable target overrides for pipeline clusters (#792 ) ## Changes This is a follow-up to #658 and #779 for jobs. This change applies label normalization the same way the backend does. ## Tests Unit and config loading tests.	2023-09-21 19:21:20 +00:00
Andrew Nester	43e2eefc27	Enable environment overrides for job tasks (#779 ) ## Changes Follow up for https://github.com/databricks/cli/pull/658 When a job definition has multiple job tasks using the same key, it's considered invalid. Instead we should combine those definitions with the same key into one. This is consistent with environment overrides. This way, the override ends up in the original job tasks, and we've got a clear way to put them all together. ## Tests Added unit tests	2023-09-18 14:13:50 +00:00
Andrew Nester	953dcb4972	Added support for experimental scripts section (#632 ) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: * experimental: scripts: prebuild: \| echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: \| echo 'Checking go version...' go version postdeploy: \| echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```	2023-09-14 10:14:13 +00:00
shreyas-goenka	373f441eb2	Use clearer error message when no interpolation value is found. (#764 ) ## Changes This PR makes the error message clearer for when interpolation fails. ## Tests Existing unit test and manually	2023-09-11 15:23:25 +00:00
Pieter Noordhuis	4ccc70aeac	Consolidate environment variable interaction (#747 ) ## Changes There are a couple places throughout the code base where interaction with environment variables takes place. Moreover, more than one of these would try to read a value from more than one environment variable as fallback (for backwards compatibility). This change consolidates those accesses. The majority of diffs in this change are mechanical (i.e. add an argument or replace a call). This change: * Moves common environment variable lookups for bundles to `bundles/env`. * Adds a `libs/env` package that wraps `os.LookupEnv` and `os.Getenv` and allows for overrides to take place in a `context.Context`. By scoping overrides to a `context.Context` we can avoid `t.Setenv` in testing and unlock parallel test execution for integration tests. * Updates call sites to pass through a `context.Context` where needed. * For bundles, introduces `DATABRICKS_BUNDLE_ROOT` as new primary variable instead of `BUNDLE_ROOT`. This was the last environment variable that did not use the `DATABRICKS_` prefix. ## Tests Unit tests pass.	2023-09-11 08:18:43 +00:00
shreyas-goenka	9a51f72f0b	Make bundle and sync fields optional (#757 ) ## Changes This PR: 1. Makes the bundle and sync properties optional in the generated schema. 2. Fixes schema generation that was broken due to a rogue "description" field in the bundle docs. ## Tests Tested manually. The generated schema no longer has "bundle" and "sync" marked as required.	2023-09-11 08:16:22 +00:00
Andrew Nester	b5d033d154	List available targets when incorrect target passed (#756 ) ## Changes List available targets when incorrect target passed ## Tests ``` andrew.nester@HFW9Y94129 wheel % databricks bundle validate -t incorrect Error: incorrect: no such target. Available targets: prod, development ```	2023-09-08 15:37:55 +00:00
Andrew Nester	18a5b05d82	Apply Python wheel trampoline if workspace library is used (#755 ) ## Changes Workspace library will be detected by trampoline in 2 cases: - User defined to use local wheel file - User defined to use remote wheel file from Workspace file system In both of these cases we should correctly apply Python trampoline ## Tests Added a regression test (also covered by Python e2e test)	2023-09-08 13:45:21 +00:00
Andrew Nester	e64463ba47	Fixed marking libraries from DBFS as remote (#750 ) ## Changes Fixed marking libraries from DBFS as remote ## Tests Updated unit tests to catch the regression	2023-09-08 09:53:57 +00:00
Arpit Jasapara	50eaf16307	Support Model Serving Endpoints in bundles (#682 ) ## Changes <!-- Summary of your changes that are easy to understand --> Add Model Serving Endpoints to Databricks Bundles ## Tests <!-- How is this tested? --> Unit tests and manual testing via https://github.com/databricks/bundle-examples-internal/pull/76 <img width="1570" alt="Screenshot 2023-08-28 at 7 46 23 PM" src="https://github.com/databricks/cli/assets/87999496/7030ebd8-b0e2-4ad1-a9e3-5ff8454f1175"> <img width="747" alt="Screenshot 2023-08-28 at 7 47 01 PM" src="https://github.com/databricks/cli/assets/87999496/fb9b54d7-54e2-43ce-9148-68fb620c809a"> Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>	2023-09-07 21:54:31 +00:00
Lennart Kats (databricks)	f9e521b43e	databricks bundle init template v2: optional stubs, DLT support (#700 ) ## Changes This follows up on https://github.com/databricks/cli/pull/686. This PR makes our stubs optional + it adds DLT stubs: ``` $ databricks bundle init Template to use [default-python]: default-python Unique name for this project [my_project]: my_project Include a stub (sample) notebook in 'my_project/src' [yes]: yes Include a stub (sample) DLT pipeline in 'my_project/src' [yes]: yes Include a stub (sample) Python package 'my_project/src' [yes]: yes ✨ Successfully initialized template ``` ## Tests Manual testing, matrix tests. --------- Co-authored-by: Andrew Nester <andrew.nester@databricks.com> Co-authored-by: PaulCornellDB <paul.cornell@databricks.com> Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2023-09-06 09:52:31 +00:00
Lennart Kats (databricks)	947d5b1e5c	Fix IsServicePrincipal() only working for workspace admins (#732 ) ## Changes The latest rendition of isServicePrincipal no longer worked for non-admin users as it used the "principals get" API. This new version relies on the property that service principals always have a UUID as their userName. This was tested with the eng-jaws principal (8b948b2e-d2b5-4b9e-8274-11b596f3b652).	2023-09-05 11:20:55 +00:00
Andrew Nester	83443bae8d	Make resource and artifact paths in bundle config relative to config folder (#708 ) # Warning: breaking change ## Changes Instead of having paths in bundle config files be relative to bundle root even if the config file is nested, this PR makes such paths relative to the folder where the config is located. When bundle is initialised, these paths will be transformed to relative paths based on bundle root. For example, we have file structure like this ``` - mybundle \| - bundle.yml \| - subfolder \| -- resource.yml \| -- my.whl ``` Previously, we had to reference `my.whl` in resource.yml like this, which was confusing because resource.yml is in the same subfolder ``` sync: include: - ./subfolder/.whl ... tasks: - task_key: name libraries: - whl: ./subfolder/my.whl ... ``` After the change we can reference it like this (which is in line with the current behaviour for notebooks) ``` sync: include: - ./.whl ... tasks: - task_key: name libraries: - whl: ./my.whl ... ``` ## Tests Existing `translate_path_tests` successfully passed after refactoring. Added a couple of uses cases for `Libraries` paths. Added a bundle config tests with include config and sync section --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2023-09-04 09:55:01 +00:00
Lennart Kats (databricks)	e22fd73b7d	Cleanup after previous PR comments (#724 ) ## Changes @pietern this addresses a comment from you on a recently merged PR. It also updates settings.json based on the settings VS Code adds as soon as you edit a notebook.	2023-09-04 07:07:17 +00:00
Lennart Kats (databricks)	707fd6f617	Cleanup after "Add a foundation for built-in templates" (#707 ) ## Changes Add some cleanup based on @pietern's comments on https://github.com/databricks/cli/pull/685	2023-08-30 14:01:08 +00:00
Andrew Nester	12368e3382	Added transformation mutator for Python wheel task for them to work on DBR <13.1 (#635 ) ## Changes *Note: this PR relies on sync.include functionality from here: https://github.com/databricks/cli/pull/671* Added transformation mutator for Python wheel task for them to work on DBR <13.1 Using wheels upload to Workspace file system as cluster libraries is not supported in DBR < 13.1 In order to make Python wheel work correctly on DBR < 13.1 we do the following: 1. Build and upload python wheel as usual 2. Transform python wheel task into special notebook task which does the following a. Installs all necessary wheels with %pip magic b. Executes defined entry point with all provided parameters 3. Upload this notebook file to workspace file system 4. Deploy transformed job task This is also beneficial for executing on existing clusters because this notebook always reinstall wheels so if there are any changes to the wheel package, they are correctly picked up ## Tests bundle.yml ```yaml bundle: name: wheel-task workspace: host: ** resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "" python_wheel_task: package_name: "my_test_code" entry_point: "run" parameters: ["first argument","first value","second argument","second value"] libraries: - whl: ./dist/.whl ``` Output ``` andrew.nester@HFW9Y94129 wheel % databricks bundle run test_job Run URL: *** 2023-08-03 15:58:04 "[default] My Wheel Job" TERMINATED SUCCESS Output: ======= Task TestTask: Hello from my func Got arguments v1: ['python', 'first argument', 'first value', 'second argument', 'second value'] ```	2023-08-30 12:21:39 +00:00
Lennart Kats (databricks)	861f33d376	Support cluster overrides with cluster_key and compute_key (#696 ) ## Changes Support `databricks bundle deploy --compute-id my_all_purpose_cluster` in two missing cases.	2023-08-28 07:51:35 +00:00
Lennart Kats (databricks)	a5b86093ec	Add a foundation for built-in templates (#685 ) ## Changes This pull request extends the templating support in preparation of a new, default template (WIP, https://github.com/databricks/cli/pull/686): * builtin templates that can be initialized using e.g. `databricks bundle init default-python` * builtin templates are embedded into the executable using go's `embed` functionality, making sure they're co-versioned with the CLI * new helpers to get the workspace name, current user name, etc. help craft a complete template * (not enabled yet) when the user types `databricks bundle init` they can interactively select the `default-python` template And makes two tangentially related changes: * IsServicePrincipal now uses the "users" API rather than the "principals" API, since the latter is too slow for our purposes. * mode: prod no longer requires the 'target.prod.git' setting. It's hard to set that from a template. (Pieter is planning an overhaul of warnings support; this would be one of the first warnings we show.) The actual `default-python` template is maintained in a separate PR: https://github.com/databricks/cli/pull/686 ## Tests Unit tests, manual testing	2023-08-25 09:03:42 +00:00
Andrew Nester	4ee926b885	Added run_as section for bundle configuration (#692 ) ## Changes Added run_as section for bundle configuration. This section allows to define an user name or service principal which will be applied as an execution identity for jobs and DLT pipelines. In the case of DLT, identity defined in `run_as` will be assigned `IS_OWNER` permission on this pipeline. ## Tests Added unit tests for configuration. Also ran deploy for the following bundle configuration ``` bundle: name: "run_as" run_as: # service_principal_name: "f7263fcc-56d0-4981-8baf-c2a45296690b" user_name: "lennart.kats@databricks.com" resources: pipelines: andrew_pipeline: name: "Andrew Nester pipeline" libraries: - notebook: path: ./test.py jobs: job_one: name: Job One tasks: - task_key: "task" new_cluster: num_workers: 1 spark_version: 13.2.x-snapshot-scala2.12 node_type_id: i3.xlarge runtime_engine: PHOTON notebook_task: notebook_path: "./test.py" ```	2023-08-23 16:47:07 +00:00

1 2 3 4 5

227 Commits