## Changes
This adds diagnostics for collaborative (production) deployment
scenarios, including:
- Bob deploys a bundle that is normally deployed by Alice, but this
fails because Bob can't write to `/Users/Alice/.bundle`.
- Charlie deploys a bundle that is normally deployed by Alice, but this
fails because he can't create a new pipeline where Alice would be the
owner.
- Alice deploys a bundle where she didn't list herself as one of the
CAN_MANAGE users in permissions. That can work, but is probably a
mistake (see the sketch below).
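A minimal sketch of that last scenario (user names are hypothetical, matching the style of the other permissions examples in this document): Alice would presumably avoid the diagnostic by listing herself explicitly alongside her collaborators:
```yaml
permissions:
  - level: CAN_MANAGE
    user_name: alice   # the deploying user lists herself
  - level: CAN_MANAGE
    user_name: bob
```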
## Tests
Unit tests, manual testing.
## Changes
Due to platform changes, all library, notebook, and other file paths used in
Databricks must start with either the /Workspace or /Volumes prefix.
This PR makes sure that all bundle paths are correctly prefixed.
Note: this is a breaking change if a user previously configured and used a
`/Workspace/Workspace` folder in their workspace file system, or has the
`/Workspace/${workspace.root_path}...` pattern configured anywhere in their
bundle config.
Fixes: #1751
Action items:
- [x] Scan DABs config and error out on
`/Workspace/${workspace.root_path}...` pattern usage
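As a hedged illustration of the pattern that is now rejected (the job, task, and field below are hypothetical; the scan applies anywhere in the config):
```yaml
resources:
  jobs:
    my_job:
      tasks:
        - task_key: main
          spark_python_task:
            # Now flagged as an error: the /Workspace prefix is duplicated,
            # since ${workspace.root_path} already resolves to a /Workspace path.
            python_file: /Workspace/${workspace.root_path}/files/src/main.py
            # Correct form after this change:
            # python_file: ${workspace.root_path}/files/src/main.py
```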
## Tests
Added unit tests
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
- Extract sync output logic from `cmd/sync` into `lib/sync`
- Add a hidden `verbose` flag to the `bundle deploy` command; it's false by
default and hidden from the `--help` output
- Pass output handler to the `deploy/files/upload` mutator if the
verbose option is true
There was an idea to use in-place output, overwriting each past file sync
event in the output, but that won't work for the extension, since it
doesn't display deploy logs in the terminal.
Example output:
```
~/tmp/defpy: ~/cli/cli bundle deploy --sync-progress
Building defpy...
Uploading defpy-0.0.1+20240917.112755-py3-none-any.whl...
Uploading bundle files to /Users/ilia.babanov@databricks.com/.bundle/defpy/dev/files...
Action: PUT: requirements-dev.txt, resources/defpy_pipeline.yml, pytest.ini, src/defpy/main.py, src/defpy/__init__.py, src/dlt_pipeline.ipynb, tests/main_test.py, src/notebook.ipynb, setup.py, resources/defpy_job.yml, .vscode/extensions.json, .vscode/settings.json, fixtures/.gitkeep, .vscode/__builtins__.pyi, README.md, .gitignore, databricks.yml
Uploaded tests
Uploaded resources
Uploaded fixtures
Uploaded .vscode
Uploaded src/defpy
Uploaded requirements-dev.txt
Uploaded .gitignore
Uploaded fixtures/.gitkeep
Uploaded src/defpy/__init__.py
Uploaded databricks.yml
Uploaded README.md
Uploaded setup.py
Uploaded .vscode/__builtins__.pyi
Uploaded .vscode/extensions.json
Uploaded src/dlt_pipeline.ipynb
Uploaded .vscode/settings.json
Uploaded resources/defpy_job.yml
Uploaded pytest.ini
Uploaded src/defpy/main.py
Uploaded tests/main_test.py
Uploaded resources/defpy_pipeline.yml
Uploaded src/notebook.ipynb
Initial Sync Complete
Deploying resources...
Updating deployment state...
Deployment complete!
```
Output example in the extension:
<img width="1843" alt="Screenshot 2024-09-19 at 11 07 48"
src="https://github.com/user-attachments/assets/0fafd095-cdc6-44b8-b482-27a38ada0330">
## Tests
Manually for the `sync` and `bundle deploy` commands + vscode extension
sync and deploy flows
## Changes
DLT pipeline recreations are destructive. They can lead to lost history of
previous updates, temporary outages of the tables, and are potentially
computationally expensive. Thus we make a breaking change where a prompt is
shown to the user if their configuration changes will lead to a DLT pipeline
recreation.
Users can skip the prompt by specifying the `--auto-approve` flag.
This PR also fixes an issue with our test runner where logs from the
cmdio.Logger would not get propagated to the reader returned by our
cobra test runner.
## Tests
Manually, and new unit and integration tests.
```
➜ bundle-playground-3 cli bundle deploy
Uploading bundle files to /Users/63ec021d-b0c6-49c0-93a0-5123953a1cb2/.bundle/test/development/files...
The following DLT pipelines will be recreated. Underlying tables will be unavailable for a transient period until the newly recreated pipelines are run once successfully. History of previous pipeline update runs will be lost because of recreation:
recreate pipeline foo
Would you like to proceed? [y/n]: n
Deployment cancelled!
```
## Changes
This field allows a user to configure paths to synchronize to the
workspace.
Allowed values are relative paths to files and directories anchored at
the directory where the field is set. If one or more values traverse up
the directory tree (to an ancestor of the bundle root directory), the
CLI will dynamically determine the root path to use to ensure that the
file tree structure remains intact.
For example, given a `databricks.yml` in `my_bundle` that includes:
```yaml
sync:
  paths:
    - ../common
    - .
```
Then upon synchronization, the workspace will look like:
```
.
├── common
│   └── lib.py
└── my_bundle
    ├── databricks.yml
    └── notebook.py
```
If not set, behavior remains identical.
## Tests
* Newly added unit tests for the mutators and under `bundle/tests`.
* Manually confirmed a bundle without this configuration works the same.
* Manually confirmed a bundle with this configuration works.
## Changes
This adds configurable transformations based on the transformations
currently seen in `mode: development`.
Example databricks.yml showcasing some of these transformations:
```
bundle:
  name: my_bundle

targets:
  dev:
    presets:
      prefix: "myprefix_"           # prefix all resource names with myprefix_
      pipelines_development: true   # set development to true by default for pipelines
      trigger_pause_status: PAUSED  # set pause_status to PAUSED by default for all triggers and schedules
      jobs_max_concurrent_runs: 10  # set max_concurrent runs to 10 by default for all jobs
      tags:
        dev: true
```
## Tests
* Existing process_target_mode tests that were adapted to use this new
code
* Unit tests specific for the new mutator
* Unit tests for config loading and merging
* Manual e2e testing
## Changes
Previously, for every library referenced in the configuration, DABs made
sure that there was a corresponding artifact section.
But this is neither necessary nor flexible, because local libraries
might be built outside of the DABs context.
It also created hard-to-follow logic in the code, where we back-referenced
libraries to artifacts.
This PR does 3 things:
1. Allows all local libraries referenced in DABs config to be uploaded
to remote (see the sketch below)
2. Simplifies upload and glob reference expansion logic by doing it in a
single place
3. Speeds things up by uploading each library only once and doing so in
parallel
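A minimal sketch of what this enables (job, package, and wheel path are hypothetical): a task can reference a locally built wheel directly, with no matching `artifacts` section, and the CLI uploads it and rewrites the path to its remote location:
```yaml
resources:
  jobs:
    my_job:
      tasks:
        - task_key: main
          python_wheel_task:
            package_name: my_package
            entry_point: run
          libraries:
            # Built outside of the bundle; no artifacts section required.
            # Uploaded once, with the path rewritten to the workspace location.
            - whl: ./dist/my_package-*.whl
```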
## Tests
Added unit + integration tests + made sure that change is backward
compatible (no changes in existing tests)
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Since locations are already tracked in the dynamic value tree, we no
longer need to track them at the resource/artifact level. This PR:
1. Removes use of `paths.Paths`. Uses dyn.Location instead.
2. Refactors the validation of resources not being empty valued to be
generic across all resource types.
## Tests
Existing unit tests.
## Changes
This change enables overriding the default value of job parameters in
target overrides.
This is the same approach we already take for job clusters and job
tasks.
Closes #1620.
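A hedged sketch of what this looks like (job and parameter names are hypothetical), mirroring how job cluster and task overrides are merged by key; parameters are presumably matched by `name` here:
```yaml
resources:
  jobs:
    my_job:
      parameters:
        - name: environment
          default: dev

targets:
  prod:
    resources:
      jobs:
        my_job:
          parameters:
            # Overrides the default value of the parameter with the same name.
            - name: environment
              default: prod
```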
## Tests
Mutator unit tests and lightweight end-to-end tests.
## Changes
With https://github.com/databricks/cli/pull/1413 we started to compute
and partially print the plan if it contained deletion of UC schemas.
This PR uses the precomputed plan to avoid double planning when actually
doing the terraform plan.
This fixes a performance regression introduced in
https://github.com/databricks/cli/pull/1413.
## Tests
Tested manually.
1. Verified bundle deployment still works and deploys resources.
2. Verified that the precomputed plan is indeed being used by attaching
a debugger and removing the plan file right before the terraform apply
process is spawned and asserting that terraform apply fails because the
plan is not found.
## Changes
This PR adds support for UC schemas to DABs. This allows users to define
schemas for tables and other assets their pipelines/workflows create as
part of the DAB, thus managing their lifecycle with the DAB (see the sketch
after the list of limitations below).
The first version has a couple of intentional limitations:
1. The owner of the schema will be the deployment user. Changing the
owner of the schema is not allowed (yet). `run_as` will not be
restricted for DABs containing UC schemas. Let's limit the scope of
run_as to the compute identity used instead of ownership of data assets
like UC schemas.
2. API fields that are present in the update API but not the create API are
not supported. For example, enabling predictive optimization is not supported
in the create schema API and thus is not available in DABs at the moment.
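For reference, a minimal sketch of a schema definition in a bundle (schema name, catalog, and grant principal are hypothetical):
```yaml
resources:
  schemas:
    my_schema:
      catalog_name: main
      name: my_schema
      comment: Managed by DABs
      grants:
        - principal: data-engineers
          privileges:
            - USE_SCHEMA
            - CREATE_TABLE
```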
## Tests
Manually and integration test. Manually verified the following work:
1. Development mode adds a "dev_" prefix.
2. Modified status is correctly computed in the `bundle summary`
command.
3. Grants work as expected, for assigning privileges.
4. Variable interpolation works for the schema ID.
## Changes
Right now we ask users for two confirmations when destroying a bundle.
One to destroy the resources and one to delete the files. This PR
consolidates the two prompts into one.
## Tests
Manually
Destroying a bundle with no resources:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
All files and directories at the following location will be deleted: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default
Would you like to proceed? [y/n]: y
No resources to destroy
Updating deployment state...
Deleting files...
Destroy complete!
```
Destroying a bundle with no remote state:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
No active deployment found to destroy!
```
When a user cancels the destroy operation:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
The following resources will be deleted:
delete job job_1
delete job job_2
delete pipeline foo
All files and directories at the following location will be deleted: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default
Would you like to proceed? [y/n]: n
Destroy cancelled!
```
When a user destroys resources:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
The following resources will be deleted:
delete job job_1
delete job job_2
delete pipeline foo
All files and directories at the following location will be deleted: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default
Would you like to proceed? [y/n]: y
Updating deployment state...
Deleting files...
Destroy complete!
```
## Changes
Now the prepare stage, which does cleanup, is executed once before every
build, so artifacts built into the same folder are correctly kept.
Fixes workaround 2 from issue #1602
## Tests
Added unit test
## Changes
This PR:
1. Moves the if mutator to the bundle package, to live with all-time
greats such as `bundle.Seq` and `bundle.Defer`. Also adds unit tests.
2. `bundle destroy` now returns early if `root_path` does not exist. We
do this by leveraging a `bundle.If` condition.
## Tests
Unit tests and manually.
Here's an example of what it'll look like once the bundle is destroyed.
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
No active deployment found to destroy!
```
I would have added some e2e coverage for this as well, but the
`cobraTestRunner.Run()` method does not seem to return stdout/stderr
logs correctly. We can probably punt looking into it.
## Changes
The FUSE mount of the workspace file system on DBR doesn't include file
extensions for notebooks. When these notebooks are checked into a
repository, they do have an extension. PR #1457 added a filer type that
is aware of this disparity and makes these notebooks show up as if they
do have these extensions.
This change swaps out the native `vfs.Path` with one that uses this
filer when running on DBR.
Follow up: consolidate between interfaces exported by `filer.Filer` and
`vfs.Path`.
## Tests
* Unit tests pass
* (Manually ran a snapshot build on DBR against a bundle with notebooks)
---------
Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
## Changes
Added support for complex variables
Now it's possible to add and use complex variables as shown below
```
bundle:
  name: complex-variables

resources:
  jobs:
    my_job:
      job_clusters:
        - job_cluster_key: key
          new_cluster: ${var.cluster}
      tasks:
        - task_key: test
          job_cluster_key: key

variables:
  cluster:
    description: "A cluster definition"
    type: complex
    default:
      spark_version: "13.2.x-scala2.11"
      node_type_id: "Standard_DS3_v2"
      num_workers: 2
      spark_conf:
        spark.speculation: true
        spark.databricks.delta.retentionDurationCheck.enabled: false
```
Fixes #1298
- [x] Support for complex variables
- [x] Allow variable overrides (with shortcut) in targets
- [x] Don't allow providing complex variables via flags or environment variables
- [x] Fail validation if a complex value is used but `type: complex` is not
provided
- [x] Support using variables inside complex variables
## Tests
Added unit tests
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
Replace stdin/stdout with files in `PythonMutator`. Files are created in
a temporary directory.
Rename `ApplyPythonMutator` to `PythonMutator`.
Add test for `dyn.Location` behavior during the "load" stage.
## Tests
Unit tests
## Changes
Add ApplyPythonMutator, which will fork a Python subprocess and pipe the
bundle configuration through it.
It's enabled through `experimental` section, for example:
```yaml
experimental:
  pydabs:
    enable: true
    venv_path: .venv
```
For now, it's limited to two phases in the mutator pipeline:
- `load`: adds new jobs
- `init`: adds new jobs, or modifies existing ones
It's enforced that no jobs are modified in `load` and no jobs are deleted
in `load/init`, because otherwise it would break existing assumptions.
## Tests
Unit tests
## Changes
`check_running_resources` now pulls the remote state without modifying the
bundle state, similar to how it did before. This avoids a problem where we
fail to compute deployment metadata for a deleted job (which we shouldn't
do in the first place).
`deploy_then_remove_resources_test` now also deploys and deletes a job
(in addition to a pipeline), which catches the error that this PR fixes.
## Tests
Unit and integration tests
## Changes
This PR annotates any pipelines that were deployed using DABs to have
`deployment.kind` set to "BUNDLE", mirroring the annotation for Jobs
(similar PR for jobs FYI: https://github.com/databricks/cli/pull/880).
Breakglass UI is not yet available for pipelines, so this annotation
will just be used for revenue attribution ATM.
Note: The API field has been deployed in all regions including GovCloud.
## Tests
Unit tests and manually.
Manually verified that the kind and metadata_file_path are being set by
DABs, and are returned by a GET API to a pipeline deployed using a DAB.
Example:
```
"deployment": {
"kind":"BUNDLE",
"metadata_file_path":"/Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default/state/metadata.json"
},
```
## Changes
`terraform show -json` (`terraform.Show()`) fails if the state file
contains resources with fields that no longer conform to the provider
schemas.
This can happen when you deploy a bundle with one version of the CLI,
then update the CLI to a version that uses a different Databricks
Terraform provider, and try to run `bundle run` or `bundle summary`.
Those commands don't recreate the local terraform state (only `terraform
apply` or `plan` do), and terraform itself fails while parsing it.
[Terraform
docs](https://developer.hashicorp.com/terraform/language/state#format)
point out that it's best to use `terraform show` after successful
`apply` or `plan`.
Here we parse the state ourselves. The state file format is internal to
terraform, but it's more stable than our resource schemas. We only parse
a subset of fields from the state, and only update ID and ModifiedStatus
of bundle resources in the `terraform.Load` mutator.
## Changes
The main changes are:
1. Don't link artifacts to libraries anymore; instead, just iterate over all
jobs and tasks when uploading artifacts, and update local paths to remote ones
2. Iterate over `jobs.environments` to check if there are any local libraries
and verify that they exist locally (see the sketch below)
3. Add tests to check that environments are handled correctly
An end-to-end test will follow.
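A hedged sketch of the `environments` shape referred to above (key names and the wheel path are illustrative, following the serverless job environment spec):
```yaml
resources:
  jobs:
    my_job:
      environments:
        - environment_key: default
          spec:
            client: "1"
            dependencies:
              # Local wheel: checked for existence locally, then uploaded and
              # rewritten to its remote path at deploy time.
              - ./dist/my_package-*.whl
      tasks:
        - task_key: main
          environment_key: default
          python_wheel_task:
            package_name: my_package
            entry_point: run
```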
## Tests
Added regression test, existing tests (including integration one) pass
## Changes
This enables queueing for jobs by default, following the behavior of
API 2.2+. Queueing is a best practice and will be the default in API 2.2.
Since we're still using API 2.1, which has queueing disabled by default,
this PR enables queueing using a mutator.
Customers can manually turn off queueing for any job by adding the
following to their job spec:
```
queue:
  enabled: false
```
## Tests
Unit tests, manual confirmation of property after deployment.
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
## Changes
Allows for the syntax below
```
variables:
  service_principal_app_id:
    description: 'The app id of the service principal for running workflows as.'
    lookup:
      service_principal: "sp-${bundle.environment}"
```
Fixes #1259
## Tests
Added regression test
## Changes
Prior to this change, the bundle configuration entry point was loaded
from the function `bundle.Load`. Other configuration files were only
loaded once the caller applied the first set of mutators. This
separation was unnecessary and not ideal in light of gathering
diagnostics while loading _any_ configuration file, not just the ones
from the includes.
This change:
* Updates `bundle.Load` to only verify that the specified path is a
valid bundle root.
* Moves mutators that perform loading to `bundle/config/loader`.
* Adds a "load" phase that takes the place of applying
`DefaultMutators`.
Follow ups:
* Rename `bundle.Load` -> `bundle.Find` (because it no longer performs
loading)
This change depends on #1316 and #1317.
## Tests
Tests pass.
## Changes
This diagnostics type allows us to capture multiple warnings as well as
errors in the return value. This is a preparation for returning
additional warnings from mutators in case we detect non-fatal problems.
* All return statements that previously returned an error now return
`diag.FromErr`
* All return statements that previously returned `fmt.Errorf` now return
`diag.Errorf`
* All `err != nil` checks now use `diags.HasError()` or `diags.Error()`
## Tests
* Existing tests pass.
* I confirmed no call site under `./bundle` or `./cmd/bundle` uses
`errors.Is` on the return value from mutators. This is relevant because
we cannot wrap errors with `%w` when calling `diag.Errorf` (like
`fmt.Errorf`; context in https://github.com/golang/go/issues/47641).
## Changes
CheckRunningResource does `terraform.Show`, which (I believe) expects a
valid `bundle.tf.json`, which is only written as part of `terraform.Write`
later.
With this PR the order is changed.
Fixes #1286
## Tests
Added regression E2E test
## Changes
This PR introduces a new structure (and a file) that is used locally and
synced to the Databricks workspace to track bundle deployment related
metadata.
The state is pulled from remote, updated, and pushed back remotely as
part of the `bundle deploy` command.
This state can be used for deployment sequencing, as its `Version` field
is monotonically increasing on each deployment.
Currently, it only tracks files being synced as part of the deployment.
This helps fix the issue with files not being removed during deployments
on CI/CD, as the sync snapshot was never present there.
Fixes #943
## Tests
Added E2E (regression) test for files removal on CI/CD
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
The Databricks Terraform provider does not allow changing permissions of
the current user. Instead, the current identity is implicitly set to be
the owner of all resources on the platform side.
This PR introduces a mutator to filter permissions from the bundle
configuration at deploy time, allowing users to define permissions for
their own identities in their bundle config.
This would allow configurations like the following, where both alice and
bob can collaborate on the same DAB:
```
permissions:
  - level: CAN_MANAGE
    user_name: alice
  - level: CAN_MANAGE
    user_name: bob
```
This PR is a reincarnation of
https://github.com/databricks/cli/pull/1145. The earlier attempt had to
be reverted due to metadata loss converting to and from the dynamic
configuration representation (reverted here:
https://github.com/databricks/cli/pull/1179)
## Tests
Unit test and manually
## Changes
This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).
Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).
The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.
Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).
Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.
To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).
## Tests
* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
## Changes
Added `bundle deployment bind` and `unbind` commands.
These commands allow binding bundle-defined resources to existing resources
in the Databricks workspace so they become DABs-managed.
## Tests
Manually + added E2E test
## Changes
Deploying a bundle while its resources are running at the same time can be
disruptive for jobs and pipelines in progress.
With this change, during the deployment phase (before uploading any
resources), if `--fail-if-running` is specified, DABs will check whether any
resources are running and, if so, will fail the deployment.
## Tests
Manual + added tests
## Changes
This reverts commit 4131069a4b.
The integration test for metadata computation failed. The back and forth
to `dyn.Value` erases unexported fields that the code currently still
depends on. We'll have to retry on top of #1098.
## Changes
The Databricks Terraform provider does not allow changing permissions of
the current user. Instead, the current identity is implicitly set to be
the owner of all resources on the platform side.
This PR introduces a mutator to filter permissions from the bundle
configuration, allowing users to define permissions for their own
identities in their bundle config.
This would allow configurations like the following, where both alice and
bob can collaborate on the same DAB:
```
permissions:
  - level: CAN_MANAGE
    user_name: alice
  - level: CAN_MANAGE
    user_name: bob
```
## Tests
Unit test and manually
## Changes
This PR sets run_as permissions after variable interpolation.
Terraform does not allow specifying permissions for the current user.
The following configuration would fail because we would assign a
permission block for self, bypassing this check here:
4ee926b885/bundle/config/mutator/run_as.go (L47)
```
run_as:
  user_name: ${workspace.current_user.userName}
```
## Tests
Manually: setting `run_as` to `${workspace.current_user.userName}` now works.
## Changes
Now we can define variables whose values reference different Databricks
resources by name.
When referenced like this, DABs automatically looks up the resource by that
name and replaces the reference with the ID of the referenced resource.
Thus, when the variable is used in the configuration, it contains the
correctly resolved resource ID.
The resolvers are code-generated, so DABs supports referencing all
resources which have `GetByName`-like methods in the Go SDK.
### Example
```
variables:
  my_cluster_id:
    description: An existing cluster.
    lookup:
      cluster: "12.2 shared"

resources:
  jobs:
    my_job:
      name: "My Job"
      tasks:
        - task_key: TestTask
          existing_cluster_id: ${var.my_cluster_id}

targets:
  dev:
    variables:
      my_cluster_id:
        lookup:
          cluster: "dev-cluster"
```
## Tests
Added unit test + manual testing
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
Update the output of the `deploy` command to be more concise and
consistent:
```
$ databricks bundle deploy
Building my_project...
Uploading my_project-0.0.1+20231207.205106-py3-none-any.whl...
Uploading bundle files to /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files...
Deploying resources...
Updating deployment state...
Deployment complete!
```
This does away with the intermediate success messages, makes consistent
use of `...`, and only prints the success message at the very end after
everything is completed.
Below is the original output for comparison:
```
$ databricks bundle deploy
Detecting Python wheel project...
Found Python wheel project at /tmp/output/my_project
Building my_project...
Build succeeded
Uploading my_project-0.0.1+20231207.205134-py3-none-any.whl...
Upload succeeded
Starting upload of bundle files
Uploaded bundle files at /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files!
Starting resource deployment
Resource deployment completed!
```
## Changes
This PR sets the following fields for all jobs that are deployed from a
DAB:
1. `deployment`: This provides the platform with the path to a file to
read the metadata from.
2. `edit_mode`: This tells the platform to display the break-glass UI
for jobs deployed from a DAB. Setting this is required to re-lock the UI
after a user clicks "disconnect from source".
3. `format = MULTI_TASK`. This makes the Terraform provider always use
jobs API 2.1 for creating/updating the job. Required because
`deployment` and `edit_mode` are only available in API 2.1.
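Roughly, each deployed job ends up with settings like the following sketch (the `edit_mode` value and the exact metadata path are assumptions; `deployment.kind: BUNDLE` and `format: MULTI_TASK` come from the list above):
```yaml
# Fields set by the CLI on every job deployed from a DAB (sketch, not user-authored config):
deployment:
  kind: BUNDLE
  metadata_file_path: ${workspace.state_path}/metadata.json  # illustrative path
edit_mode: UI_LOCKED  # assumed value; re-locks the UI after "disconnect from source"
format: MULTI_TASK    # makes the Terraform provider use jobs API 2.1
```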
## Tests
Unit test and manually. Manually verified that deployments trigger the
break glass UI. Manually verified there is no Terraform drift when all
three fields are set.
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Now it's possible to define a top-level `permissions` section in the bundle
configuration, and permissions defined there will be applied to all
resources defined in the bundle.
Supported top-level permission levels: CAN_MANAGE, CAN_VIEW, CAN_RUN.
Permissions are applied to: jobs, DLT pipelines, ML models, ML experiments,
and model serving endpoints.
```
bundle:
  name: permissions

workspace:
  host: ***

permissions:
  - level: CAN_VIEW
    group_name: test-group
  - level: CAN_MANAGE
    user_name: user@company.com
  - level: CAN_RUN
    service_principal_name: 123456-abcdef
```
## Tests
Added corresponding unit tests + ran `bundle validate` and `bundle
deploy` manually
## Changes
This PR introduces a metadata struct that stores a subset of the bundle
configuration that we wish to expose to other Databricks services that
want to integrate with bundles.
This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.
Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note: the deployment might not exactly match this
commit if there are changes that have not been committed to git at deploy
time.
* `file_path`: Path in the workspace where we sync bundle files to.
* `resources.jobs.[job-ref].id`: ID of the job.
* `resources.jobs.[job-ref].relative_path`: Relative path of the YAML config
file, from the bundle root, where this job was defined.
Example metadata object when bundle root and git root are the same:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```
Example metadata when the git root is one level above the bundle root:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```
This unblocks integration to the jobs break glass UI for bundles.
## Tests
Unit tests and integration tests.
## Changes
Upload terraform state even if apply fails
Fixes #893
## Tests
Manually ran `databricks bundle deploy` with incorrect permissions in the
bundle config and observed that the state still gets uploaded correctly.
## Changes
Now it's possible to specify a glob pattern in the pipeline libraries
section, and DABs will add all matched files as libraries:
```
pipelines:
  dummy:
    name: " DLT with Python files"
    target: "dlt_python_files"
    libraries:
      - file:
          path: ./*.py
```
## Tests
Added unit test
## Changes
***Note: this PR relies on sync.include functionality from here:
https://github.com/databricks/cli/pull/671***
Added a transformation mutator for Python wheel tasks so that they work on
DBR < 13.1.
Using wheels uploaded to the workspace file system as cluster libraries is
not supported on DBR < 13.1.
In order to make Python wheels work correctly on DBR < 13.1 we do the
following:
1. Build and upload the Python wheel as usual
2. Transform the Python wheel task into a special notebook task which does
the following:
   a. Installs all necessary wheels with the %pip magic
   b. Executes the defined entry point with all provided parameters
3. Upload this notebook file to the workspace file system
4. Deploy the transformed job task
This is also beneficial for executing on existing clusters because this
notebook always reinstalls wheels, so if there are any changes to the wheel
package, they are correctly picked up.
## Tests
bundle.yml
```yaml
bundle:
  name: wheel-task

workspace:
  host: ****

resources:
  jobs:
    test_job:
      name: "[${bundle.environment}] My Wheel Job"
      tasks:
        - task_key: TestTask
          existing_cluster_id: "***"
          python_wheel_task:
            package_name: "my_test_code"
            entry_point: "run"
            parameters: ["first argument","first value","second argument","second value"]
          libraries:
            - whl: ./dist/*.whl
```
Output
```
andrew.nester@HFW9Y94129 wheel % databricks bundle run test_job
Run URL: ***
2023-08-03 15:58:04 "[default] My Wheel Job" TERMINATED SUCCESS
Output:
=======
Task TestTask:
Hello from my func
Got arguments v1:
['python', 'first argument', 'first value', 'second argument', 'second value']
```
## Changes
Added a run_as section to the bundle configuration.
This section allows defining a user name or service principal which will be
applied as the execution identity for jobs and DLT pipelines. In the case of
DLT, the identity defined in `run_as` will be assigned the `IS_OWNER`
permission on the pipeline.
## Tests
Added unit tests for configuration.
Also ran deploy for the following bundle configuration
```
bundle:
  name: "run_as"

run_as:
  # service_principal_name: "f7263fcc-56d0-4981-8baf-c2a45296690b"
  user_name: "lennart.kats@databricks.com"

resources:
  pipelines:
    andrew_pipeline:
      name: "Andrew Nester pipeline"
      libraries:
        - notebook:
            path: ./test.py

  jobs:
    job_one:
      name: Job One
      tasks:
        - task_key: "task"
          new_cluster:
            num_workers: 1
            spark_version: 13.2.x-snapshot-scala2.12
            node_type_id: i3.xlarge
            runtime_engine: PHOTON
          notebook_task:
            notebook_path: "./test.py"
```