databricks-cli

Commit Graph

Author	SHA1	Message	Date
Wojciech Pratkowiecki	0e7f502de6	Merge `fd67acbc7f` into `85c0d2d3ee`	2024-11-26 19:14:53 +00:00
Andrew Nester	592e1111b7	Update filenames used by bundle generate to use `.<resource-type>.yml` (#1901 ) ## Changes Update filenames used by bundle generate to use '.resource-type.yml' Similar to [Add sub-extension to resource files in built-in templates by shreyas-goenka · Pull Request #1777 · databricks/cli](https://github.com/databricks/cli/pull/1777) --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-11-20 13:53:25 +01:00
Pieter Noordhuis	886e14910c	Fix template initialization when running on Databricks (#1912 ) ## Changes When running the CLI on Databricks Runtime (DBR), use the extension-aware filer to write an instantiated template if the instance path is located in the workspace filesystem. Notebooks cannot be written through the workspace filesystem's FUSE mount. As a result, this is the only method for initializing templates that contain notebooks when running the CLI on DBR and writing to the workspace filesystem. Depends on #1910 and #1911. Supersedes #1744. ## Tests * Manually confirmed I can initialize a template with notebooks when running the CLI from the web terminal.	2024-11-20 11:42:23 +00:00
Pieter Noordhuis	4fea0219fd	Use `fs.FS` interface to read template (#1910 ) ## Changes While working on the v2 of #1744, I found that: * Template initialization first copies built-in templates to a temporary directory before initializing them * Reading a template's contents goes through a `filer.Filer` but is hardcoded to a local one This change updates the interface for reading templates to be `fs.FS`. This is compatible with the `embed.FS` type for the built-in templates, so they no longer have to be copied to a temporary directory before being used. The alternative is to use a `filer.Filer` throughout, but this would have required even more plumbing, and we don't need to _read_ templates, including notebooks, from the workspace filesystem (yet?). As part of making `template.Materialize` take an `fs.FS` argument, the logic to match a given argument to a particular built-in template in the `init` command has moved to sit next to its implementation. ## Tests Existing tests pass.	2024-11-20 09:28:35 +00:00
Andrew Nester	162aa212bc	Do not execute build on bundle destroy (#1882 ) ## Changes There's no value in building artifacts on destroy because they are just removed from workspace as part of destroy.	2024-11-07 09:31:49 +00:00
Pieter Noordhuis	edff68c763	Fix bundle run when run interactively (#1880 ) ## Changes The commit where resource lookup was factored out into a separate package (#1858) didn't take into account the use of `args` further down in the code. This change fixes that oversight by returning the tail arguments when determining which resource to run. The later call no longer has to index the `args` slice. ## Tests Manually confirmed that the command works when being prompted for the resource to run.	2024-11-05 09:30:11 +00:00
Pieter Noordhuis	1896b09350	Add bundle generate variant for dashboards (#1847 ) ## Changes This change adds the `databricks bundle generate dashboard` command. The command requires one of three flags: * `--existing-id` to generate configuration for an existing dashboard by its ID. * `--existing-path` to generate configuration for an existing dashboard by its path in the workspace file system. * `--resource` to generate the `.lvdash.json` dashboard file for a dashboard that's already defined in the bundle. This option does not impact the YAML configuration. A typical workflow could look like this: 1. Use the command with `--existing-id` or `--existing-path` for a starting point 2. Run `bundle deploy` to deploy a copy of the dashboard 3. Run `bundle open` to open this copy in your browser 4. Navigate to the draft mode and make modifications 5. Run `bundle generate dashboard` with `--resource` to update the local `.lvdash.json` file with the remote modifications ## Tests * Unit tests. * Manual walkthrough as documented in the [Dashboard for NYC Taxi Trip Analysis example](https://github.com/databricks/bundle-examples/tree/main/knowledge_base/dashboard_nyc_taxi).	2024-10-29 11:51:59 +00:00
Pieter Noordhuis	ed84a33b0a	Reuse resource resolution code for the run command (#1858 ) ## Changes As of #1846 we have a generalized package for doing resource lookups and completion. This change updates the run command to use this instead of more specific code under `bundle/run`. ## Tests * Unit tests pass * Manually confirmed that completion and prompting works	2024-10-24 13:24:30 +00:00
Pieter Noordhuis	89ee7d8a99	Add command to open a resource in the browser (#1846 ) ## Changes This builds on the functionality added in #1731 that produces a URL for every resource. Adds `bundle/resources` package to deal with resource lookups and command completion. The new functionality is similar to the lookup and command completion functionality located in `bundle/run`. It differs in that it doesn't gracefully deal with ambiguous references to resources, now that we explicitly validate this doesn't occur in the bundle configuration. It still allows resources to be looked up with their fully qualified key, `<plural type>.<key>`. ## Tests * Added unit tests for resource lookup and completion * Manually confirmed that `bundle open` prompts, accepts a key argument, and opens a browser	2024-10-24 12:20:33 +00:00
Ilia Babanov	55a055d0f5	Add "output" flag to the bundle sync command (#1853 ) ## Changes We want to use 'bundle sync' in the vscode extension before running a file as an ad-hoc job (or through the context api). Right now we use bundle deploy in these cases, but deploying bundle resources is not always expected when you just want to quickly run a file. Sync makes more sense in these cases, but we still want to have verbose output to see what's happening. In the 'deploy' command we have hidden 'verbose' flag. For the sync I've just added 'output' flag, handling both json and text cases, similar to how it's done in the non-bundle `sync` command. The flag is not hidden (although we still don't show any output by default, if the flag is not set). VSCode Extension PR: https://github.com/databricks/databricks-vscode/pull/1401 ## Tests Manually	2024-10-23 11:08:12 +00:00
shreyas-goenka	3bab21e72e	Fix race condition when restarting continuous jobs (#1849 ) ## Changes We don't need to cancel existing runs when the job is continuous and unpaused. The `/jobs/run-now` command will cancel the existing run and trigger a new one automatically. Cancelling the job manually can cause a race condition where both the manual trigger from the CLI and the continuous trigger from the job configuration happens at the same time. This PR prevents that from happening. ## Tests Unit tests and manually	2024-10-22 14:59:17 +00:00
Lennart Kats (databricks)	c5043c3d9d	Add `bundle summary` to display URLs for deployed resources (#1731 ) ## Changes Adds a textual output to the `databricks bundle summary` command, which includes URLs of deployed resources. Example usage: ``` $ databricks bundle summary Name: my_pipeline Target: dev Workspace: Host: https://domain.databricks.com User: user@databricks.com Path: /Users/user@databricks.com/.bundle/my_pipeline/dev Resources: Jobs: my_project_job: Name: [dev lennart] my_project_job URL: https://domain.databricks.com/jobs/206899209187287?o=6051921418418893 Pipelines: my_project_pipeline: Name: [dev lennart] my_project_pipeline URL: https://domain.databricks.com/pipelines/3f849fd5-ba7d-47fa-a34c-c6bf034b4f58?o=6051921418418893 ``` Notes: * The top headers of the output are the same as those from the existing `bundle validate` command * URLs are colored light blue in the output * For resources that haven't been deployed yet, we show `(not deployed)` in place of the URL --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>	2024-10-18 06:45:47 +00:00
Wojciech Pratkowiecki	a64b88b93c	Add --dry-run option for bundle deplot	2024-10-02 20:01:01 +02:00
Pieter Noordhuis	1d1aa0a416	Rename `RootPath` -> `BundleRootPath` (#1792 ) ## Changes After introducing the `SyncRootPath` field on the bundle (#1694), the previous `RootPath` became ambiguous. Does it mean the bundle root path or the sync root path? This PR renames to field to `BundleRootPath` to remove the ambiguity. ## Tests n/a --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-09-27 10:03:05 +00:00
Andrew Nester	56ed9bebf3	Added support for creating all-purpose clusters (#1698 ) ## Changes Added support for creating all-purpose clusters Example of configuration ``` bundle: name: clusters resources: clusters: test_cluster: cluster_name: "Test Cluster" num_workers: 2 node_type_id: "i3.xlarge" autoscale: min_workers: 2 max_workers: 7 spark_version: "13.3.x-scala2.12" spark_conf: "spark.executor.memory": "2g" jobs: test_job: name: "Test Job" tasks: - task_key: test_task existing_cluster_id: ${resources.clusters.test_cluster.id} notebook_task: notebook_path: "./src/test.py" targets: development: mode: development compute_id: ${resources.clusters.test_cluster.id} ``` ## Tests Added unit, config and E2E tests	2024-09-23 10:42:34 +00:00
Ilia Babanov	ac80d3dfcb	Add verbose flag to the "bundle deploy" command (#1774 ) ## Changes - Extract sync output logic from `cmd/sync` into `lib/sync` - Add hidden `verbose` flag to the `bundle deploy` command, it's false by default and hidden from the `--help` output - Pass output handler to the `deploy/files/upload` mutator if the verbose option is true The was an idea to use in-place output overriding each past file sync event in the output, bit that wont work for the extension, since it doesn't display deploy logs in the terminal. Example output: ``` ~/tmp/defpy: ~/cli/cli bundle deploy --sync-progress Building defpy... Uploading defpy-0.0.1+20240917.112755-py3-none-any.whl... Uploading bundle files to /Users/ilia.babanov@databricks.com/.bundle/defpy/dev/files... Action: PUT: requirements-dev.txt, resources/defpy_pipeline.yml, pytest.ini, src/defpy/main.py, src/defpy/__init__.py, src/dlt_pipeline.ipynb, tests/main_test.py, src/notebook.ipynb, setup.py, resources/defpy_job.yml, .vscode/extensions.json, .vscode/settings.json, fixtures/.gitkeep, .vscode/__builtins__.pyi, README.md, .gitignore, databricks.yml Uploaded tests Uploaded resources Uploaded fixtures Uploaded .vscode Uploaded src/defpy Uploaded requirements-dev.txt Uploaded .gitignore Uploaded fixtures/.gitkeep Uploaded src/defpy/__init__.py Uploaded databricks.yml Uploaded README.md Uploaded setup.py Uploaded .vscode/__builtins__.pyi Uploaded .vscode/extensions.json Uploaded src/dlt_pipeline.ipynb Uploaded .vscode/settings.json Uploaded resources/defpy_job.yml Uploaded pytest.ini Uploaded src/defpy/main.py Uploaded tests/main_test.py Uploaded resources/defpy_pipeline.yml Uploaded src/notebook.ipynb Initial Sync Complete Deploying resources... Updating deployment state... Deployment complete! ``` Output example in the extension: <img width="1843" alt="Screenshot 2024-09-19 at 11 07 48" src="https://github.com/user-attachments/assets/0fafd095-cdc6-44b8-b482-27a38ada0330"> ## Tests Manually for the `sync` and `bundle deploy` commands + vscode extension sync and deploy flows	2024-09-23 10:09:11 +00:00
Lennart Kats (databricks)	6c57683dc6	Reduce time until the prompt is shown for bundle run (#1727 ) ## Summary Makes the `databricks bundle run` command use local state before showing the menu prompt, which makes it show more quickly. For large/busy workspaces this means the prompt can show 2-3 seconds earlier.	2024-09-21 06:36:47 +00:00
Andrew Nester	66307134c1	Fixed generated YAML missing 'default' for empty values (#1765 ) ## Changes Fixed generated YAML missing 'default' for empty values ## Tests Added unit test	2024-09-11 09:49:58 +00:00
shreyas-goenka	28b39cd3f7	Make bundle JSON schema modular with `$defs` (#1700 ) ## Changes This PR makes sweeping changes to the way we generate and test the bundle JSON schema. The main benefits are: 1. More modular JSON schema. Every definition in the schema now is one level deep and points to references instead of inlining the entire schema for a field. This unblocks PyDABs from taking a dependency on the JSON schema. 2. Generate the JSON schema during CLI code generation. Directly stream it instead of computing it at runtime whenever a user calls `databricks bundle schema`. This is nice because we no longer need to embed a partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()` method to every struct in the Databricks Go SDK and remove the dependency on the OpenAPI spec altogether. It'll become more important once we decouple Go SDK structs and methods from the underlying APIs. 3. Add enum values for Go SDK fields in the JSON schema. Better autocompletion and validation for these fields. As a follow-up, we can add enum values for non-Go SDK enums as well (created internal ticket to track). 4. Use "packageName.structName" as a key to read JSON schemas from the OpenAPI spec for Go SDK structs. Before, we would use an unrolled presentation of the JSON schema (stored in `bundle_descriptions.json`), which was complex to parse and include in the final JSON schema output. This also means loading values from the OpenAPI spec for `target` schema works automatically and no longer needs custom code. 5. Support recursive types (eg: `for_each_task`). With us now using $refs everywhere it's trivial to support. 6. Using complex variables would be invalid according to the schema generated before this PR. Now that bug is fixed. In the future adding more custom rules will be easier as well due to the single level nature of the JSON schema. Since this is a complete change of approach in how we generate the JSON schema, there are a few (very minor) regressions worth calling out. 1. We'll lose a few custom descriptions for non Go SDK structs that were a part of `bundle_descriptions.json`. Support for those can be added in the future as a followup. 2. Since now the final JSON schema is a static artefact, we lose some lead time for the signal that JSON schema integration tests are failing. It's okay though since we have a lot of coverage via the existing unit tests. ## Tests Unit tests. End to end tests are being added in this PR: https://github.com/databricks/cli/pull/1726 Previous unit tests were all deleted because they were bloated. Effort was made to make the new unit tests provide (almost) equivalent coverage.	2024-09-10 13:55:18 +00:00
shreyas-goenka	3bc68e9dd2	Clarify file format required for the `config-file` flag in `bundle init` (#1651 )	2024-08-05 12:24:22 +00:00
shreyas-goenka	89c0af5bdc	Add resource for UC schemas to DABs (#1413 ) ## Changes This PR adds support for UC Schemas to DABs. This allows users to define schemas for tables and other assets their pipelines/workflows create as part of the DAB, thus managing the life-cycle in the DAB. The first version has a couple of intentional limitations: 1. The owner of the schema will be the deployment user. Changing the owner of the schema is not allowed (yet). `run_as` will not be restricted for DABs containing UC schemas. Let's limit the scope of run_as to the compute identity used instead of ownership of data assets like UC schemas. 2. API fields that are present in the update API but not the create API. For example: enabling predictive optimization is not supported in the create schema API and thus is not available in DABs at the moment. ## Tests Manually and integration test. Manually verified the following work: 1. Development mode adds a "dev_" prefix. 2. Modified status is correctly computed in the `bundle summary` command. 3. Grants work as expected, for assigning privileges. 4. Variable interpolation works for the schema ID.	2024-07-31 12:16:28 +00:00
Gleb Kanterov	af975ca64b	Print diagnostics in 'bundle deploy' (#1579 ) ## Changes Print diagnostics in 'bundle deploy' similar to 'bundle validate'. This way if a bundle has any errors or warnings, they are going to be easy to notice. NB: due to how we render errors, there is one extra trailing new line in output, preserved in examples below ## Example: No errors or warnings ``` % databricks bundle deploy Building default... Deploying resources... Updating deployment state... Deployment complete! ``` ## Example: Error on load ``` % databricks bundle deploy Error: Databricks CLI version constraint not satisfied. Required: >= 1337.0.0, current: 0.0.0-dev ``` ## Example: Warning on load ``` % databricks bundle deploy Building default... Deploying resources... Updating deployment state... Deployment complete! Warning: unknown field: foo in databricks.yml:6:1 ``` ## Example: Error + warning on load ``` % databricks bundle deploy Warning: unknown field: foo in databricks.yml:6:1 Error: something went wrong ``` ## Example: Warning on load + error in init ``` % databricks bundle deploy Warning: unknown field: foo in databricks.yml:6:1 Error: Failed to xxx in yyy.yml Detailed explanation in multiple lines ``` ## Tests Tested manually	2024-07-10 11:14:57 +00:00
shreyas-goenka	1da04a4318	Remove schema override for variable default value (#1536 ) ## Changes This PR: 1. Removes the custom added in https://github.com/databricks/cli/pull/1396/files for `variables..default`. It's no longer needed because with complex variables (https://github.com/databricks/cli/pull/1467) `default` has a type of any. 2. Retains, and extends the override on `targets..variables.*`. Target override values can now be complex objects, not just primitive values. ## Tests Manually Before: Only primitive types were allowed. <img width="436" alt="Screenshot 2024-06-27 at 3 58 34 PM" src="https://github.com/databricks/cli/assets/88374338/de55be5c-9236-4ccc-b529-97ce7201b8e8"> After: An empty JSON schema is generated. All YAML values are acceptable. <img width="453" alt="Screenshot 2024-06-27 at 3 57 15 PM" src="https://github.com/databricks/cli/assets/88374338/7534b770-563f-4efc-b9c6-0af526dcc705">	2024-07-10 06:57:27 +00:00
Gleb Kanterov	d30c4c730d	Add new template (#1578 ) ## Changes Add a new hidden experimental template ## Tests Tested manually	2024-07-08 13:32:56 +00:00
Gleb Kanterov	e8b76a7f13	Improve `bundle validate` output (#1532 ) ## Changes This combination of changes allows pretty-printing errors happening during the "load" and "init" phases, including their locations. Move to render code into a separate module dedicated to rendering `diag.Diagnostics` in a human-readable format. This will be used for the `bundle deploy` command. Preserve the "bundle" value if an error occurs in mutators. Rewrite the go templates to handle the case when the bundle isn't yet loaded if an error occurs during loading, that is possible now. Improve rendering for errors and warnings: - don't render empty locations - render "details" for errors if they exist Add `root.ErrAlreadyPrinted` indicating that the error was already printed, and the CLI entry point shouldn't print it again. ## Tests Add tests for output, that are especially handy to detect extra newlines	2024-07-01 09:01:10 +00:00
shreyas-goenka	553fdd1e81	Serialize dynamic value for `bundle validate` output (#1499 ) ## Changes Using dynamic values allows us to retain references like `${resources.jobs...}` even when the type of field is not integer, eg: `run_job_task`, or in general values that do not map to the Go types for a field. ## Tests Integration test	2024-06-18 15:04:20 +00:00
shreyas-goenka	44e3928d6a	Avoid multiple file tree traversals on bundle deploy (#1493 ) ## Changes To run bundle deploy from DBR we use an abstraction over the workspace import / export APIs to create a `filer.Filer` and abstract the file system. Walking the file tree in such a filer is expensive and requires multiple API calls. This PR remove the two duplicate file tree walks that happen by caching the result.	2024-06-17 09:48:52 +00:00
Lennart Kats (databricks)	aa36aee159	Make dbt-sql and default-sql templates public (#1463 ) ## Changes This makes the dbt-sql and default-sql templates public. These templates were previously not listed and marked "experimental" since structured streaming tables were still in gated preview and would result in weird error messages when a workspace wasn't enabled for the preview. This PR also incorporates some of the feedback and learnings for these templates so far.	2024-06-04 08:57:13 +00:00
Pieter Noordhuis	db84a707cd	Fix bundle documentation URL (#1399 ) Closes #1395.	2024-04-25 11:25:26 +00:00
shreyas-goenka	d949f2b4f2	Fix bundle schema for variables (#1396 ) ## Changes This PR fixes the variable schema to: 1. Allow non-string values in the "default" value of a variable. 2. Allow non-string overrides in a target for a variable. ## Tests Manually. There are no longer squiggly lines. Before: <img width="329" alt="Screenshot 2024-04-24 at 3 26 43 PM" src="https://github.com/databricks/cli/assets/88374338/43be02c2-80a4-4f80-bd79-0f3e1e93ee17"> After: <img width="361" alt="Screenshot 2024-04-24 at 3 26 10 PM" src="https://github.com/databricks/cli/assets/88374338/2c1fb892-a2a2-478b-8d2e-9bda6d844b54">	2024-04-25 11:23:50 +00:00
Pieter Noordhuis	3108883a8f	Processing and completion of positional args to bundle run (#1120 ) ## Changes With this change, both job parameters and task parameters can be specified as positional arguments to bundle run. How the positional arguments are interpreted depends on the configuration of the job. ### Examples: For a job that has job parameters configured a user can specify: ``` databricks bundle run my_job -- --param1=value1 --param2=value2 ``` And the run is kicked off with job parameters set to: ```json { "param1": "value1", "param2": "value2" } ``` Similarly, for a job that doesn't use job parameters and only has `notebook_task` tasks, a user can specify: ``` databricks bundle run my_notebook_job -- --param1=value1 --param2=value2 ``` And the run is kicked off with task level `notebook_params` configured as: ```json { "param1": "value1", "param2": "value2" } ``` For a job that doesn't doesn't use job parameters and only has either `spark_python_task` or `python_wheel_task` tasks, a user can specify: ``` databricks bundle run my_python_file_job -- --flag=value other arguments ``` And the run is kicked off with task level `python_params` configured as: ```json [ "--flag=value", "other", "arguments" ] ``` The same is applied to jobs with only `spark_jar_task` or `spark_submit_task` tasks. ## Tests Unit tests. Tested the completions manually.	2024-04-22 11:50:13 +00:00
shreyas-goenka	331313ea5f	Print host in `bundle validate` when passed via profile or environment variables (#1378 ) ## Changes Fixes to get host from the workspace client rather than only printing the host when it's configured in the bundle config. ## Tests Manually. When a profile was specified for auth. Before: ``` ➜ bundle-playground git:(master) ✗ cli bundle validate Name: bundle-playground Target: default Workspace: Host: User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default ``` After: ``` ➜ bundle-playground git:(master) ✗ cli bundle validate Name: bundle-playground Target: default Workspace: Host: https://e2-dogfood.staging.cloud.databricks.com User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default ```	2024-04-19 11:43:50 +00:00
Andrew Nester	27f51c760f	Added validate mutator to surface additional bundle warnings (#1352 ) ## Changes All these validators will return warnings as part of `bundle validate` run Added 2 mutators: 1. To check that if tasks use job_cluster_key it is actually defined 2. To check if there are any files to sync as part of deployment Also added `bundle.Parallel` to run them in parallel To make sure mutators under bundle.Parallel do not mutate config, introduced new `ReadOnlyMutator`, `ReadOnlyBundle` and `ReadOnlyConfig`. Example ``` databricks bundle validate -p deco-staging Warning: unknown field: new_cluster at resources.jobs.my_job in bundle.yml:24:7 Warning: job_cluster_key high_cpu_workload_job_cluster is not defined at resources.jobs.my_job.tasks[0].job_cluster_key in bundle.yml:35:28 Warning: There are no files to sync, please check your your .gitignore and sync.exclude configuration at sync.exclude in bundle.yml:18:5 Name: test Target: default Workspace: Host: https://acme.databricks.com User: andrew.nester@databricks.com Path: /Users/andrew.nester@databricks.com/.bundle/test/default Found 3 warnings ``` ## Tests Added unit tests	2024-04-18 15:13:16 +00:00
Pieter Noordhuis	04cbc7171e	Make bundle validation print text output by default (#1335 ) ## Changes It now shows human-readable warnings and validation status. ## Tests * Manual tests against many examples. * Errors still return immediately.	2024-04-03 15:33:43 +00:00
dependabot[bot]	f28a9d7107	Bump github.com/databricks/databricks-sdk-go from 0.36.0 to 0.37.0 (#1326 ) [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/databricks/databricks-sdk-go&package-manager=go_modules&previous-version=0.36.0&new-version=0.37.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Nester <andrew.nester@databricks.com>	2024-04-03 10:39:53 +00:00
Ilia Babanov	079c416f8d	Add `bundle debug terraform` command (#1294 ) - Add `bundle debug terraform` command. It prints versions of the Terraform and the Databricks Terraform provider. In the text mode it also explains how to setup the CLI in environments with restricted internet access. - Use `DATABRICKS_TF_EXEC_PATH` env var to point Databricks CLI to the Terraform binary. The CLI only uses it if `DATABRICKS_TF_VERSION` matches the currently used terraform version. - Use `DATABRICKS_TF_CLI_CONFIG_FILE` env var to point Terraform CLI config that points to the filesystem mirror for the Databricks provider. The CLI only uses it if `DATABRICKS_TF_PROVIDER_VERSION` matches the currently used provider version. Relevant PR on the VSCode extension side: https://github.com/databricks/databricks-vscode/pull/1147 Example output of the `databricks bundle debug terraform`: ``` Terraform version: 1.5.5 Terraform URL: https://releases.hashicorp.com/terraform/1.5.5 Databricks Terraform Provider version: 1.38.0 Databricks Terraform Provider URL: https://github.com/databricks/terraform-provider-databricks/releases/tag/v1.38.0 Databricks CLI downloads its Terraform dependencies automatically. If you run the CLI in an air-gapped environment, you can download the dependencies manually and set these environment variables: DATABRICKS_TF_VERSION=1.5.5 DATABRICKS_TF_EXEC_PATH=/path/to/terraform/binary DATABRICKS_TF_PROVIDER_VERSION=1.38.0 DATABRICKS_TF_CLI_CONFIG_FILE=/path/to/terraform/cli/config.tfrc Here is an example *.tfrc configuration file: disable_checkpoint = true provider_installation { filesystem_mirror { path = "/path/to/a/folder/with/databricks/terraform/provider" } } The filesystem mirror path should point to the folder with the Databricks Terraform Provider. The folder should have this structure: /registry.terraform.io/databricks/databricks/terraform-provider-databricks_1.38.0_ARCH.zip For more information about filesystem mirrors, see the Terraform documentation: https://developer.hashicorp.com/terraform/cli/config/config-file#filesystem_mirror ``` --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-04-02 12:56:27 +00:00
Pieter Noordhuis	eea34b2504	Return diagnostics from `config.Load` (#1324 ) ## Changes We no longer need to store load diagnostics on the `config.Root` type itself and instead can return them from the `config.Load` call directly. It is up to the caller of this function to append them to previous diagnostics, if any. Background: previous commits moved configuration loading of the entry point into a mutator, so now all diagnostics naturally flow from applying mutators. This PR depends on #1319. ## Tests Unit and manual validation of the debug statements in the validate command.	2024-03-28 10:59:03 +00:00
Pieter Noordhuis	b21e3c81cd	Make bundle loaders return diagnostics (#1319 ) ## Changes The function signature of Cobra's `PreRunE` function has an `error` return value. We'd like to start returning `diag.Diagnostics` after loading a bundle, so this is incompatible. This change modifies all usage of `PreRunE` to load a bundle to inline function calls in the command's `RunE` function. ## Tests * Unit tests pass. * Integration tests pass.	2024-03-28 10:32:34 +00:00
Pieter Noordhuis	00d76d5afa	Move path field to bundle type (#1316 ) ## Changes The bundle path was previously stored on the `config.Root` type under the assumption that the first configuration file being loaded would set it. This is slightly counterintuitive and we know what the path is upon construction of the bundle. The new location for this property reflects this. ## Tests Unit tests pass.	2024-03-27 09:03:24 +00:00
Pieter Noordhuis	ed194668db	Return `diag.Diagnostics` from mutators (#1305 ) ## Changes This diagnostics type allows us to capture multiple warnings as well as errors in the return value. This is a preparation for returning additional warnings from mutators in case we detect non-fatal problems. * All return statements that previously returned an error now return `diag.FromErr` * All return statements that previously returned `fmt.Errorf` now return `diag.Errorf` * All `err != nil` checks now use `diags.HasError()` or `diags.Error()` ## Tests * Existing tests pass. * I confirmed no call site under `./bundle` or `./cmd/bundle` uses `errors.Is` on the return value from mutators. This is relevant because we cannot wrap errors with `%w` when calling `diag.Errorf` (like `fmt.Errorf`; context in https://github.com/golang/go/issues/47641).	2024-03-25 14:18:47 +00:00
Andrew Nester	1b0ac61093	Added deployment state for bundles (#1267 ) ## Changes This PR introduces new structure (and a file) being used locally and synced remotely to Databricks workspace to track bundle deployment related metadata. The state is pulled from remote, updated and pushed back remotely as part of `bundle deploy` command. This state can be used for deployment sequencing as it's `Version` field is monotonically increasing on each deployment. Currently, it only tracks files being synced as part of the deployment. This helps fix the issue with files not being removed during deployments on CI/CD as sync snapshot was never present there. Fixes #943 ## Tests Added E2E (regression) test for files removal on CI/CD --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-03-18 14:41:58 +00:00
Andrew Nester	c7818560ca	Add usage string when command fails with incorrect arguments (#1276 ) ## Changes Add usage string when command fails with incorrect arguments Fixes #1119 ## Tests Example output ``` > databricks libraries cluster-status Error: accepts 1 arg(s), received 0 Usage: databricks libraries cluster-status CLUSTER_ID [flags] Flags: -h, --help help for cluster-status Global Flags: --debug enable debug logging -o, --output type output type: text or json (default text) -p, --profile string ~/.databrickscfg profile -t, --target string bundle target to use (if applicable) ```	2024-03-12 14:12:34 +00:00
Pieter Noordhuis	e1407038d3	Configure cobra.NoArgs for bundle commands where applicable (#1250 ) ## Changes Return an error if unused arguments are passed to these commands. ## Tests n/a	2024-03-01 15:50:20 +00:00
Ilia Babanov	d12f88e24d	Fix summary command when internal terraform config doesn't exist (#1242 ) Check if `bundle.tf.json` doesn't exist and create it before executing `terraform init` (inside `terraform.Load`) Fixes a problem when during `terraform.Load` it fails with: ``` Error: Failed to load plugin schemas Error while loading schemas for plugin components: Failed to obtain provider schema: Could not load the schema for provider registry.terraform.io/databricks/databricks: failed to instantiate provider "registry.terraform.io/databricks/databricks" to obtain schema: unavailable provider "registry.terraform.io/databricks/databricks".. ```	2024-03-01 08:25:12 +00:00
Andrew Nester	1b4a774609	Only set ComputeID value when `--compute-id` flag provided (#1229 ) ## Changes Fixes an issue when `compute_id` is defined in the bundle config, correctly replaced in `validate` command but not used in `deploy` command ## Tests Manually	2024-02-22 15:14:06 +00:00
Lennart Kats (databricks)	162b115e19	Add an experimental default-sql template (#1051 ) ## Changes This adds a `default-sql` template! In this latest revision, I've hidden the new template from the list so we can merge it, iterate over it, and properly release the template at the right time. - [x] WorkspaceFS support for .sql files is in prod - [x] SQL extension is preconfigured based on extension settings (if possible) - [ ] Streaming tables support is either ungated or the template provides instructions about signup - _Mitigation for now: this template is hidden from the list of templates._ - [x] Support non-UC workspaces ## Tests - [x] Unit tests - [x] Manual testing - [x] More manual testing - [x] Reviewer testing --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>	2024-02-19 12:01:11 +00:00
Lennart Kats (databricks)	1c680121c8	Add an experimental dbt-sql template (#1059 ) ## Changes This adds a new dbt-sql template. This work requires the new WorkspaceFS support for dbt tasks. In this latest revision, I've hidden the new template from the list so we can merge it, iterate over it, and propertly release the template at the right time. Blockers: - [x] WorkspaceFS support for dbt projects is in prod - [x] Move dbt files into a subdirectory - [ ] Wait until the next (>1.7.4) release of the dbt plugin which will have major improvements! - _Rather than wait, this template is hidden from the list of templates._ - [x] SQL extension is preconfigured based on extension settings (if possible) - MV / streaming tables: - [x] Add to template - [x] Fix https://github.com/databricks/dbt-databricks/issues/535 (to be released with in 1.7.4) - [x] Merge https://github.com/databricks/dbt-databricks/pull/338 (to be released with in 1.7.4) - [ ] Fix "too many 503 errors" issue (https://github.com/databricks/dbt-databricks/issues/570, internal tracker: ES-1009215, ES-1014138) - [x] Support ANSI mode in the template - [ ] Streaming tables support is either ungated or the template provides instructions about signup - _Mitigation for now: this template is hidden from the list of templates._ - [x] Support non-workspace-admin deployment - [x] Make sure `data_security_mode: SINGLE_USER` works on non-UC workspaces (it's required to be explicitly specified on UC workspaces with single-node clusters) - [x] Support non-UC workspaces ## Tests - [x] Unit tests - [x] Manual testing - [x] More manual testing - [ ] Reviewer manual testing - _I'd like to do a small bug bash post-merging._ - [x] Unit tests	2024-02-19 09:15:17 +00:00
Pieter Noordhuis	87dd46a3f8	Use dynamic configuration model in bundles (#1098 ) ## Changes This is a fundamental change to how we load and process bundle configuration. We now depend on the configuration being represented as a `dyn.Value`. This representation is functionally equivalent to Go's `any` (it is variadic) and allows us to capture metadata associated with a value, such as where it was defined (e.g. file, line, and column). It also allows us to represent Go's zero values properly (e.g. empty string, integer equal to 0, or boolean false). Using this representation allows us to let the configuration model deviate from the typed structure we have been relying on so far (`config.Root`). We need to deviate from these types when using variables for fields that are not a string themselves. For example, using `${var.num_workers}` for an integer `workers` field was impossible until now (though not implemented in this change). The loader for a `dyn.Value` includes functionality to capture any and all type mismatches between the user-defined configuration and the expected types. These mismatches can be surfaced as validation errors in future PRs. Given that many mutators expect the typed struct to be the source of truth, this change converts between the dynamic representation and the typed representation on mutator entry and exit. Existing mutators can continue to modify the typed representation and these modifications are reflected in the dynamic representation (see `MarkMutatorEntry` and `MarkMutatorExit` in `bundle/config/root.go`). Required changes included in this change: * The existing interpolation package is removed in favor of `libs/dyn/dynvar`. * Functionality to merge job clusters, job tasks, and pipeline clusters are now all broken out into their own mutators. To be implemented later: * Allow variable references for non-string types. * Surface diagnostics about the configuration provided by the user in the validation output. * Some mutators use a resource's configuration file path to resolve related relative paths. These depend on `bundle/config/paths.Path` being set and populated through `ConfigureConfigFilePath`. Instead, they should interact with the dynamically typed configuration directly. Doing this also unlocks being able to differentiate different base paths used within a job (e.g. a task override with a relative path defined in a directory other than the base job). ## Tests * Existing unit tests pass (some have been modified to accommodate) * Integration tests pass	2024-02-16 19:41:58 +00:00
Andrew Nester	e474948a4b	Generate correct YAML if custom_tags or spark_conf is used for pipeline or job cluster configuration (#1210 ) These fields (key and values) needs to be double quoted in order for yaml loader to read, parse and unmarshal it into Go struct correctly because these fields are `map[string]string` type. ## Tests Added regression unit and E2E tests	2024-02-15 15:03:19 +00:00
Andrew Nester	80670eceed	Added `bundle deployment bind` and `unbind` command (#1131 ) ## Changes Added `bundle deployment bind` and `unbind` command. This command allows to bind bundle-defined resources to existing resources in Databricks workspace so they become DABs-managed. ## Tests Manually + added E2E test	2024-02-14 18:04:45 +00:00

1 2 3

123 Commits