databricks-cli

Commit Graph

Author	SHA1	Message	Date
Shreyas Goenka	0d0e45cc2c	moved tests	2025-01-03 16:18:16 +05:30
Shreyas Goenka	ea0bc1705c	more progress on cleaning up the APIs	2025-01-03 16:13:53 +05:30
Shreyas Goenka	2357954885	Add type for mock logger and some cleanup	2025-01-03 11:31:22 +05:30
Shreyas Goenka	0ac536f188	lint	2025-01-03 00:46:07 +05:30
Shreyas Goenka	d66fdcbbf8	merge	2025-01-02 18:14:28 +05:30
shreyas-goenka	7beb0fb8b5	Add validation mutator for volume `artifact_path` (#2050 ) ## Changes This PR: 1. Incrementally improves the error messages shown to the user when the volume they are referring to in `workspace.artifact_path` does not exist. 2. Performs this validation in both `bundle validate` and `bundle deploy` compared to before on just deployments. 3. It runs "fast" validations on `bundle deploy`, which earlier were only run on `bundle validate`. ## Tests Unit tests and manually. Also, existing integration tests provide coverage (`TestUploadArtifactToVolumeNotYetDeployed`, `TestUploadArtifactFileToVolumeThatDoesNotExist`) Examples: ``` .venv➜ bundle-playground git:(master) ✗ cli bundle validate Error: cannot access volume capital.whatever.my_volume: User does not have READ VOLUME on Volume 'capital.whatever.my_volume'. at workspace.artifact_path in databricks.yml:7:18 ``` and ``` .venv➜ bundle-playground git:(master) ✗ cli bundle validate Error: volume capital.whatever.foobar does not exist at workspace.artifact_path resources.volumes.foo in databricks.yml:7:18 databricks.yml:12:7 You are using a volume in your artifact_path that is managed by this bundle but which has not been deployed yet. Please first deploy the volume using 'bundle deploy' and then switch over to using it in the artifact_path. ```	2025-01-02 17:23:15 +05:30
Denis Bilenko	0b80784df7	Enable testifylint and fix the issues (#2065 ) ## Changes - Enable new linter: testifylint. - Apply fixes with --fix. - Fix remaining issues (mostly with aider). There were 2 cases we --fix did the wrong thing - this seems to a be a bug in linter: https://github.com/Antonboom/testifylint/issues/210 Nonetheless, I kept that check enabled, it seems useful, just need to be fixed manually after autofix. ## Tests Existing tests	2025-01-02 12:03:41 +01:00
Shreyas Goenka	9fb2c66c4a	add debug log	2024-12-30 12:12:05 +05:30
Shreyas Goenka	4563397edc	more tests and cleanup	2024-12-27 14:32:42 +05:30
Shreyas Goenka	44d43fccb7	add integration tests and bug fixes	2024-12-27 13:27:24 +05:30
Shreyas Goenka	ecb977f4ed	cleanup code	2024-12-27 12:19:42 +05:30
Shreyas Goenka	a3fbdd33a5	-	2024-12-27 11:48:44 +05:30
Shreyas Goenka	e4f088a65c	add event proper	2024-12-27 11:44:33 +05:30
Denis Bilenko	2e018cfaec	Enable gofumpt and goimports in golangci-lint (#1999 ) ## Changes Enable gofumpt and goimports in golangci-lint and apply autofix. This makes 'make fmt' redundant, will be cleaned up in follow up diff. ## Tests Existing tests.	2024-12-12 10:28:42 +01:00
Denis Bilenko	8d5351c1c3	Enable errcheck everywhere and fix or silent remaining issues (#1987 ) ## Changes Enable errcheck linter for the whole codebase. Fix remaining complaints: - If we can propagate error to caller, do that - If we writing to stdout, continue ignoring errors (to avoid crashing in "cli \| head" case) - Add exception for cobra non-critical API such as MarkHidden/MarkDeprecated/RegisterFlagCompletionFunc. This keeps current code and behaviour, to be decided later if we want to change this. - Continue ignoring errors where that is desired behaviour (e.g. git.loadConfig). - Continue ignoring errors where panicking seems riskier than ignoring the error. - Annotate cases in libs/dyn with //nolint:errcheck - to be addressed later. Note, this PR is not meant to come up with the best strategy for each case, but to be a relative safe change to enable errcheck linter. ## Tests Existing tests.	2024-12-11 13:26:00 +01:00
Denis Bilenko	1b2be1b2cb	Add error checking in tests and enable errcheck there (#1980 ) ## Changes Fix all errcheck-found issues in tests and test helpers. Mostly this done by adding require.NoError(t, err), sometimes panic() where t object is not available). Initial change is obtained with aider+claude, then manually reviewed and cleaned up. ## Tests Existing tests.	2024-12-09 13:56:41 +01:00
Andrew Nester	592e1111b7	Update filenames used by bundle generate to use `.<resource-type>.yml` (#1901 ) ## Changes Update filenames used by bundle generate to use '.resource-type.yml' Similar to [Add sub-extension to resource files in built-in templates by shreyas-goenka · Pull Request #1777 · databricks/cli](https://github.com/databricks/cli/pull/1777) --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-11-20 13:53:25 +01:00
Pieter Noordhuis	886e14910c	Fix template initialization when running on Databricks (#1912 ) ## Changes When running the CLI on Databricks Runtime (DBR), use the extension-aware filer to write an instantiated template if the instance path is located in the workspace filesystem. Notebooks cannot be written through the workspace filesystem's FUSE mount. As a result, this is the only method for initializing templates that contain notebooks when running the CLI on DBR and writing to the workspace filesystem. Depends on #1910 and #1911. Supersedes #1744. ## Tests * Manually confirmed I can initialize a template with notebooks when running the CLI from the web terminal.	2024-11-20 11:42:23 +00:00
Pieter Noordhuis	4fea0219fd	Use `fs.FS` interface to read template (#1910 ) ## Changes While working on the v2 of #1744, I found that: * Template initialization first copies built-in templates to a temporary directory before initializing them * Reading a template's contents goes through a `filer.Filer` but is hardcoded to a local one This change updates the interface for reading templates to be `fs.FS`. This is compatible with the `embed.FS` type for the built-in templates, so they no longer have to be copied to a temporary directory before being used. The alternative is to use a `filer.Filer` throughout, but this would have required even more plumbing, and we don't need to _read_ templates, including notebooks, from the workspace filesystem (yet?). As part of making `template.Materialize` take an `fs.FS` argument, the logic to match a given argument to a particular built-in template in the `init` command has moved to sit next to its implementation. ## Tests Existing tests pass.	2024-11-20 09:28:35 +00:00
Andrew Nester	162aa212bc	Do not execute build on bundle destroy (#1882 ) ## Changes There's no value in building artifacts on destroy because they are just removed from workspace as part of destroy.	2024-11-07 09:31:49 +00:00
Pieter Noordhuis	edff68c763	Fix bundle run when run interactively (#1880 ) ## Changes The commit where resource lookup was factored out into a separate package (#1858) didn't take into account the use of `args` further down in the code. This change fixes that oversight by returning the tail arguments when determining which resource to run. The later call no longer has to index the `args` slice. ## Tests Manually confirmed that the command works when being prompted for the resource to run.	2024-11-05 09:30:11 +00:00
Pieter Noordhuis	1896b09350	Add bundle generate variant for dashboards (#1847 ) ## Changes This change adds the `databricks bundle generate dashboard` command. The command requires one of three flags: * `--existing-id` to generate configuration for an existing dashboard by its ID. * `--existing-path` to generate configuration for an existing dashboard by its path in the workspace file system. * `--resource` to generate the `.lvdash.json` dashboard file for a dashboard that's already defined in the bundle. This option does not impact the YAML configuration. A typical workflow could look like this: 1. Use the command with `--existing-id` or `--existing-path` for a starting point 2. Run `bundle deploy` to deploy a copy of the dashboard 3. Run `bundle open` to open this copy in your browser 4. Navigate to the draft mode and make modifications 5. Run `bundle generate dashboard` with `--resource` to update the local `.lvdash.json` file with the remote modifications ## Tests * Unit tests. * Manual walkthrough as documented in the [Dashboard for NYC Taxi Trip Analysis example](https://github.com/databricks/bundle-examples/tree/main/knowledge_base/dashboard_nyc_taxi).	2024-10-29 11:51:59 +00:00
Pieter Noordhuis	ed84a33b0a	Reuse resource resolution code for the run command (#1858 ) ## Changes As of #1846 we have a generalized package for doing resource lookups and completion. This change updates the run command to use this instead of more specific code under `bundle/run`. ## Tests * Unit tests pass * Manually confirmed that completion and prompting works	2024-10-24 13:24:30 +00:00
Pieter Noordhuis	89ee7d8a99	Add command to open a resource in the browser (#1846 ) ## Changes This builds on the functionality added in #1731 that produces a URL for every resource. Adds `bundle/resources` package to deal with resource lookups and command completion. The new functionality is similar to the lookup and command completion functionality located in `bundle/run`. It differs in that it doesn't gracefully deal with ambiguous references to resources, now that we explicitly validate this doesn't occur in the bundle configuration. It still allows resources to be looked up with their fully qualified key, `<plural type>.<key>`. ## Tests * Added unit tests for resource lookup and completion * Manually confirmed that `bundle open` prompts, accepts a key argument, and opens a browser	2024-10-24 12:20:33 +00:00
Ilia Babanov	55a055d0f5	Add "output" flag to the bundle sync command (#1853 ) ## Changes We want to use 'bundle sync' in the vscode extension before running a file as an ad-hoc job (or through the context api). Right now we use bundle deploy in these cases, but deploying bundle resources is not always expected when you just want to quickly run a file. Sync makes more sense in these cases, but we still want to have verbose output to see what's happening. In the 'deploy' command we have hidden 'verbose' flag. For the sync I've just added 'output' flag, handling both json and text cases, similar to how it's done in the non-bundle `sync` command. The flag is not hidden (although we still don't show any output by default, if the flag is not set). VSCode Extension PR: https://github.com/databricks/databricks-vscode/pull/1401 ## Tests Manually	2024-10-23 11:08:12 +00:00
shreyas-goenka	3bab21e72e	Fix race condition when restarting continuous jobs (#1849 ) ## Changes We don't need to cancel existing runs when the job is continuous and unpaused. The `/jobs/run-now` command will cancel the existing run and trigger a new one automatically. Cancelling the job manually can cause a race condition where both the manual trigger from the CLI and the continuous trigger from the job configuration happens at the same time. This PR prevents that from happening. ## Tests Unit tests and manually	2024-10-22 14:59:17 +00:00
Lennart Kats (databricks)	c5043c3d9d	Add `bundle summary` to display URLs for deployed resources (#1731 ) ## Changes Adds a textual output to the `databricks bundle summary` command, which includes URLs of deployed resources. Example usage: ``` $ databricks bundle summary Name: my_pipeline Target: dev Workspace: Host: https://domain.databricks.com User: user@databricks.com Path: /Users/user@databricks.com/.bundle/my_pipeline/dev Resources: Jobs: my_project_job: Name: [dev lennart] my_project_job URL: https://domain.databricks.com/jobs/206899209187287?o=6051921418418893 Pipelines: my_project_pipeline: Name: [dev lennart] my_project_pipeline URL: https://domain.databricks.com/pipelines/3f849fd5-ba7d-47fa-a34c-c6bf034b4f58?o=6051921418418893 ``` Notes: * The top headers of the output are the same as those from the existing `bundle validate` command * URLs are colored light blue in the output * For resources that haven't been deployed yet, we show `(not deployed)` in place of the URL --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>	2024-10-18 06:45:47 +00:00
Pieter Noordhuis	1d1aa0a416	Rename `RootPath` -> `BundleRootPath` (#1792 ) ## Changes After introducing the `SyncRootPath` field on the bundle (#1694), the previous `RootPath` became ambiguous. Does it mean the bundle root path or the sync root path? This PR renames to field to `BundleRootPath` to remove the ambiguity. ## Tests n/a --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-09-27 10:03:05 +00:00
Andrew Nester	56ed9bebf3	Added support for creating all-purpose clusters (#1698 ) ## Changes Added support for creating all-purpose clusters Example of configuration ``` bundle: name: clusters resources: clusters: test_cluster: cluster_name: "Test Cluster" num_workers: 2 node_type_id: "i3.xlarge" autoscale: min_workers: 2 max_workers: 7 spark_version: "13.3.x-scala2.12" spark_conf: "spark.executor.memory": "2g" jobs: test_job: name: "Test Job" tasks: - task_key: test_task existing_cluster_id: ${resources.clusters.test_cluster.id} notebook_task: notebook_path: "./src/test.py" targets: development: mode: development compute_id: ${resources.clusters.test_cluster.id} ``` ## Tests Added unit, config and E2E tests	2024-09-23 10:42:34 +00:00
Ilia Babanov	ac80d3dfcb	Add verbose flag to the "bundle deploy" command (#1774 ) ## Changes - Extract sync output logic from `cmd/sync` into `lib/sync` - Add hidden `verbose` flag to the `bundle deploy` command, it's false by default and hidden from the `--help` output - Pass output handler to the `deploy/files/upload` mutator if the verbose option is true The was an idea to use in-place output overriding each past file sync event in the output, bit that wont work for the extension, since it doesn't display deploy logs in the terminal. Example output: ``` ~/tmp/defpy: ~/cli/cli bundle deploy --sync-progress Building defpy... Uploading defpy-0.0.1+20240917.112755-py3-none-any.whl... Uploading bundle files to /Users/ilia.babanov@databricks.com/.bundle/defpy/dev/files... Action: PUT: requirements-dev.txt, resources/defpy_pipeline.yml, pytest.ini, src/defpy/main.py, src/defpy/__init__.py, src/dlt_pipeline.ipynb, tests/main_test.py, src/notebook.ipynb, setup.py, resources/defpy_job.yml, .vscode/extensions.json, .vscode/settings.json, fixtures/.gitkeep, .vscode/__builtins__.pyi, README.md, .gitignore, databricks.yml Uploaded tests Uploaded resources Uploaded fixtures Uploaded .vscode Uploaded src/defpy Uploaded requirements-dev.txt Uploaded .gitignore Uploaded fixtures/.gitkeep Uploaded src/defpy/__init__.py Uploaded databricks.yml Uploaded README.md Uploaded setup.py Uploaded .vscode/__builtins__.pyi Uploaded .vscode/extensions.json Uploaded src/dlt_pipeline.ipynb Uploaded .vscode/settings.json Uploaded resources/defpy_job.yml Uploaded pytest.ini Uploaded src/defpy/main.py Uploaded tests/main_test.py Uploaded resources/defpy_pipeline.yml Uploaded src/notebook.ipynb Initial Sync Complete Deploying resources... Updating deployment state... Deployment complete! ``` Output example in the extension: <img width="1843" alt="Screenshot 2024-09-19 at 11 07 48" src="https://github.com/user-attachments/assets/0fafd095-cdc6-44b8-b482-27a38ada0330"> ## Tests Manually for the `sync` and `bundle deploy` commands + vscode extension sync and deploy flows	2024-09-23 10:09:11 +00:00
Lennart Kats (databricks)	6c57683dc6	Reduce time until the prompt is shown for bundle run (#1727 ) ## Summary Makes the `databricks bundle run` command use local state before showing the menu prompt, which makes it show more quickly. For large/busy workspaces this means the prompt can show 2-3 seconds earlier.	2024-09-21 06:36:47 +00:00
Andrew Nester	66307134c1	Fixed generated YAML missing 'default' for empty values (#1765 ) ## Changes Fixed generated YAML missing 'default' for empty values ## Tests Added unit test	2024-09-11 09:49:58 +00:00
shreyas-goenka	28b39cd3f7	Make bundle JSON schema modular with `$defs` (#1700 ) ## Changes This PR makes sweeping changes to the way we generate and test the bundle JSON schema. The main benefits are: 1. More modular JSON schema. Every definition in the schema now is one level deep and points to references instead of inlining the entire schema for a field. This unblocks PyDABs from taking a dependency on the JSON schema. 2. Generate the JSON schema during CLI code generation. Directly stream it instead of computing it at runtime whenever a user calls `databricks bundle schema`. This is nice because we no longer need to embed a partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()` method to every struct in the Databricks Go SDK and remove the dependency on the OpenAPI spec altogether. It'll become more important once we decouple Go SDK structs and methods from the underlying APIs. 3. Add enum values for Go SDK fields in the JSON schema. Better autocompletion and validation for these fields. As a follow-up, we can add enum values for non-Go SDK enums as well (created internal ticket to track). 4. Use "packageName.structName" as a key to read JSON schemas from the OpenAPI spec for Go SDK structs. Before, we would use an unrolled presentation of the JSON schema (stored in `bundle_descriptions.json`), which was complex to parse and include in the final JSON schema output. This also means loading values from the OpenAPI spec for `target` schema works automatically and no longer needs custom code. 5. Support recursive types (eg: `for_each_task`). With us now using $refs everywhere it's trivial to support. 6. Using complex variables would be invalid according to the schema generated before this PR. Now that bug is fixed. In the future adding more custom rules will be easier as well due to the single level nature of the JSON schema. Since this is a complete change of approach in how we generate the JSON schema, there are a few (very minor) regressions worth calling out. 1. We'll lose a few custom descriptions for non Go SDK structs that were a part of `bundle_descriptions.json`. Support for those can be added in the future as a followup. 2. Since now the final JSON schema is a static artefact, we lose some lead time for the signal that JSON schema integration tests are failing. It's okay though since we have a lot of coverage via the existing unit tests. ## Tests Unit tests. End to end tests are being added in this PR: https://github.com/databricks/cli/pull/1726 Previous unit tests were all deleted because they were bloated. Effort was made to make the new unit tests provide (almost) equivalent coverage.	2024-09-10 13:55:18 +00:00
shreyas-goenka	3bc68e9dd2	Clarify file format required for the `config-file` flag in `bundle init` (#1651 )	2024-08-05 12:24:22 +00:00
shreyas-goenka	89c0af5bdc	Add resource for UC schemas to DABs (#1413 ) ## Changes This PR adds support for UC Schemas to DABs. This allows users to define schemas for tables and other assets their pipelines/workflows create as part of the DAB, thus managing the life-cycle in the DAB. The first version has a couple of intentional limitations: 1. The owner of the schema will be the deployment user. Changing the owner of the schema is not allowed (yet). `run_as` will not be restricted for DABs containing UC schemas. Let's limit the scope of run_as to the compute identity used instead of ownership of data assets like UC schemas. 2. API fields that are present in the update API but not the create API. For example: enabling predictive optimization is not supported in the create schema API and thus is not available in DABs at the moment. ## Tests Manually and integration test. Manually verified the following work: 1. Development mode adds a "dev_" prefix. 2. Modified status is correctly computed in the `bundle summary` command. 3. Grants work as expected, for assigning privileges. 4. Variable interpolation works for the schema ID.	2024-07-31 12:16:28 +00:00
Gleb Kanterov	af975ca64b	Print diagnostics in 'bundle deploy' (#1579 ) ## Changes Print diagnostics in 'bundle deploy' similar to 'bundle validate'. This way if a bundle has any errors or warnings, they are going to be easy to notice. NB: due to how we render errors, there is one extra trailing new line in output, preserved in examples below ## Example: No errors or warnings ``` % databricks bundle deploy Building default... Deploying resources... Updating deployment state... Deployment complete! ``` ## Example: Error on load ``` % databricks bundle deploy Error: Databricks CLI version constraint not satisfied. Required: >= 1337.0.0, current: 0.0.0-dev ``` ## Example: Warning on load ``` % databricks bundle deploy Building default... Deploying resources... Updating deployment state... Deployment complete! Warning: unknown field: foo in databricks.yml:6:1 ``` ## Example: Error + warning on load ``` % databricks bundle deploy Warning: unknown field: foo in databricks.yml:6:1 Error: something went wrong ``` ## Example: Warning on load + error in init ``` % databricks bundle deploy Warning: unknown field: foo in databricks.yml:6:1 Error: Failed to xxx in yyy.yml Detailed explanation in multiple lines ``` ## Tests Tested manually	2024-07-10 11:14:57 +00:00
shreyas-goenka	1da04a4318	Remove schema override for variable default value (#1536 ) ## Changes This PR: 1. Removes the custom added in https://github.com/databricks/cli/pull/1396/files for `variables..default`. It's no longer needed because with complex variables (https://github.com/databricks/cli/pull/1467) `default` has a type of any. 2. Retains, and extends the override on `targets..variables.*`. Target override values can now be complex objects, not just primitive values. ## Tests Manually Before: Only primitive types were allowed. <img width="436" alt="Screenshot 2024-06-27 at 3 58 34 PM" src="https://github.com/databricks/cli/assets/88374338/de55be5c-9236-4ccc-b529-97ce7201b8e8"> After: An empty JSON schema is generated. All YAML values are acceptable. <img width="453" alt="Screenshot 2024-06-27 at 3 57 15 PM" src="https://github.com/databricks/cli/assets/88374338/7534b770-563f-4efc-b9c6-0af526dcc705">	2024-07-10 06:57:27 +00:00
Gleb Kanterov	d30c4c730d	Add new template (#1578 ) ## Changes Add a new hidden experimental template ## Tests Tested manually	2024-07-08 13:32:56 +00:00
Gleb Kanterov	e8b76a7f13	Improve `bundle validate` output (#1532 ) ## Changes This combination of changes allows pretty-printing errors happening during the "load" and "init" phases, including their locations. Move to render code into a separate module dedicated to rendering `diag.Diagnostics` in a human-readable format. This will be used for the `bundle deploy` command. Preserve the "bundle" value if an error occurs in mutators. Rewrite the go templates to handle the case when the bundle isn't yet loaded if an error occurs during loading, that is possible now. Improve rendering for errors and warnings: - don't render empty locations - render "details" for errors if they exist Add `root.ErrAlreadyPrinted` indicating that the error was already printed, and the CLI entry point shouldn't print it again. ## Tests Add tests for output, that are especially handy to detect extra newlines	2024-07-01 09:01:10 +00:00
shreyas-goenka	553fdd1e81	Serialize dynamic value for `bundle validate` output (#1499 ) ## Changes Using dynamic values allows us to retain references like `${resources.jobs...}` even when the type of field is not integer, eg: `run_job_task`, or in general values that do not map to the Go types for a field. ## Tests Integration test	2024-06-18 15:04:20 +00:00
shreyas-goenka	44e3928d6a	Avoid multiple file tree traversals on bundle deploy (#1493 ) ## Changes To run bundle deploy from DBR we use an abstraction over the workspace import / export APIs to create a `filer.Filer` and abstract the file system. Walking the file tree in such a filer is expensive and requires multiple API calls. This PR remove the two duplicate file tree walks that happen by caching the result.	2024-06-17 09:48:52 +00:00
Lennart Kats (databricks)	aa36aee159	Make dbt-sql and default-sql templates public (#1463 ) ## Changes This makes the dbt-sql and default-sql templates public. These templates were previously not listed and marked "experimental" since structured streaming tables were still in gated preview and would result in weird error messages when a workspace wasn't enabled for the preview. This PR also incorporates some of the feedback and learnings for these templates so far.	2024-06-04 08:57:13 +00:00
Pieter Noordhuis	db84a707cd	Fix bundle documentation URL (#1399 ) Closes #1395.	2024-04-25 11:25:26 +00:00
shreyas-goenka	d949f2b4f2	Fix bundle schema for variables (#1396 ) ## Changes This PR fixes the variable schema to: 1. Allow non-string values in the "default" value of a variable. 2. Allow non-string overrides in a target for a variable. ## Tests Manually. There are no longer squiggly lines. Before: <img width="329" alt="Screenshot 2024-04-24 at 3 26 43 PM" src="https://github.com/databricks/cli/assets/88374338/43be02c2-80a4-4f80-bd79-0f3e1e93ee17"> After: <img width="361" alt="Screenshot 2024-04-24 at 3 26 10 PM" src="https://github.com/databricks/cli/assets/88374338/2c1fb892-a2a2-478b-8d2e-9bda6d844b54">	2024-04-25 11:23:50 +00:00
Pieter Noordhuis	3108883a8f	Processing and completion of positional args to bundle run (#1120 ) ## Changes With this change, both job parameters and task parameters can be specified as positional arguments to bundle run. How the positional arguments are interpreted depends on the configuration of the job. ### Examples: For a job that has job parameters configured a user can specify: ``` databricks bundle run my_job -- --param1=value1 --param2=value2 ``` And the run is kicked off with job parameters set to: ```json { "param1": "value1", "param2": "value2" } ``` Similarly, for a job that doesn't use job parameters and only has `notebook_task` tasks, a user can specify: ``` databricks bundle run my_notebook_job -- --param1=value1 --param2=value2 ``` And the run is kicked off with task level `notebook_params` configured as: ```json { "param1": "value1", "param2": "value2" } ``` For a job that doesn't doesn't use job parameters and only has either `spark_python_task` or `python_wheel_task` tasks, a user can specify: ``` databricks bundle run my_python_file_job -- --flag=value other arguments ``` And the run is kicked off with task level `python_params` configured as: ```json [ "--flag=value", "other", "arguments" ] ``` The same is applied to jobs with only `spark_jar_task` or `spark_submit_task` tasks. ## Tests Unit tests. Tested the completions manually.	2024-04-22 11:50:13 +00:00
shreyas-goenka	331313ea5f	Print host in `bundle validate` when passed via profile or environment variables (#1378 ) ## Changes Fixes to get host from the workspace client rather than only printing the host when it's configured in the bundle config. ## Tests Manually. When a profile was specified for auth. Before: ``` ➜ bundle-playground git:(master) ✗ cli bundle validate Name: bundle-playground Target: default Workspace: Host: User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default ``` After: ``` ➜ bundle-playground git:(master) ✗ cli bundle validate Name: bundle-playground Target: default Workspace: Host: https://e2-dogfood.staging.cloud.databricks.com User: shreyas.goenka@databricks.com Path: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default ```	2024-04-19 11:43:50 +00:00
Andrew Nester	27f51c760f	Added validate mutator to surface additional bundle warnings (#1352 ) ## Changes All these validators will return warnings as part of `bundle validate` run Added 2 mutators: 1. To check that if tasks use job_cluster_key it is actually defined 2. To check if there are any files to sync as part of deployment Also added `bundle.Parallel` to run them in parallel To make sure mutators under bundle.Parallel do not mutate config, introduced new `ReadOnlyMutator`, `ReadOnlyBundle` and `ReadOnlyConfig`. Example ``` databricks bundle validate -p deco-staging Warning: unknown field: new_cluster at resources.jobs.my_job in bundle.yml:24:7 Warning: job_cluster_key high_cpu_workload_job_cluster is not defined at resources.jobs.my_job.tasks[0].job_cluster_key in bundle.yml:35:28 Warning: There are no files to sync, please check your your .gitignore and sync.exclude configuration at sync.exclude in bundle.yml:18:5 Name: test Target: default Workspace: Host: https://acme.databricks.com User: andrew.nester@databricks.com Path: /Users/andrew.nester@databricks.com/.bundle/test/default Found 3 warnings ``` ## Tests Added unit tests	2024-04-18 15:13:16 +00:00
Pieter Noordhuis	04cbc7171e	Make bundle validation print text output by default (#1335 ) ## Changes It now shows human-readable warnings and validation status. ## Tests * Manual tests against many examples. * Errors still return immediately.	2024-04-03 15:33:43 +00:00
dependabot[bot]	f28a9d7107	Bump github.com/databricks/databricks-sdk-go from 0.36.0 to 0.37.0 (#1326 ) [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/databricks/databricks-sdk-go&package-manager=go_modules&previous-version=0.36.0&new-version=0.37.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Nester <andrew.nester@databricks.com>	2024-04-03 10:39:53 +00:00
Ilia Babanov	079c416f8d	Add `bundle debug terraform` command (#1294 ) - Add `bundle debug terraform` command. It prints versions of the Terraform and the Databricks Terraform provider. In the text mode it also explains how to setup the CLI in environments with restricted internet access. - Use `DATABRICKS_TF_EXEC_PATH` env var to point Databricks CLI to the Terraform binary. The CLI only uses it if `DATABRICKS_TF_VERSION` matches the currently used terraform version. - Use `DATABRICKS_TF_CLI_CONFIG_FILE` env var to point Terraform CLI config that points to the filesystem mirror for the Databricks provider. The CLI only uses it if `DATABRICKS_TF_PROVIDER_VERSION` matches the currently used provider version. Relevant PR on the VSCode extension side: https://github.com/databricks/databricks-vscode/pull/1147 Example output of the `databricks bundle debug terraform`: ``` Terraform version: 1.5.5 Terraform URL: https://releases.hashicorp.com/terraform/1.5.5 Databricks Terraform Provider version: 1.38.0 Databricks Terraform Provider URL: https://github.com/databricks/terraform-provider-databricks/releases/tag/v1.38.0 Databricks CLI downloads its Terraform dependencies automatically. If you run the CLI in an air-gapped environment, you can download the dependencies manually and set these environment variables: DATABRICKS_TF_VERSION=1.5.5 DATABRICKS_TF_EXEC_PATH=/path/to/terraform/binary DATABRICKS_TF_PROVIDER_VERSION=1.38.0 DATABRICKS_TF_CLI_CONFIG_FILE=/path/to/terraform/cli/config.tfrc Here is an example *.tfrc configuration file: disable_checkpoint = true provider_installation { filesystem_mirror { path = "/path/to/a/folder/with/databricks/terraform/provider" } } The filesystem mirror path should point to the folder with the Databricks Terraform Provider. The folder should have this structure: /registry.terraform.io/databricks/databricks/terraform-provider-databricks_1.38.0_ARCH.zip For more information about filesystem mirrors, see the Terraform documentation: https://developer.hashicorp.com/terraform/cli/config/config-file#filesystem_mirror ``` --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-04-02 12:56:27 +00:00

1 2 3

137 Commits