## Changes
- Remove bundle.Parallel & bundle.ReadOnlyBundle.
- Add bundle.ApplyParallel as a helper to migrate from bundle.Parallel.
- Keep ReadOnlyMutator as a separate type, but make it a subtype of
Mutator so it works on a regular *Bundle. Keeping it as a separate type
prevents non-readonly mutators from being passed to ApplyParallel.
- validate.Validate becomes a function (was Mutator).
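For illustration, a minimal sketch of what such a helper could look like (the type shapes and error-based signature are assumptions for this sketch, not the actual CLI API):
```
package bundle

import (
	"context"
	"sync"
)

// Bundle and ReadOnlyMutator stand in for the real types; their exact
// shapes here are assumptions.
type Bundle struct{}

type ReadOnlyMutator interface {
	Name() string
	Apply(ctx context.Context, b *Bundle) error
}

// ApplyParallel runs all mutators concurrently against the same bundle
// and returns their errors in input order. Accepting only
// ReadOnlyMutator gives a compile-time guarantee that nothing mutating
// runs in parallel against the shared *Bundle.
func ApplyParallel(ctx context.Context, b *Bundle, mutators ...ReadOnlyMutator) []error {
	errs := make([]error, len(mutators))
	var wg sync.WaitGroup
	for i, m := range mutators {
		wg.Add(1)
		go func(i int, m ReadOnlyMutator) {
			defer wg.Done()
			errs[i] = m.Apply(ctx, b)
		}(i, m)
	}
	wg.Wait()
	return errs
}
```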
## Why
This is a follow-up to #2390, where we removed most of the tools to
construct chains of mutators. The same motivation applies here.
When it comes to read-only bundles, the abstraction is leaky: since the
read-only bundle is a shallow copy, it does not actually guarantee or
enforce read-only access to the bundle. A better approach would be to
run parallel operations on independent, narrowly focused, deep-copied
structs with just enough information to carry out the task (this is not
implemented here, but it is the eventual goal). Now that we can write
regular code in phases and are no longer limited to the mutator
interface, we can switch to that approach.
## Tests
Existing tests.
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
- Instead of constructing chains of mutators and then executing them,
execute them directly.
- Remove functionality related to chain-building: Seq, If, Defer,
newPhase, logString.
- Phases become functions that apply the changes directly rather than
construct mutator chains that will be called later.
- Add a helper ApplySeq to call multiple mutators; use it where
Apply+Seq were used before.
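For reference, a rough sketch of what ApplySeq amounts to (the type shapes and error-based signature are assumptions, not the actual API):
```
package bundle

import "context"

// Bundle and Mutator stand in for the real types; their shapes here are
// assumptions.
type Bundle struct{}

type Mutator interface {
	Apply(ctx context.Context, b *Bundle) error
}

// ApplySeq applies each mutator in order and stops at the first error,
// replacing the old Apply(Seq(...)) pattern.
func ApplySeq(ctx context.Context, b *Bundle, mutators ...Mutator) error {
	for _, m := range mutators {
		if err := m.Apply(ctx, b); err != nil {
			return err
		}
	}
	return nil
}
```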
This is intended to be a refactoring without functional changes, but
there are a few behaviour changes:
- Since defer() is used to call unlock instead of bundle.Defer(),
unlocking will now happen even in the case of panics.
- In --debug, the phase names are still logged once before the start of
the phase, but each entry no longer has 'seq' or the phase name in it.
- The message "Deployment complete!" was printed even if the
terraform.Apply() mutator had an error. It no longer is.
## Motivation
The use of chains was necessary when mutators returned a list of other
mutators instead of calling them directly. That has since been removed,
so the chain machinery no longer has any purpose.
Using direct functions simplifies the logic and makes bugs more apparent
and easier to fix.
Other improvements that this unlocks:
- Simpler stacktraces/debugging (breakpoints).
- Use of functions with a narrowly scoped API: instead of mutators that
receive the full bundle config, we can use focused functions that only
deal with the sections they care about, e.g.
prepareGitSettings(currentGitSection) -> updatedGitSection. This makes
the data flow more apparent.
- Parallel computation across mutators (within a phase): launch
goroutines fetching data from APIs at the beginning, and process the
results once they are ready.
## Tests
Existing tests.
## Changes
- Insert a newline after rendering indented JSON in bundle
validate/summary/run.
- This prevents the "No newline at end of file" message in various
cases, for example when switching between recording the raw output of a
command and output processed by jq (since jq does add a trailing
newline), or when running diff in acceptance tests.
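A minimal sketch of the fix, assuming the JSON is rendered with encoding/json (the surrounding code is illustrative):
```
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

func main() {
	v := map[string]any{"ok": true}
	buf, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	os.Stdout.Write(buf)
	// The fix: terminate the rendered JSON with a newline so shells and
	// diff tools see a complete final line.
	os.Stdout.Write([]byte{'\n'})
}
```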
## Tests
Manually running validate:
```
~/work/dabs_cuj_brickfood % ../cli/cli-main bundle validate -o json | tail -n 2 # without change
Error: root_path must start with '~/' or contain the current username to ensure uniqueness when using 'mode: development'
}
}%
~/work/dabs_cuj_brickfood % ../cli/cli bundle validate -o json | tail -n 2 # with change
Error: root_path must start with '~/' or contain the current username to ensure uniqueness when using 'mode: development'
}
}
~/work/dabs_cuj_brickfood %
```
Via #2316 -- see cleaner output there.
## Changes
This generates bundle YAML configuration for Git-based jobs but won't
download any related files, since they live in the Git repo.
Fixes #1423
## Tests
Added unit test
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
`CreatePipeline` is a more complete structure (a superset of
`PipelineSpec`) which enables support for additional fields such as
`run_as` and `allow_duplicate_names` in DABs configuration. Note: these
fields must also be supported in TF in order to work correctly.
## Tests
Existing tests pass + no fields are removed from JSON schema
## Summary of changes
This PR introduces three new abstractions:
1. `Resolver`: Resolves which reader and writer to use for a template.
2. `Writer`: Writes a template project to disk. Prompts the user if
necessary.
3. `Reader`: Reads a template specification from disk, built into the
CLI or from GitHub.
Introducing these abstractions helps decouple reading a template from
writing it. When I tried adding telemetry for the `bundle init` command,
I noticed that the code in `cmd/init.go` was getting convoluted and hard
to test. A future change could have accidentally logged PII when a user
initialised a custom template.
Hedging against that risk is important here because we use a generic
untyped `map<string, string>` representation in the backend to log
telemetry for the `databricks bundle init` command. Otherwise, we risk
accidentally breaking compliance with our centralization requirements.
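For illustration, one possible shape for these abstractions (the interface definitions below are assumptions; the actual ones in the CLI may differ):
```
package template

import (
	"context"
	"io/fs"
)

// Reader reads a template specification, whether built into the CLI, on
// local disk, or fetched from GitHub.
type Reader interface {
	FS(ctx context.Context) (fs.FS, error)
}

// Writer renders a template to disk, prompting the user for input
// values when needed.
type Writer interface {
	Materialize(ctx context.Context, r Reader) error
}

// Resolver picks the Reader/Writer pair for a requested template, so
// reading stays decoupled from writing.
type Resolver interface {
	Resolve(ctx context.Context, nameOrURL string) (Reader, Writer, error)
}
```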
### Details
After this PR there are two classes of templates that can be
initialized:
1. A `databricks` template: This could be a builtin template or a
template outside the CLI like mlops-stacks, which is still owned and
managed by Databricks. These templates log their telemetry arguments and
template name.
2. A `custom` template: These are templates created and managed by the
end user. For these templates we do not log the template name and args;
instead, a generic placeholder string of "custom" is logged in our
telemetry system.
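A minimal sketch of this logging rule (the type and field names are made up for illustration):
```
package template

// Template carries the metadata relevant to telemetry; this shape is an
// assumption for the sketch.
type Template struct {
	Name              string
	OwnedByDatabricks bool // builtin, or Databricks-managed like mlops-stacks
}

// telemetryName implements the rule described above: only templates
// owned by Databricks log their real name; anything user-created is
// reduced to the placeholder "custom" so no PII reaches telemetry.
func telemetryName(t Template) string {
	if t.OwnedByDatabricks {
		return t.Name
	}
	return "custom"
}
```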
NOTE: The functionality of the `databricks bundle init` command remains
the same after this PR. Only the internal abstractions used are changed.
## Tests
New unit tests. Existing golden and unit tests. Also a fair bit of
manual testing.
## Changes
Add the experimental-jobs-as-code template, which allows defining jobs
in Python instead of YAML through the `databricks-bundles` PyPI package.
## Tests
Manually and acceptance tests.
## Changes
This includes a change to the defaults for the output directory flags
of the "generate" commands. These defaults previously included the
expanded working directory, which can be omitted because it is implied.
## Changes
Now it's possible to configure a new `app` resource in a bundle and
point it to a custom `source_code_path` location where the Databricks
App code is defined.
On `databricks bundle deploy`, DABs will create the app. All subsequent
`databricks bundle deploy` executions will update the existing app if
there are any changes.
On `databricks bundle run <my_app>`, DABs will deploy the app. If the
app is not started yet, it will start the app first.
### Bundle configuration
```
bundle:
  name: apps
variables:
  my_job_id:
    description: "ID of job to run app"
    lookup:
      job: "My Job"
  databricks_name:
    description: "Name for app user"
  additional_flags:
    description: "Additional flags to run command app"
    default: ""
  my_app_config:
    type: complex
    description: "Configuration for my Databricks App"
    default:
      command:
        - flask
        - --app
        - hello
        - run
        - ${var.additional_flags}
      env:
        - name: DATABRICKS_NAME
          value: ${var.databricks_name}
resources:
  apps:
    my_app:
      name: "anester-app" # required and has to be unique
      description: "My App"
      source_code_path: ./app # required and points to location of app code
      config: ${var.my_app_config}
      resources:
        - name: "my-job"
          description: "A job for app to be able to run"
          job:
            id: ${var.my_job_id}
            permission: "CAN_MANAGE_RUN"
      permissions:
        - user_name: "foo@bar.com"
          level: "CAN_VIEW"
        - service_principal_name: "my_sp"
          level: "CAN_MANAGE"
targets:
  dev:
    variables:
      databricks_name: "Andrew (from dev)"
      additional_flags: --debug
  prod:
    variables:
      databricks_name: "Andrew (from prod)"
```
### Execution
1. `databricks bundle deploy -t dev`
2. `databricks bundle run my_app -t dev`
**If app is started**
```
✓ Getting the status of the app my-app
✓ App is in RUNNING state
✓ Preparing source code for new app deployment.
✓ Deployment is pending
✓ Starting app with command: flask --app hello run --debug
✓ App started successfully
You can access the app at <app-url>
```
**If app is not started**
```
✓ Getting the status of the app my-app
✓ App is in UNAVAILABLE state
✓ Starting the app my-app
✓ App is starting...
....
✓ App is starting...
✓ App is started!
✓ Preparing source code for new app deployment.
✓ Downloading source code from /Workspace/Users/...
✓ Starting app with command: flask --app hello run --debug
✓ App started successfully
You can access the app at <app-url>
```
## Tests
Added unit and config tests + manual test.
```
--- PASS: TestAccDeployBundleWithApp (404.59s)
PASS
coverage: 36.8% of statements in ./...
ok github.com/databricks/cli/internal/bundle 405.035s coverage: 36.8% of statements in ./...
```
## Changes
Previously, diagnostics were not visible in JSON output mode. This
change prints them to stderr.
It also fixes acceptance tests to preprocess all output with
s/execPath/$CLI/, not just output.txt.
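A minimal sketch of the stdout/stderr split (the diagnostic type is an illustrative stand-in, not the CLI's actual type):
```
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// diagnostic is a stand-in for the CLI's diagnostics type.
type diagnostic struct {
	Severity string
	Summary  string
}

func main() {
	result := map[string]any{"ok": true}
	diags := []diagnostic{{Severity: "Warning", Summary: "something to surface"}}

	// The JSON payload stays on stdout; diagnostics go to stderr so they
	// are visible without corrupting the machine-readable output.
	if err := json.NewEncoder(os.Stdout).Encode(result); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	for _, d := range diags {
		fmt.Fprintf(os.Stderr, "%s: %s\n", d.Severity, d.Summary)
	}
}
```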
## Tests
Existing acceptance tests. In one case I've added a non-JSON command to
check that the outputs match.
## Changes
This PR:
1. Incrementally improves the error messages shown to the user when the
volume they are referring to in `workspace.artifact_path` does not
exist.
2. Performs this validation in both `bundle validate` and `bundle
deploy`, compared to just deployments before.
3. Runs "fast" validations on `bundle deploy`, which earlier were only
run on `bundle validate`.
## Tests
Unit tests and manually. Also, existing integration tests provide
coverage (`TestUploadArtifactToVolumeNotYetDeployed`,
`TestUploadArtifactFileToVolumeThatDoesNotExist`)
Examples:
```
.venv➜ bundle-playground git:(master) ✗ cli bundle validate
Error: cannot access volume capital.whatever.my_volume: User does not have READ VOLUME on Volume 'capital.whatever.my_volume'.
  at workspace.artifact_path
  in databricks.yml:7:18
```
and
```
.venv➜ bundle-playground git:(master) ✗ cli bundle validate
Error: volume capital.whatever.foobar does not exist
  at workspace.artifact_path
     resources.volumes.foo
  in databricks.yml:7:18
     databricks.yml:12:7
You are using a volume in your artifact_path that is managed by
this bundle but which has not been deployed yet. Please first deploy
the volume using 'bundle deploy' and then switch over to using it in
the artifact_path.
```
## Changes
- Enable new linter: testifylint.
- Apply fixes with --fix.
- Fix remaining issues (mostly with aider).
There were 2 cases where --fix did the wrong thing - this seems to be a
bug in the linter: https://github.com/Antonboom/testifylint/issues/210
Nonetheless, I kept that check enabled; it seems useful, it just needs a
manual fix after autofix.
## Tests
Existing tests
## Changes
Enable gofumpt and goimports in golangci-lint and apply the autofix.
This makes 'make fmt' redundant; it will be cleaned up in a follow-up
diff.
## Tests
Existing tests.
## Changes
Enable the errcheck linter for the whole codebase.
Fix the remaining complaints:
- If we can propagate the error to the caller, do that.
- If we are writing to stdout, continue ignoring errors (to avoid
crashing in the "cli | head" case).
- Add an exception for non-critical cobra APIs such as
MarkHidden/MarkDeprecated/RegisterFlagCompletionFunc. This keeps the
current code and behaviour; whether to change this is to be decided
later.
- Continue ignoring errors where that is the desired behaviour (e.g.
git.loadConfig).
- Continue ignoring errors where panicking seems riskier than ignoring
the error.
- Annotate cases in libs/dyn with //nolint:errcheck - to be addressed
later.
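A small sketch of the first two strategies (function names are made up for illustration):
```
package main

import (
	"fmt"
	"os"
)

// Propagate the error when the caller can act on it.
func writeConfig(path string, data []byte) error {
	return os.WriteFile(path, data, 0o644)
}

func main() {
	if err := writeConfig("config.json", []byte("{}")); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Deliberately ignore errors on stdout writes: failing here would
	// crash pipelines like "cli | head" when the pipe closes early.
	_, _ = fmt.Fprintln(os.Stdout, "done")
}
```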
Note, this PR is not meant to settle on the best strategy for each
case, but to be a relatively safe change that enables the errcheck
linter.
## Tests
Existing tests.
## Changes
Fix all errcheck-found issues in tests and test helpers. Mostly this
was done by adding require.NoError(t, err), and sometimes panic() where
the t object is not available.
The initial change was obtained with aider+claude, then manually
reviewed and cleaned up.
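A representative shape of the fix, using a made-up test:
```
package example_test

import (
	"os"
	"testing"

	"github.com/stretchr/testify/require"
)

// TestWriteTempFile shows the typical pattern: every previously ignored
// error now fails the test via require.NoError.
func TestWriteTempFile(t *testing.T) {
	f, err := os.CreateTemp(t.TempDir(), "out-*.txt")
	require.NoError(t, err)

	_, err = f.WriteString("hello")
	require.NoError(t, err)
	require.NoError(t, f.Close())
}
```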
## Tests
Existing tests.
## Changes
Update filenames used by bundle generate to use '.resource-type.yml',
similar to [Add sub-extension to resource files in built-in templates by
shreyas-goenka · Pull Request #1777 ·
databricks/cli](https://github.com/databricks/cli/pull/1777).
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
When running the CLI on Databricks Runtime (DBR), use the
extension-aware filer to write an instantiated template if the instance
path is located in the workspace filesystem.
Notebooks cannot be written through the workspace filesystem's FUSE
mount. As a result, this is the only method for initializing templates
that contain notebooks when running the CLI on DBR and writing to the
workspace filesystem.
Depends on #1910 and #1911.
Supersedes #1744.
## Tests
* Manually confirmed I can initialize a template with notebooks when
running the CLI from the web terminal.
## Changes
While working on the v2 of #1744, I found that:
* Template initialization first copies built-in templates to a temporary
directory before initializing them
* Reading a template's contents goes through a `filer.Filer` but is
hardcoded to a local one
This change updates the interface for reading templates to be `fs.FS`.
This is compatible with the `embed.FS` type for the built-in templates,
so they no longer have to be copied to a temporary directory before
being used.
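A minimal sketch of the embed.FS approach (the directory layout and function name are assumptions):
```
package template

import (
	"embed"
	"io/fs"
)

// Built-in templates are compiled into the binary. embed.FS satisfies
// fs.FS, so they can be read in place rather than copied out first.
//
//go:embed templates
var builtinFS embed.FS

// Builtin returns the file tree of one named built-in template without
// any temporary-directory staging.
func Builtin(name string) (fs.FS, error) {
	return fs.Sub(builtinFS, "templates/"+name)
}
```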
The alternative is to use a `filer.Filer` throughout, but this would
have required even more plumbing, and we don't need to _read_ templates,
including notebooks, from the workspace filesystem (yet?).
As part of making `template.Materialize` take an `fs.FS` argument, the
logic to match a given argument to a particular built-in template in the
`init` command has moved to sit next to its implementation.
## Tests
Existing tests pass.
## Changes
The commit where resource lookup was factored out into a separate
package (#1858) didn't take into account the use of `args` further down
in the code.
This change fixes that oversight by returning the tail arguments when
determining which resource to run. The later call no longer has to index
the `args` slice.
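An illustrative sketch of the "return the tail" shape (not the CLI's actual signature):
```
package run

import "fmt"

// findResource resolves the resource key from the argument list and also
// returns the remaining (tail) arguments, so the caller never has to
// index into args directly.
func findResource(known map[string]bool, args []string) (key string, tail []string, err error) {
	if len(args) == 0 {
		return "", nil, fmt.Errorf("expected a resource key")
	}
	if !known[args[0]] {
		return "", nil, fmt.Errorf("no such resource: %s", args[0])
	}
	return args[0], args[1:], nil
}
```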
## Tests
Manually confirmed that the command works when being prompted for the
resource to run.
## Changes
This change adds the `databricks bundle generate dashboard` command.
The command requires one of three flags:
* `--existing-id` to generate configuration for an existing dashboard by
its ID.
* `--existing-path` to generate configuration for an existing dashboard
by its path in the workspace file system.
* `--resource` to generate the `.lvdash.json` dashboard file for a
dashboard that's already defined in the bundle. This option does not
impact the YAML configuration.
A typical workflow could look like this:
1. Use the command with `--existing-id` or `--existing-path` as a
starting point
2. Run `bundle deploy` to deploy a copy of the dashboard
3. Run `bundle open` to open this copy in your browser
4. Navigate to the draft mode and make modifications
5. Run `bundle generate dashboard` with `--resource` to update the local
`.lvdash.json` file with the remote modifications
## Tests
* Unit tests.
* Manual walkthrough as documented in the [Dashboard for NYC Taxi Trip
Analysis
example](https://github.com/databricks/bundle-examples/tree/main/knowledge_base/dashboard_nyc_taxi).