databricks-cli

Commit Graph

Author	SHA1	Message	Date
Lennart Kats (databricks)	875c9d2db1	Tune output of bundle deploy command (#1047 ) ## Changes Update the output of the `deploy` command to be more concise and consistent: ``` $ databricks bundle deploy Building my_project... Uploading my_project-0.0.1+20231207.205106-py3-none-any.whl... Uploading bundle files to /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files... Deploying resources... Updating deployment state... Deployment complete! ``` This does away with the intermediate success messages, makes consistent use of `...`, and only prints the success message at the very end after everything is completed. Below is the original output for comparison: ``` $ databricks bundle deploy Detecting Python wheel project... Found Python wheel project at /tmp/output/my_project Building my_project... Build succeeded Uploading my_project-0.0.1+20231207.205134-py3-none-any.whl... Upload succeeded Starting upload of bundle files Uploaded bundle files at /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files! Starting resource deployment Resource deployment completed! ```	2023-12-21 08:00:37 +00:00
shreyas-goenka	2d93f62f21	Set metadata fields required to enable break-glass UI for jobs (#880 ) ## Changes This PR sets the following fields for all jobs that are deployed from a DAB 1. `deployment`: This provides the platform with the path to a file to read the metadata from. 2. `edit_mode`: This tells the platform to display the break-glass UI for jobs deployed from a DAB. Setting this is required to re-lock the UI after a user clicks "disconnect from source". 3. `format = MULTI_TASK`. This makes the Terraform provider always use jobs API 2.1 for creating/updating the job. Required because `deployment` and `edit_mode` are only available in API 2.1. ## Tests Unit test and manually. Manually verified that deployments trigger the break glass UI. Manually verified there is no Terraform drift when all three fields are set. --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2023-12-19 07:38:52 +00:00
shreyas-goenka	677926b78b	Fix panic when bundle auth resolution fails (#1002 ) ## Changes CLI would panic if an invalid bundle auth is setup when running CLI commands. This PR removes the panic and shows the error message directly instead. ## Tests The CWD is a bundle with: ``` workspace: profile: DEFAULT ``` Before: ``` shreyas.goenka@THW32HFW6T bundle-playground % cli clusters list panic: resolve: /Users/shreyas.goenka/.databrickscfg has no DEFAULT profile configured. Config: profile=DEFAULT goroutine 1 [running]: ``` After: ``` shreyas.goenka@THW32HFW6T bundle-playground % cli clusters list Error: cannot resolve bundle auth configuration: resolve: /Users/shreyas.goenka/.databrickscfg has no DEFAULT profile configured. Config: profile=DEFAULT ``` ``` shreyas.goenka@THW32HFW6T bundle-playground % DATABRICKS_CONFIG_FILE=/dev/null cli bundle deploy Error: cannot resolve bundle auth configuration: resolve: /dev/null has no DEFAULT profile configured. Config: profile=DEFAULT, config_file=/dev/null. Env: DATABRICKS_CONFIG_FILE ```	2023-11-30 14:28:01 +00:00
Andrew Nester	f3db42e622	Added support for top-level permissions (#928 ) ## Changes Now it's possible to define top level `permissions` section in bundle configuration and permissions defined there will be applied to all resources defined in the bundle. Supported top-level permission levels: CAN_MANAGE, CAN_VIEW, CAN_RUN. Permissions are applied to: Jobs, DLT Pipelines, ML Models, ML Experiments and Model Service Endpoints ``` bundle: name: permissions workspace: host: *** permissions: - level: CAN_VIEW group_name: test-group - level: CAN_MANAGE user_name: user@company.com - level: CAN_RUN service_principal_name: 123456-abcdef ``` ## Tests Added corresponding unit tests + ran `bundle validate` and `bundle deploy` manually	2023-11-13 11:29:40 +00:00
shreyas-goenka	5a8cd0c5bc	Persist deployment metadata in WSFS (#845 ) ## Changes This PR introduces a metadata struct that stores a subset of bundle configuration that we wish to expose to other Databricks services that wish to integrate with bundles. This metadata file is uploaded to a file `${bundle.workspace.state_path}/metadata.json` in the WSFS destination of the bundle deployment. Documentation for emitted metadata fields: * `version`: Version for the metadata file schema * `config.bundle.git.branch`: Name of the git branch the bundle was deployed from. * `config.bundle.git.origin_url`: URL for git remote "origin" * `config.bundle.git.bundle_root_path`: Relative path of the bundle root from the root of the git repository. Is set to "." if they are the same. * `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this bundle was deployed from. Note, the deployment might not exactly match this commit version if there are changes that have not been committed to git at deploy time, * `file_path`: Path in workspace where we sync bundle files to. * `resources.jobs.[job-ref].id`: Id of the job * `resources.jobs.[job-ref].relative_path`: Relative path of the yaml config file from the bundle root where this job was defined. Example metadata object when bundle root and git root are the same: ```json { "version": 1, "config": { "bundle": { "lock": {}, "git": { "branch": "master", "origin_url": "www.host.com", "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0", "bundle_root_path": "." } }, "workspace": { "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files" }, "resources": { "jobs": { "bar": { "id": "245921165354846", "relative_path": "databricks.yml" } } }, "sync": {} } } ``` Example metadata when the git root is one level above the bundle repo: ```json { "version": 1, "config": { "bundle": { "lock": {}, "git": { "branch": "dev-branch", "origin_url": "www.my-repo.com", "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b", "bundle_root_path": "pipeline-progress" } }, "workspace": { "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files" }, "resources": { "jobs": { "bar": { "id": "245921165354846", "relative_path": "databricks.yml" } } }, "sync": {} } } ``` This unblocks integration to the jobs break glass UI for bundles. ## Tests Unit tests and integration tests.	2023-10-27 12:55:43 +00:00
Andrew Nester	19e00d2d47	Upload terraform state even if apply fails (#923 ) ## Changes Upload terraform state even if apply fails Fixes #893 ## Tests Manually running `databricks bundle deploy` with incorrect permissions in bundle config and observe that it gets uploaded correctly	2023-10-26 14:38:01 +00:00
Andrew Nester	aa54a8665a	Added support for glob patterns in pipeline libraries section (#833 ) ## Changes Now it's possible to specify glob pattern in pipeline libraries section and DAB will add all matched files as libraries ``` pipelines: dummy: name: " DLT with Python files" target: "dlt_python_files" libraries: - file: path: ./*.py ``` ## Tests Added unit test	2023-10-04 13:23:13 +00:00
Andrew Nester	3ee89c41da	Added a warning when Python wheel wrapper needs to be used (#807 ) ## Changes Added a warning when Python wheel wrapper needs to be used ## Tests Added unit tests + manual run with different bundle configurations	2023-09-27 08:26:59 +00:00
Andrew Nester	953dcb4972	Added support for experimental scripts section (#632 ) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: * experimental: scripts: prebuild: \| echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: \| echo 'Checking go version...' go version postdeploy: \| echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```	2023-09-14 10:14:13 +00:00
Andrew Nester	12368e3382	Added transformation mutator for Python wheel task for them to work on DBR <13.1 (#635 ) ## Changes *Note: this PR relies on sync.include functionality from here: https://github.com/databricks/cli/pull/671* Added transformation mutator for Python wheel task for them to work on DBR <13.1 Using wheels upload to Workspace file system as cluster libraries is not supported in DBR < 13.1 In order to make Python wheel work correctly on DBR < 13.1 we do the following: 1. Build and upload python wheel as usual 2. Transform python wheel task into special notebook task which does the following a. Installs all necessary wheels with %pip magic b. Executes defined entry point with all provided parameters 3. Upload this notebook file to workspace file system 4. Deploy transformed job task This is also beneficial for executing on existing clusters because this notebook always reinstall wheels so if there are any changes to the wheel package, they are correctly picked up ## Tests bundle.yml ```yaml bundle: name: wheel-task workspace: host: ** resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "" python_wheel_task: package_name: "my_test_code" entry_point: "run" parameters: ["first argument","first value","second argument","second value"] libraries: - whl: ./dist/.whl ``` Output ``` andrew.nester@HFW9Y94129 wheel % databricks bundle run test_job Run URL: *** 2023-08-03 15:58:04 "[default] My Wheel Job" TERMINATED SUCCESS Output: ======= Task TestTask: Hello from my func Got arguments v1: ['python', 'first argument', 'first value', 'second argument', 'second value'] ```	2023-08-30 12:21:39 +00:00
Andrew Nester	4ee926b885	Added run_as section for bundle configuration (#692 ) ## Changes Added run_as section for bundle configuration. This section allows to define an user name or service principal which will be applied as an execution identity for jobs and DLT pipelines. In the case of DLT, identity defined in `run_as` will be assigned `IS_OWNER` permission on this pipeline. ## Tests Added unit tests for configuration. Also ran deploy for the following bundle configuration ``` bundle: name: "run_as" run_as: # service_principal_name: "f7263fcc-56d0-4981-8baf-c2a45296690b" user_name: "lennart.kats@databricks.com" resources: pipelines: andrew_pipeline: name: "Andrew Nester pipeline" libraries: - notebook: path: ./test.py jobs: job_one: name: Job One tasks: - task_key: "task" new_cluster: num_workers: 1 spark_version: 13.2.x-snapshot-scala2.12 node_type_id: i3.xlarge runtime_engine: PHOTON notebook_task: notebook_path: "./test.py" ```	2023-08-23 16:47:07 +00:00
Andrew Nester	56dcd3f0a7	Renamed `environments` to `targets` in bundle configuration (#670 ) ## Changes Renamed Environments to Targets in bundle.yml. The change is backward-compatible and customers can continue to use `environments` in the time being. ## Tests Added tests which checks that both `environments` and `targets` sections in bundle.yml works correctly	2023-08-17 15:22:32 +00:00
Lennart Kats (databricks)	433f401c83	Add validation for Git settings in bundles (#578 ) ## Changes This checks whether the Git settings are consistent with the actual Git state of a source directory. (This PR adds to https://github.com/databricks/cli/pull/577.) Previously, we would silently let users configure their Git branch to e.g. `main` and deploy with that metadata even if they were actually on a different branch. With these changes, the following config would result in an error when deployed from any other branch than `main`: ``` bundle: name: example workspace: git: branch: main environments: ... ``` > not on the right Git branch: > expected according to configuration: main > actual: my-feature-branch It's not very useful to set the same branch for all environments, though. For development, it's better to just let the CLI auto-detect the right branch. Therefore, it's now possible to set the branch just for a single environment: ``` bundle: name: example 2 environments: development: default: true production: # production can only be deployed from the 'main' branch git: branch: main ``` Adding to that, the `mode: production` option actually checks that users explicitly set the Git branch as seen above. Setting that branch helps avoid mistakes, where someone accidentally deploys to production from the wrong branch. (I could see us offering an escape hatch for that in the future.) # Testing Manual testing to validate the experience and error messages. Automated unit tests. --------- Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com>	2023-07-30 12:44:33 +00:00
Andrew Nester	cfff140815	Auto detect Python wheel packages and infer build command (#603 )	2023-07-26 10:07:26 +00:00
Andrew Nester	9a88fa602d	Added support for artifacts building for bundles (#583 ) ## Changes Added support for artifacts building for bundles. Now it allows to specify `artifacts` block in bundle.yml and define a resource (at the moment Python wheel) to be build and uploaded during `bundle deploy` Built artifact will be automatically attached to corresponding job task or pipeline where it's used as a library Follow-ups: 1. If artifact is used in job or pipeline, but not found in the config, try to infer and build it anyway 2. If build command is not provided for Python wheel artifact, infer it	2023-07-25 13:35:08 +02:00
Lennart Kats (databricks)	57e75d3e22	Add development runs (#522 ) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: debug`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is`run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ```	2023-07-12 08:51:54 +02:00
stikkireddy	533234f148	Fix: bundle destroy fails when bundle.tf.json file is deleted (#519 ) ## Changes Adds the following steps to the destroy phase: 1. interpolate 2. write Resolves #518 ## Tests Tested manually due there not being an examples for tests to use.	2023-07-05 21:58:06 +02:00
shreyas-goenka	5d036ab6b8	Fix locker unlock for destroy (#492 ) ## Changes Adds ability for allowing unlock to succeed even if the deploy file is missing. ## Tests Using integration tests and manually	2023-06-19 15:57:25 +02:00
Andrew Nester	6141476ca2	Added support for bundle.Seq, simplified Mutator.Apply interface (#403 ) ## Changes Added support for `bundle.Seq`, simplified `Mutator.Apply` interface by removing list of mutators from return values/ ## Tests 1. Ran `cli bundle deploy` and interrupted it with Cmd + C mid execution so lock is not released 2. Ran `cli bundle deploy` top make sure that CLI is not trying to release lock when it fail to acquire it ``` andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! ^C andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-24 12:10:23.050343 +0200 CEST. Use --force to override ```	2023-05-24 14:45:19 +02:00
Pieter Noordhuis	98ebb78c9b	Rename bricks -> databricks (#389 ) ## Changes Rename all instances of "bricks" to "databricks". ## Tests * Confirmed the goreleaser build works, uses the correct new binary name, and produces the right archives. * Help output is confirmed to be correct. * Output of `git grep -w bricks` is minimal with a couple changes remaining for after the repository rename.	2023-05-16 18:35:39 +02:00
Andrew Nester	180dfc9a40	Added ability for deferred mutator execution (#380 ) ## Changes Added `DeferredMutator` and `bundle.Defer` function which allows to always execute some mutators either in the end of execution chain or after error occurs in the middle of execution chain. Usage as follows: ``` deferredMutator := bundle.Defer([]bundle.Mutator{ lock.Acquire() transform.DoSomething(), //... }, []bundle.Mutator{ lock.Release(), }) ``` In such case `lock.Release()` will always be executed: either when all operations above succeed or when any of them fails ## Tests Before the change ``` andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! Error: terraform not initialized andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-10 16:41:22.902659 +0200 CEST. Use --force to override ``` After the change ``` andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! Error: terraform not initialized andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! Error: terraform not initialized ```	2023-05-16 18:01:50 +02:00
shreyas-goenka	c5e940f664	Add support for variables in bundle config (#359 ) ## Changes This PR now allows you to define variables in the bundle config and set them in three ways 1. command line args 2. process environment variable 3. in the bundle config itself ## Tests manually, unit, and black box tests --------- Co-authored-by: Miles Yucht <miles@databricks.com>	2023-05-15 11:34:05 +02:00
shreyas-goenka	4871f7bc8a	Add bundle destroy command (#300 ) Adds bundle destroy capability to bricks	2023-04-06 12:54:58 +02:00
Pieter Noordhuis	4e4c0658db	Interpolate paths for job tasks that reference files (#306 ) ## Changes This change also swaps the order of mutators such that interpolation happens before path translation. This means that is is possible to use variables (e.g. `${bundle.environment}`) in notebook or file paths. ## Tests New tests pass and verified manually.	2023-04-05 16:02:17 +02:00
Pieter Noordhuis	04e77102c9	Add mutators to pull and push Terraform state (#288 ) ## Changes Pull state before deploying and push state after deploying. Note: the run command was missing mutators to initialize Terraform. This is necessary if the cache directory is removed between running "deploy" and "run" (which is valid now that we synchronize state). ## Tests Manually.	2023-03-30 12:01:09 +02:00
Pieter Noordhuis	edd8630f71	Log mutator phase at info level (#272 )	2023-03-22 17:02:22 +01:00
Pieter Noordhuis	123a5e15e9	Acquire lock prior to deploy (#270 ) Add configuration: ``` bundle: lock: enabled: true force: false ``` The force field can be set by passing the `--force` argument to `bricks bundle deploy`. Doing so means the deployment lock is acquired even if it is currently held. This should only be used in exceptional cases (e.g. a previous deployment has failed to release the lock).	2023-03-22 16:37:26 +01:00
Pieter Noordhuis	a0ed02281d	Execute file synchronization on deploy (#211 ) 1. Perform file synchronization on deploy 2. Update notebook file path translation logic to point to the synchronization target rather than treating the notebook as an artifact and uploading it separately.	2023-02-20 19:42:55 +01:00
Pieter Noordhuis	35c3d9fa4e	Add workspace paths (#179 ) The workspace root path is a base path for bundle storage. If not specified, it defaults to `~/.bundle/name/environment`. This default, or other paths starting with `~` are expanded to the current user's home directory. The configuration also includes fields for the files path, artifacts path, and state path. By default, these are nested under the root path, but can be overridden if needed.	2023-01-26 19:55:38 +01:00
Pieter Noordhuis	4026b2cda2	Mutator to convert paths to local notebooks files into artifacts (#144 ) This lets you write: ```yaml libraries: - notebook: path: ./events.sql ``` Instead of: ```yaml artifacts: events_sql: notebook: path: ./events.sql libraries: - notebook: path: "${artifacts.events_sql.notebook.remote_path}" ```	2022-12-16 14:49:23 +01:00
Pieter Noordhuis	35243db33c	Automatically install Terraform if needed (#141 ) Users can opt out and use the system-installed version with the following configuration: ``` bundle: terraform: exec_path: terraform ``` This will find the binary in $PATH and replace it with the found value. If this is not set, the initialize phase will install Terraform in the bundle's cache directory.	2022-12-15 17:30:33 +01:00
Pieter Noordhuis	c255bd686a	Define deploy command as sequence of build phases (#129 )	2022-12-12 12:49:25 +01:00

32 Commits