databricks-cli/bundle/phases/deploy.go


package phases

import (
	"context"
	"fmt"

	"github.com/databricks/cli/bundle"
	"github.com/databricks/cli/bundle/artifacts"
	"github.com/databricks/cli/bundle/config"
	"github.com/databricks/cli/bundle/config/mutator"
	"github.com/databricks/cli/bundle/deploy"
	"github.com/databricks/cli/bundle/deploy/files"
	"github.com/databricks/cli/bundle/deploy/lock"
	"github.com/databricks/cli/bundle/deploy/metadata"
	"github.com/databricks/cli/bundle/deploy/terraform"
	"github.com/databricks/cli/bundle/libraries"
	"github.com/databricks/cli/bundle/permissions"
	"github.com/databricks/cli/bundle/scripts"
	"github.com/databricks/cli/bundle/trampoline"
	"github.com/databricks/cli/libs/cmdio"
	"github.com/databricks/cli/libs/sync"
	terraformlib "github.com/databricks/cli/libs/terraform"
	tfjson "github.com/hashicorp/terraform-json"
)

// filterDeleteOrRecreateActions returns the delete and recreate actions that
// the given resource changes contain for resources of the given type.
func filterDeleteOrRecreateActions(changes []*tfjson.ResourceChange, resourceType string) []terraformlib.Action {
	res := make([]terraformlib.Action, 0)
	for _, rc := range changes {
		if rc.Type != resourceType {
			continue
		}

		var actionType terraformlib.ActionType
		switch {
		case rc.Change.Actions.Delete():
			actionType = terraformlib.ActionTypeDelete
		case rc.Change.Actions.Replace():
			actionType = terraformlib.ActionTypeRecreate
		default:
			// Skip all other action types.
			continue
		}

		res = append(res, terraformlib.Action{
			Action:       actionType,
			ResourceType: rc.Type,
			ResourceName: rc.Name,
		})
	}

	return res
}

// approvalForDeploy reports whether the deployment may proceed. It logs every
// UC schema, DLT pipeline, and volume that the plan would delete or recreate
// and, unless auto-approve is set, asks the user to confirm.
func approvalForDeploy(ctx context.Context, b *bundle.Bundle) (bool, error) {
	tf := b.Terraform
	if tf == nil {
		return false, fmt.Errorf("terraform not initialized")
	}

	// Read the plan file.
	plan, err := tf.ShowPlanFile(ctx, b.Plan.Path)
	if err != nil {
		return false, err
	}

	schemaActions := filterDeleteOrRecreateActions(plan.ResourceChanges, "databricks_schema")
	dltActions := filterDeleteOrRecreateActions(plan.ResourceChanges, "databricks_pipeline")
	volumeActions := filterDeleteOrRecreateActions(plan.ResourceChanges, "databricks_volume")

	// We don't need to display any prompts in this case.
	if len(schemaActions) == 0 && len(dltActions) == 0 && len(volumeActions) == 0 {
		return true, nil
	}

	// One or more UC schema resources will be deleted or recreated.
	if len(schemaActions) != 0 {
		cmdio.LogString(ctx, "The following UC schemas will be deleted or recreated. Any underlying data may be lost:")
		for _, action := range schemaActions {
			cmdio.Log(ctx, action)
		}
	}

	// One or more DLT pipelines will be deleted or recreated.
	if len(dltActions) != 0 {
		msg := `
This action will result in the deletion or recreation of the following DLT Pipelines along with the
Streaming Tables (STs) and Materialized Views (MVs) managed by them. Recreating the Pipelines will
restore the defined STs and MVs through full refresh. Note that recreation is necessary when pipeline
properties such as the 'catalog' or 'storage' are changed:`
		cmdio.LogString(ctx, msg)
		for _, action := range dltActions {
			cmdio.Log(ctx, action)
		}
	}

	// One or more volumes will be deleted or recreated.
	if len(volumeActions) != 0 {
		msg := `
This action will result in the deletion or recreation of the following volumes.
For managed volumes, the files stored in the volume are also deleted from your
cloud tenant within 30 days. For external volumes, the metadata about the volume
is removed from the catalog, but the underlying files are not deleted:`
		cmdio.LogString(ctx, msg)
		for _, action := range volumeActions {
			cmdio.Log(ctx, action)
		}
	}

	if b.AutoApprove {
		return true, nil
	}

	if !cmdio.IsPromptSupported(ctx) {
		return false, fmt.Errorf("the deployment requires destructive actions, but current console does not support prompting. Please specify --auto-approve if you would like to skip prompts and proceed")
	}

	cmdio.LogString(ctx, "")
	approved, err := cmdio.AskYesOrNo(ctx, "Would you like to proceed?")
	if err != nil {
		return false, err
	}

	return approved, nil
}

// Deploy returns the deploy phase, which deploys artifacts and resources.
func Deploy(outputHandler sync.OutputHandler) bundle.Mutator {
	// Core mutators that CRUD resources and modify deployment state. These
	// mutators need informed consent if they are potentially destructive.
	deployCore := bundle.Defer(
		bundle.Seq(
			bundle.LogString("Deploying resources..."),
			terraform.Apply(),
		),
		bundle.Seq(
			terraform.StatePush(),
			terraform.Load(),
			metadata.Compute(),
			metadata.Upload(),
			bundle.LogString("Deployment complete!"),
		),
	)

	deployMutator := bundle.Seq(
		scripts.Execute(config.ScriptPreDeploy),
		lock.Acquire(),
		bundle.Defer(
			bundle.Seq(
				terraform.StatePull(),
				terraform.CheckDashboardsModifiedRemotely(),
				deploy.StatePull(),
				mutator.ValidateGitDetails(),
				artifacts.CleanUp(),
				libraries.ExpandGlobReferences(),
				libraries.Upload(),
				trampoline.TransformWheelTask(),
				files.Upload(outputHandler),
				deploy.StateUpdate(),
				deploy.StatePush(),
				permissions.ApplyWorkspaceRootPermissions(),
				terraform.Interpolate(),
				terraform.Write(),
				terraform.CheckRunningResource(),
				terraform.Plan(terraform.PlanGoal("deploy")),
				bundle.If(
					approvalForDeploy,
					deployCore,
					bundle.LogString("Deployment cancelled!"),
				),
			),
			lock.Release(lock.GoalDeploy),
		),
		scripts.Execute(config.ScriptPostDeploy),
	)

	return newPhase(
		"deploy",
		[]bundle.Mutator{deployMutator},
	)
}