databricks-cli/bundle/phases/deploy.go

184 lines
5.2 KiB
Go
Raw Normal View History

package phases
import (
"context"
"fmt"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/artifacts"
Added support for experimental scripts section (#632) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: *** experimental: scripts: prebuild: | echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: | echo 'Checking go version...' go version postdeploy: | echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/*.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```
2023-09-14 10:14:13 +00:00
"github.com/databricks/cli/bundle/config"
Add validation for Git settings in bundles (#578) ## Changes This checks whether the Git settings are consistent with the actual Git state of a source directory. (This PR adds to https://github.com/databricks/cli/pull/577.) Previously, we would silently let users configure their Git branch to e.g. `main` and deploy with that metadata even if they were actually on a different branch. With these changes, the following config would result in an error when deployed from any other branch than `main`: ``` bundle: name: example workspace: git: branch: main environments: ... ``` > not on the right Git branch: > expected according to configuration: main > actual: my-feature-branch It's not very useful to set the same branch for all environments, though. For development, it's better to just let the CLI auto-detect the right branch. Therefore, it's now possible to set the branch just for a single environment: ``` bundle: name: example 2 environments: development: default: true production: # production can only be deployed from the 'main' branch git: branch: main ``` Adding to that, the `mode: production` option actually checks that users explicitly set the Git branch as seen above. Setting that branch helps avoid mistakes, where someone accidentally deploys to production from the wrong branch. (I could see us offering an escape hatch for that in the future.) # Testing Manual testing to validate the experience and error messages. Automated unit tests. --------- Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com>
2023-07-30 12:44:33 +00:00
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/deploy"
"github.com/databricks/cli/bundle/deploy/files"
"github.com/databricks/cli/bundle/deploy/lock"
Persist deployment metadata in WSFS (#845) ## Changes This PR introduces a metadata struct that stores a subset of bundle configuration that we wish to expose to other Databricks services that wish to integrate with bundles. This metadata file is uploaded to a file `${bundle.workspace.state_path}/metadata.json` in the WSFS destination of the bundle deployment. Documentation for emitted metadata fields: * `version`: Version for the metadata file schema * `config.bundle.git.branch`: Name of the git branch the bundle was deployed from. * `config.bundle.git.origin_url`: URL for git remote "origin" * `config.bundle.git.bundle_root_path`: Relative path of the bundle root from the root of the git repository. Is set to "." if they are the same. * `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this bundle was deployed from. Note, the deployment might not exactly match this commit version if there are changes that have not been committed to git at deploy time, * `file_path`: Path in workspace where we sync bundle files to. * `resources.jobs.[job-ref].id`: Id of the job * `resources.jobs.[job-ref].relative_path`: Relative path of the yaml config file from the bundle root where this job was defined. Example metadata object when bundle root and git root are the same: ```json { "version": 1, "config": { "bundle": { "lock": {}, "git": { "branch": "master", "origin_url": "www.host.com", "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0", "bundle_root_path": "." } }, "workspace": { "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files" }, "resources": { "jobs": { "bar": { "id": "245921165354846", "relative_path": "databricks.yml" } } }, "sync": {} } } ``` Example metadata when the git root is one level above the bundle repo: ```json { "version": 1, "config": { "bundle": { "lock": {}, "git": { "branch": "dev-branch", "origin_url": "www.my-repo.com", "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b", "bundle_root_path": "pipeline-progress" } }, "workspace": { "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files" }, "resources": { "jobs": { "bar": { "id": "245921165354846", "relative_path": "databricks.yml" } } }, "sync": {} } } ``` This unblocks integration to the jobs break glass UI for bundles. ## Tests Unit tests and integration tests.
2023-10-27 12:55:43 +00:00
"github.com/databricks/cli/bundle/deploy/metadata"
"github.com/databricks/cli/bundle/deploy/terraform"
"github.com/databricks/cli/bundle/libraries"
"github.com/databricks/cli/bundle/permissions"
Added transformation mutator for Python wheel task for them to work on DBR <13.1 (#635) ## Changes ***Note: this PR relies on sync.include functionality from here: https://github.com/databricks/cli/pull/671*** Added transformation mutator for Python wheel task for them to work on DBR <13.1 Using wheels upload to Workspace file system as cluster libraries is not supported in DBR < 13.1 In order to make Python wheel work correctly on DBR < 13.1 we do the following: 1. Build and upload python wheel as usual 2. Transform python wheel task into special notebook task which does the following a. Installs all necessary wheels with %pip magic b. Executes defined entry point with all provided parameters 3. Upload this notebook file to workspace file system 4. Deploy transformed job task This is also beneficial for executing on existing clusters because this notebook always reinstall wheels so if there are any changes to the wheel package, they are correctly picked up ## Tests bundle.yml ```yaml bundle: name: wheel-task workspace: host: **** resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" parameters: ["first argument","first value","second argument","second value"] libraries: - whl: ./dist/*.whl ``` Output ``` andrew.nester@HFW9Y94129 wheel % databricks bundle run test_job Run URL: *** 2023-08-03 15:58:04 "[default] My Wheel Job" TERMINATED SUCCESS Output: ======= Task TestTask: Hello from my func Got arguments v1: ['python', 'first argument', 'first value', 'second argument', 'second value'] ```
2023-08-30 12:21:39 +00:00
"github.com/databricks/cli/bundle/python"
Added support for experimental scripts section (#632) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: *** experimental: scripts: prebuild: | echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: | echo 'Checking go version...' go version postdeploy: | echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/*.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```
2023-09-14 10:14:13 +00:00
"github.com/databricks/cli/bundle/scripts"
"github.com/databricks/cli/libs/cmdio"
terraformlib "github.com/databricks/cli/libs/terraform"
tfjson "github.com/hashicorp/terraform-json"
)
func parseTerraformActions(changes []*tfjson.ResourceChange, toInclude func(typ string, actions tfjson.Actions) bool) []terraformlib.Action {
res := make([]terraformlib.Action, 0)
for _, rc := range changes {
if !toInclude(rc.Type, rc.Change.Actions) {
continue
}
var actionType terraformlib.ActionType
switch {
case rc.Change.Actions.Delete():
actionType = terraformlib.ActionTypeDelete
case rc.Change.Actions.Replace():
actionType = terraformlib.ActionTypeRecreate
default:
// No use case for other action types yet.
continue
}
res = append(res, terraformlib.Action{
Action: actionType,
ResourceType: rc.Type,
ResourceName: rc.Name,
})
}
return res
}
2024-08-12 14:16:25 +00:00
func approvalForDeploy(ctx context.Context, b *bundle.Bundle) (bool, error) {
tf := b.Terraform
if tf == nil {
return false, fmt.Errorf("terraform not initialized")
}
// read plan file
plan, err := tf.ShowPlanFile(ctx, b.Plan.Path)
if err != nil {
return false, err
}
schemaActions := parseTerraformActions(plan.ResourceChanges, func(typ string, actions tfjson.Actions) bool {
// Filter in only UC schema resources.
if typ != "databricks_schema" {
return false
}
// We only display prompts for destructive actions like deleting or
// recreating a schema.
return actions.Delete() || actions.Replace()
})
dltActions := parseTerraformActions(plan.ResourceChanges, func(typ string, actions tfjson.Actions) bool {
// Filter in only DLT pipeline resources.
if typ != "databricks_pipeline" {
return false
}
// Recreating DLT pipeline leads to metadata loss and for a transient period
// the underling tables will be unavailable.
2024-08-30 09:40:04 +00:00
return actions.Replace() || actions.Delete()
})
// We don't need to display any prompts in this case.
if len(dltActions) == 0 && len(schemaActions) == 0 {
return true, nil
}
// One or more UC schema resources will be deleted or recreated.
if len(schemaActions) != 0 {
cmdio.LogString(ctx, "The following UC schemas will be deleted or recreated. Any underlying data may be lost:")
for _, action := range schemaActions {
cmdio.Log(ctx, action)
}
}
// One or more DLT pipelines is being recreated.
if len(dltActions) != 0 {
2024-08-30 09:55:07 +00:00
msg := `
This action will result in the deletion or recreation of the following DLT Pipelines along with the
Streaming Tables (STs) and Materialized Views (MVs) managed by them. Recreating the Pipelines will
restore the defined STs and MVs through full refresh. Note that recreation is necessary when pipeline
properties such as the 'catalog' or 'storage' are changed:`
cmdio.LogString(ctx, msg)
for _, action := range dltActions {
cmdio.Log(ctx, action)
}
}
if b.AutoApprove {
return true, nil
}
if !cmdio.IsPromptSupported(ctx) {
return false, fmt.Errorf("the deployment requires destructive actions, but current console does not support prompting. Please specify --auto-approve if you would like to skip prompts and proceed")
}
cmdio.LogString(ctx, "")
approved, err := cmdio.AskYesOrNo(ctx, "Would you like to proceed?")
if err != nil {
return false, err
}
return approved, nil
}
// The deploy phase deploys artifacts and resources.
func Deploy() bundle.Mutator {
// Core mutators that CRUD resources and modify deployment state. These
// mutators need informed consent if they are potentially destructive.
deployCore := bundle.Defer(
bundle.Seq(
bundle.LogString("Deploying resources..."),
terraform.Apply(),
),
bundle.Seq(
terraform.StatePush(),
terraform.Load(),
metadata.Compute(),
metadata.Upload(),
bundle.LogString("Deployment complete!"),
),
)
deployMutator := bundle.Seq(
Added support for experimental scripts section (#632) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: *** experimental: scripts: prebuild: | echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: | echo 'Checking go version...' go version postdeploy: | echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/*.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```
2023-09-14 10:14:13 +00:00
scripts.Execute(config.ScriptPreDeploy),
2023-05-16 16:01:50 +00:00
lock.Acquire(),
bundle.Defer(
bundle.Seq(
terraform.StatePull(),
deploy.StatePull(),
Add validation for Git settings in bundles (#578) ## Changes This checks whether the Git settings are consistent with the actual Git state of a source directory. (This PR adds to https://github.com/databricks/cli/pull/577.) Previously, we would silently let users configure their Git branch to e.g. `main` and deploy with that metadata even if they were actually on a different branch. With these changes, the following config would result in an error when deployed from any other branch than `main`: ``` bundle: name: example workspace: git: branch: main environments: ... ``` > not on the right Git branch: > expected according to configuration: main > actual: my-feature-branch It's not very useful to set the same branch for all environments, though. For development, it's better to just let the CLI auto-detect the right branch. Therefore, it's now possible to set the branch just for a single environment: ``` bundle: name: example 2 environments: development: default: true production: # production can only be deployed from the 'main' branch git: branch: main ``` Adding to that, the `mode: production` option actually checks that users explicitly set the Git branch as seen above. Setting that branch helps avoid mistakes, where someone accidentally deploys to production from the wrong branch. (I could see us offering an escape hatch for that in the future.) # Testing Manual testing to validate the experience and error messages. Automated unit tests. --------- Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com>
2023-07-30 12:44:33 +00:00
mutator.ValidateGitDetails(),
artifacts.CleanUp(),
libraries.ExpandGlobReferences(),
libraries.Upload(),
Added transformation mutator for Python wheel task for them to work on DBR <13.1 (#635) ## Changes ***Note: this PR relies on sync.include functionality from here: https://github.com/databricks/cli/pull/671*** Added transformation mutator for Python wheel task for them to work on DBR <13.1 Using wheels upload to Workspace file system as cluster libraries is not supported in DBR < 13.1 In order to make Python wheel work correctly on DBR < 13.1 we do the following: 1. Build and upload python wheel as usual 2. Transform python wheel task into special notebook task which does the following a. Installs all necessary wheels with %pip magic b. Executes defined entry point with all provided parameters 3. Upload this notebook file to workspace file system 4. Deploy transformed job task This is also beneficial for executing on existing clusters because this notebook always reinstall wheels so if there are any changes to the wheel package, they are correctly picked up ## Tests bundle.yml ```yaml bundle: name: wheel-task workspace: host: **** resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" parameters: ["first argument","first value","second argument","second value"] libraries: - whl: ./dist/*.whl ``` Output ``` andrew.nester@HFW9Y94129 wheel % databricks bundle run test_job Run URL: *** 2023-08-03 15:58:04 "[default] My Wheel Job" TERMINATED SUCCESS Output: ======= Task TestTask: Hello from my func Got arguments v1: ['python', 'first argument', 'first value', 'second argument', 'second value'] ```
2023-08-30 12:21:39 +00:00
python.TransformWheelTask(),
files.Upload(),
deploy.StateUpdate(),
deploy.StatePush(),
permissions.ApplyWorkspaceRootPermissions(),
terraform.Interpolate(),
terraform.Write(),
terraform.CheckRunningResource(),
terraform.Plan(terraform.PlanGoal("deploy")),
bundle.If(
2024-08-12 14:16:25 +00:00
approvalForDeploy,
deployCore,
bundle.LogString("Deployment cancelled!"),
),
),
lock.Release(lock.GoalDeploy),
),
Added support for experimental scripts section (#632) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: *** experimental: scripts: prebuild: | echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: | echo 'Checking go version...' go version postdeploy: | echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/*.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```
2023-09-14 10:14:13 +00:00
scripts.Execute(config.ScriptPostDeploy),
)
2023-05-16 16:01:50 +00:00
return newPhase(
"deploy",
[]bundle.Mutator{deployMutator},
)
}