databricks-cli/bundle/phases/initialize.go

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

111 lines
3.8 KiB
Go
Raw Normal View History

package phases
import (
Remove bundle.{Seq,If,Defer,newPhase,logString}, switch to regular functions (#2390) ## Changes - Instead of constructing chains of mutators and then executing them, execute them directly. - Remove functionality related to chain-building: Seq, If, Defer, newPhase, logString. - Phases become functions that apply the changes directly rather than construct mutator chains that will be called later. - Add a helper ApplySeq to call multiple mutators, use it where Apply+Seq were used before. This is intended to be a refactoring without functional changes, but there are a few behaviour changes: - Since defer() is used to call unlock instead of bundle.Defer() unlocking will now happen even in case of panics. - In --debug, the phase names are are still logged once before start of the phase but each entry no longer has 'seq' or phase name in it. - The message "Deployment complete!" was printed even if terraform.Apply() mutator had an error. It no longer does that. ## Motivation The use of the chains was necessary when mutators were returning a list of other mutators instead of calling them directly. But that has since been removed, so now the chain machinery have no purpose anymore. Use of direct functions simplifies the logic and makes bugs more apparent and easy to fix. Other improvements that this unlocks: - Simpler stacktraces/debugging (breakpoints). - Use of functions with narrowly scoped API: instead of mutators that receive full bundle config, we can use focused functions that only deal with sections they care about prepareGitSettings(currentGitSection) -> updatedGitSection. This makes the data flow more apparent. - Parallel computations across mutators (within phase): launch goroutines fetching data from APIs at the beggining, process them once they are ready. ## Tests Existing tests.
2025-02-27 11:41:58 +00:00
"context"
"github.com/databricks/cli/bundle"
Added support for Databricks Apps in DABs (#1928) ## Changes Now it's possible to configure new `app` resource in bundle and point it to the custom `source_code_path` location where Databricks App code is defined. On `databricks bundle deploy` DABs will create an app. All consecutive `databricks bundle deploy` execution will update an existing app if there are any updated On `databricks bundle run <my_app>` DABs will execute app deployment. If the app is not started yet, it will start the app first. ### Bundle configuration ``` bundle: name: apps variables: my_job_id: description: "ID of job to run app" lookup: job: "My Job" databricks_name: description: "Name for app user" additional_flags: description: "Additional flags to run command app" default: "" my_app_config: type: complex description: "Configuration for my Databricks App" default: command: - flask - --app - hello - run - ${var.additional_flags} env: - name: DATABRICKS_NAME value: ${var.databricks_name} resources: apps: my_app: name: "anester-app" # required and has to be unique description: "My App" source_code_path: ./app # required and points to location of app code config: ${var.my_app_config} resources: - name: "my-job" description: "A job for app to be able to run" job: id: ${var.my_job_id} permission: "CAN_MANAGE_RUN" permissions: - user_name: "foo@bar.com" level: "CAN_VIEW" - service_principal_name: "my_sp" level: "CAN_MANAGE" targets: dev: variables: databricks_name: "Andrew (from dev)" additional_flags: --debug prod: variables: databricks_name: "Andrew (from prod)" ``` ### Execution 1. `databricks bundle deploy -t dev` 2. `databricks bundle run my_app -t dev` **If app is started** ``` ✓ Getting the status of the app my-app ✓ App is in RUNNING state ✓ Preparing source code for new app deployment. ✓ Deployment is pending ✓ Starting app with command: flask --app hello run --debug ✓ App started successfully You can access the app at <app-url> ``` **If app is not started** ``` ✓ Getting the status of the app my-app ✓ App is in UNAVAILABLE state ✓ Starting the app my-app ✓ App is starting... .... ✓ App is starting... ✓ App is started! ✓ Preparing source code for new app deployment. ✓ Downloading source code from /Workspace/Users/... ✓ Starting app with command: flask --app hello run --debug ✓ App started successfully You can access the app at <app-url> ``` ## Tests Added unit and config tests + manual test. ``` --- PASS: TestAccDeployBundleWithApp (404.59s) PASS coverage: 36.8% of statements in ./... ok github.com/databricks/cli/internal/bundle 405.035s coverage: 36.8% of statements in ./... ```
2025-01-13 16:43:48 +00:00
"github.com/databricks/cli/bundle/apps"
Added support for experimental scripts section (#632) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: *** experimental: scripts: prebuild: | echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: | echo 'Checking go version...' go version postdeploy: | echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/*.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```
2023-09-14 10:14:13 +00:00
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/mutator"
pythonmutator "github.com/databricks/cli/bundle/config/mutator/python"
"github.com/databricks/cli/bundle/config/validate"
"github.com/databricks/cli/bundle/deploy/metadata"
"github.com/databricks/cli/bundle/deploy/terraform"
"github.com/databricks/cli/bundle/permissions"
Added support for experimental scripts section (#632) ## Changes Added support for experimental scripts section It allows execution of arbitrary bash commands during certain bundle lifecycle steps. ## Tests Example of configuration ```yaml bundle: name: wheel-task workspace: host: *** experimental: scripts: prebuild: | echo 'Prebuild 1' echo 'Prebuild 2' postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" predeploy: | echo 'Checking go version...' go version postdeploy: | echo 'Checking python version...' python --version resources: jobs: test_job: name: "[${bundle.environment}] My Wheel Job" tasks: - task_key: TestTask existing_cluster_id: "***" python_wheel_task: package_name: "my_test_code" entry_point: "run" libraries: - whl: ./dist/*.whl ``` Output ```bash andrew.nester@HFW9Y94129 wheel % databricks bundle deploy artifacts.whl.AutoDetect: Detecting Python wheel project... artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel 'Prebuild 1' 'Prebuild 2' artifacts.whl.Build(my_test_code): Building... artifacts.whl.Build(my_test_code): Build succeeded 'Postbuild 1' 'Postbuild 2' 'Checking go version...' go version go1.19.9 darwin/arm64 Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files! artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading... artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded Starting resource deployment Resource deployment completed! 'Checking python version...' Python 2.7.18 ```
2023-09-14 10:14:13 +00:00
"github.com/databricks/cli/bundle/scripts"
"github.com/databricks/cli/bundle/trampoline"
Remove bundle.{Seq,If,Defer,newPhase,logString}, switch to regular functions (#2390) ## Changes - Instead of constructing chains of mutators and then executing them, execute them directly. - Remove functionality related to chain-building: Seq, If, Defer, newPhase, logString. - Phases become functions that apply the changes directly rather than construct mutator chains that will be called later. - Add a helper ApplySeq to call multiple mutators, use it where Apply+Seq were used before. This is intended to be a refactoring without functional changes, but there are a few behaviour changes: - Since defer() is used to call unlock instead of bundle.Defer() unlocking will now happen even in case of panics. - In --debug, the phase names are are still logged once before start of the phase but each entry no longer has 'seq' or phase name in it. - The message "Deployment complete!" was printed even if terraform.Apply() mutator had an error. It no longer does that. ## Motivation The use of the chains was necessary when mutators were returning a list of other mutators instead of calling them directly. But that has since been removed, so now the chain machinery have no purpose anymore. Use of direct functions simplifies the logic and makes bugs more apparent and easy to fix. Other improvements that this unlocks: - Simpler stacktraces/debugging (breakpoints). - Use of functions with narrowly scoped API: instead of mutators that receive full bundle config, we can use focused functions that only deal with sections they care about prepareGitSettings(currentGitSection) -> updatedGitSection. This makes the data flow more apparent. - Parallel computations across mutators (within phase): launch goroutines fetching data from APIs at the beggining, process them once they are ready. ## Tests Existing tests.
2025-02-27 11:41:58 +00:00
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/log"
)
// The initialize phase fills in defaults and connects to the workspace.
// Interpolation of fields referring to the "bundle" and "workspace" keys
// happens upon completion of this phase.
Remove bundle.{Seq,If,Defer,newPhase,logString}, switch to regular functions (#2390) ## Changes - Instead of constructing chains of mutators and then executing them, execute them directly. - Remove functionality related to chain-building: Seq, If, Defer, newPhase, logString. - Phases become functions that apply the changes directly rather than construct mutator chains that will be called later. - Add a helper ApplySeq to call multiple mutators, use it where Apply+Seq were used before. This is intended to be a refactoring without functional changes, but there are a few behaviour changes: - Since defer() is used to call unlock instead of bundle.Defer() unlocking will now happen even in case of panics. - In --debug, the phase names are are still logged once before start of the phase but each entry no longer has 'seq' or phase name in it. - The message "Deployment complete!" was printed even if terraform.Apply() mutator had an error. It no longer does that. ## Motivation The use of the chains was necessary when mutators were returning a list of other mutators instead of calling them directly. But that has since been removed, so now the chain machinery have no purpose anymore. Use of direct functions simplifies the logic and makes bugs more apparent and easy to fix. Other improvements that this unlocks: - Simpler stacktraces/debugging (breakpoints). - Use of functions with narrowly scoped API: instead of mutators that receive full bundle config, we can use focused functions that only deal with sections they care about prepareGitSettings(currentGitSection) -> updatedGitSection. This makes the data flow more apparent. - Parallel computations across mutators (within phase): launch goroutines fetching data from APIs at the beggining, process them once they are ready. ## Tests Existing tests.
2025-02-27 11:41:58 +00:00
func Initialize(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
log.Info(ctx, "Phase: initialize")
return bundle.ApplySeq(ctx, b,
validate.AllResourcesHaveValues(),
validate.NoInterpolationInAuthConfig(),
Remove bundle.{Seq,If,Defer,newPhase,logString}, switch to regular functions (#2390) ## Changes - Instead of constructing chains of mutators and then executing them, execute them directly. - Remove functionality related to chain-building: Seq, If, Defer, newPhase, logString. - Phases become functions that apply the changes directly rather than construct mutator chains that will be called later. - Add a helper ApplySeq to call multiple mutators, use it where Apply+Seq were used before. This is intended to be a refactoring without functional changes, but there are a few behaviour changes: - Since defer() is used to call unlock instead of bundle.Defer() unlocking will now happen even in case of panics. - In --debug, the phase names are are still logged once before start of the phase but each entry no longer has 'seq' or phase name in it. - The message "Deployment complete!" was printed even if terraform.Apply() mutator had an error. It no longer does that. ## Motivation The use of the chains was necessary when mutators were returning a list of other mutators instead of calling them directly. But that has since been removed, so now the chain machinery have no purpose anymore. Use of direct functions simplifies the logic and makes bugs more apparent and easy to fix. Other improvements that this unlocks: - Simpler stacktraces/debugging (breakpoints). - Use of functions with narrowly scoped API: instead of mutators that receive full bundle config, we can use focused functions that only deal with sections they care about prepareGitSettings(currentGitSection) -> updatedGitSection. This makes the data flow more apparent. - Parallel computations across mutators (within phase): launch goroutines fetching data from APIs at the beggining, process them once they are ready. ## Tests Existing tests.
2025-02-27 11:41:58 +00:00
// Update all path fields in the sync block to be relative to the bundle root path.
mutator.RewriteSyncPaths(),
// Configure the default sync path to equal the bundle root if not explicitly configured.
// By default, this means all files in the bundle root directory are synchronized.
mutator.SyncDefaultPath(),
// Figure out if the sync root path is identical or an ancestor of the bundle root path.
// If it is an ancestor, this updates all paths to be relative to the sync root path.
mutator.SyncInferRoot(),
mutator.PopulateCurrentUser(),
mutator.LoadGitDetails(),
// This mutator needs to be run before variable interpolation and defining default workspace paths
// because it affects how workspace variables are resolved.
mutator.ApplySourceLinkedDeploymentPreset(),
mutator.DefineDefaultWorkspaceRoot(),
mutator.ExpandWorkspaceRoot(),
mutator.DefineDefaultWorkspacePaths(),
mutator.PrependWorkspacePrefix(),
// This mutator needs to be run before variable interpolation because it
// searches for strings with variable references in them.
mutator.RewriteWorkspacePrefix(),
mutator.SetVariables(),
// Intentionally placed before ResolveVariableReferencesInLookup, ResolveResourceReferences,
// ResolveVariableReferencesInComplexVariables and ResolveVariableReferences.
// See what is expected in PythonMutatorPhaseInit doc
pythonmutator.PythonMutator(pythonmutator.PythonMutatorPhaseInit),
pythonmutator.PythonMutator(pythonmutator.PythonMutatorPhaseLoadResources),
pythonmutator.PythonMutator(pythonmutator.PythonMutatorPhaseApplyMutators),
mutator.ResolveVariableReferencesInLookup(),
mutator.ResolveResourceReferences(),
mutator.ResolveVariableReferences(
"bundle",
"workspace",
"variables",
),
mutator.MergeJobClusters(),
mutator.MergeJobParameters(),
mutator.MergeJobTasks(),
mutator.MergePipelineClusters(),
mutator.MergeApps(),
mutator.CaptureSchemaDependency(),
// Provide permission config errors & warnings after initializing all variables
permissions.PermissionDiagnostics(),
mutator.SetRunAs(),
mutator.OverrideCompute(),
mutator.ConfigureDashboardDefaults(),
mutator.ConfigureVolumeDefaults(),
mutator.ProcessTargetMode(),
mutator.ApplyPresets(),
mutator.DefaultQueueing(),
mutator.ExpandPipelineGlobPaths(),
// Configure use of WSFS for reads if the CLI is running on Databricks.
mutator.ConfigureWSFS(),
mutator.TranslatePaths(),
trampoline.WrapperWarning(),
apps.Validate(),
permissions.ValidateSharedRootPermissions(),
permissions.ApplyBundlePermissions(),
permissions.FilterCurrentUser(),
metadata.AnnotateJobs(),
metadata.AnnotatePipelines(),
terraform.Initialize(),
scripts.Execute(config.ScriptPostInit),
)
}