databricks-cli/bundle/phases/build.go

package phases

import (
	"context"

	"github.com/databricks/cli/bundle"
	"github.com/databricks/cli/bundle/artifacts"
	"github.com/databricks/cli/bundle/artifacts/whl"
	"github.com/databricks/cli/bundle/config"
	"github.com/databricks/cli/bundle/config/mutator"
	"github.com/databricks/cli/bundle/scripts"
	"github.com/databricks/cli/libs/diag"
	"github.com/databricks/cli/libs/log"
)

// Build runs the build phase, which builds the bundle's artifacts.
func Build(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
	log.Info(ctx, "Phase: build")

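	// Apply the build mutators in order.
	return bundle.ApplySeq(ctx, b,
		// Run the user-defined prebuild script, if one is configured.
		scripts.Execute(config.ScriptPreBuild),
		// Detect a Python wheel project and fill in missing artifact settings.
		whl.DetectPackage(),
		artifacts.InferMissingProperties(),
		artifacts.PrepareAll(),
		artifacts.BuildAll(),
		// Run the user-defined postbuild script, if one is configured.
		scripts.Execute(config.ScriptPostBuild),
		// Resolve variable references with the "artifacts" prefix.
		mutator.ResolveVariableReferences(
			"artifacts",
		),
	)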
}