databricks-cli/bundle/config/bundle.go

package config

type Terraform struct {
	ExecPath string `json:"exec_path"`
}

type Bundle struct {
	Name string `json:"name"`

	// TODO
	// Default cluster to run commands on (Python, Scala).
	// DefaultCluster string `json:"default_cluster,omitempty"`

	// TODO
	// Default warehouse to run SQL on.
	// DefaultWarehouse string `json:"default_warehouse,omitempty"`

	// Target is set by the mutator that selects the target.
	Target string `json:"target,omitempty" bundle:"readonly"`

	// DEPRECATED. Left for backward compatibility with Target
	Environment string `json:"environment,omitempty" bundle:"readonly"`

	// Terraform holds configuration related to Terraform.
	// For example, where to find the binary, which version to use, etc.
	Terraform *Terraform `json:"terraform,omitempty" bundle:"readonly"`

	// Lock configures locking behavior on deployment.
	Lock Lock `json:"lock" bundle:"readonly"`

	// Force-override Git branch validation.
	Force bool `json:"force,omitempty" bundle:"readonly"`

	// Contains Git information like current commit, current branch and
	// origin URL. Automatically loaded by reading the .git directory if not specified.
	Git Git `json:"git,omitempty"`

	// Determines the mode of the target.
	// For example, 'mode: development' can be used for deployments for
	// development purposes.
	// Annotated readonly as this should be set at the target level.
	Mode Mode `json:"mode,omitempty" bundle:"readonly"`

	// Overrides the compute used for jobs and other supported assets.
	ComputeID string `json:"compute_id,omitempty"`
}
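
The struct tags above control how a `Bundle` is serialized. The following is a minimal sketch, assuming a trimmed local copy of the struct (the `Lock`, `Git`, and `Mode` fields are left out because their types are not shown in this file) and a hypothetical helper named `encodeBundle`. It only illustrates how `omitempty` drops unset fields while `name` is always emitted; it is not the CLI's own serialization code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Terraform mirrors the struct defined in this file.
type Terraform struct {
	ExecPath string `json:"exec_path"`
}

// Bundle is a trimmed, illustrative copy of the struct in this file.
// Fields whose types are defined elsewhere (Lock, Git, Mode) are omitted.
type Bundle struct {
	Name      string     `json:"name"`
	Target    string     `json:"target,omitempty" bundle:"readonly"`
	Terraform *Terraform `json:"terraform,omitempty" bundle:"readonly"`
	Force     bool       `json:"force,omitempty" bundle:"readonly"`
	ComputeID string     `json:"compute_id,omitempty"`
}

// encodeBundle is a hypothetical helper that serializes a Bundle to JSON.
func encodeBundle(b Bundle) (string, error) {
	out, err := json.Marshal(b)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Only the name is set: every omitempty field is dropped.
	minimal, err := encodeBundle(Bundle{Name: "example"})
	if err != nil {
		panic(err)
	}
	fmt.Println(minimal) // {"name":"example"}

	// A nil *Terraform is omitted; a non-nil one is emitted as an object.
	full, err := encodeBundle(Bundle{
		Name:      "example",
		Target:    "dev",
		Terraform: &Terraform{ExecPath: "terraform"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(full) // {"name":"example","target":"dev","terraform":{"exec_path":"terraform"}}
}
```

Because `Name` has no `omitempty`, it is written even when empty, which is consistent with the field being treated as required.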
Skeleton for configuration loading and mutation (#92)
2022-11-18 09:57:31 +00:00

Load a tree of configuration files anchored at `bundle.yml` into the `config.Root` struct. All mutations (from setting defaults to merging files) are observable through the `mutator.Mutator` interface.

	package config

Automatically install Terraform if needed (#141)
2022-12-15 16:30:33 +00:00

Users can opt out and use the system-installed version with the following configuration:

```
bundle:
  terraform:
    exec_path: terraform
```

This will find the binary in $PATH and replace it with the found value. If this is not set, the initialize phase will install Terraform in the bundle's cache directory.

	type Terraform struct {
		ExecPath string `json:"exec_path"`
	}

Skeleton for configuration loading and mutation (#92)
2022-11-18 09:57:31 +00:00

	type Bundle struct {

Make file, artifact and state path optional (#204)
2023-02-17 01:49:39 +00:00

This PR makes bundle name required, and a few fields with defined defaults optional, to generate a better json schema

		Name string `json:"name"`

Skeleton for configuration loading and mutation (#92)
2022-11-18 09:57:31 +00:00

		// TODO
		// Default cluster to run commands on (Python, Scala).
		// DefaultCluster string `json:"default_cluster,omitempty"`

		// TODO
		// Default warehouse to run SQL on.
		// DefaultWarehouse string `json:"default_warehouse,omitempty"`

Store specified environment in configuration for reference (#104)
2022-11-28 09:10:13 +00:00
Renamed `environments` to `targets` in bundle configuration (#670)
2023-08-17 15:22:32 +00:00

## Changes
Renamed Environments to Targets in bundle.yml. The change is backward-compatible and customers can continue to use `environments` for the time being.

## Tests
Added tests which check that both the `environments` and `targets` sections in bundle.yml work correctly.

		// Target is set by the mutator that selects the target.
		Target string `json:"target,omitempty" bundle:"readonly"`

		// DEPRECATED. Left for backward compatibility with Target

Add readonly bundle tag for internal fields (#302)
2023-04-04 10:16:07 +00:00

This PR adds a `bundle:"readonly"` struct tag to the json schema generator. This allows us to skip generating json schema for internal readonly fields.

Tested using unit test.

		Environment string `json:"environment,omitempty" bundle:"readonly"`
Automatically install Terraform if needed (#141)
2022-12-15 16:30:33 +00:00

		// Terraform holds configuration related to Terraform.
		// For example, where to find the binary, which version to use, etc.

Add readonly bundle tag for internal fields (#302)
2023-04-04 10:16:07 +00:00

		Terraform *Terraform `json:"terraform,omitempty" bundle:"readonly"`

Acquire lock prior to deploy (#270)
2023-03-22 15:37:26 +00:00

Add configuration:

```
bundle:
  lock:
    enabled: true
    force: false
```

The force field can be set by passing the `--force` argument to `bricks bundle deploy`. Doing so means the deployment lock is acquired even if it is currently held. This should only be used in exceptional cases (e.g. a previous deployment has failed to release the lock).

		// Lock configures locking behavior on deployment.

Add readonly bundle tag for internal fields (#302)
2023-04-04 10:16:07 +00:00

		Lock Lock `json:"lock" bundle:"readonly"`
Add git config block to bundle config (#356)
2023-04-26 14:54:36 +00:00

## Changes
This config block contains commit, branch and remote_url which will be automatically loaded if specified in the repo, and can also be specified by the user.

## Tests
Unit and black-box tests

Add validation for Git settings in bundles (#578)
2023-07-30 12:44:33 +00:00

## Changes
This checks whether the Git settings are consistent with the actual Git state of a source directory. (This PR adds to https://github.com/databricks/cli/pull/577.)

Previously, we would silently let users configure their Git branch to e.g. `main` and deploy with that metadata even if they were actually on a different branch.

With these changes, the following config would result in an error when deployed from any other branch than `main`:

```
bundle:
  name: example

workspace:
  git:
    branch: main

environments:
  ...
```

> not on the right Git branch:
> expected according to configuration: main
> actual: my-feature-branch

It's not very useful to set the same branch for all environments, though. For development, it's better to just let the CLI auto-detect the right branch. Therefore, it's now possible to set the branch just for a single environment:

```
bundle:
  name: example 2

environments:
  development:
    default: true

  production:
    # production can only be deployed from the 'main' branch
    git:
      branch: main
```

Adding to that, the `mode: production` option actually checks that users explicitly set the Git branch as seen above. Setting that branch helps avoid mistakes, where someone accidentally deploys to production from the wrong branch. (I could see us offering an escape hatch for that in the future.)

# Testing
Manual testing to validate the experience and error messages. Automated unit tests.

---------

Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com>

		// Force-override Git branch validation.

Persist deployment metadata in WSFS (#845)
2023-10-27 12:55:43 +00:00

## Changes
This PR introduces a metadata struct that stores a subset of bundle configuration that we wish to expose to other Databricks services that wish to integrate with bundles. This metadata file is uploaded to a file `${bundle.workspace.state_path}/metadata.json` in the WSFS destination of the bundle deployment.

Documentation for emitted metadata fields:

* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this bundle was deployed from. Note, the deployment might not exactly match this commit version if there are changes that have not been committed to git at deploy time.
* `file_path`: Path in workspace where we sync bundle files to.
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml config file from the bundle root where this job was defined.

Example metadata object when bundle root and git root are the same:

```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

Example metadata when the git root is one level above the bundle repo:

```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

This unblocks integration to the jobs break glass UI for bundles.

## Tests
Unit tests and integration tests.

		Force bool `json:"force,omitempty" bundle:"readonly"`

Add validation for Git settings in bundles (#578)
2023-07-30 12:44:33 +00:00

Add git config block to bundle config (#356)
2023-04-26 14:54:36 +00:00

		// Contains Git information like current commit, current branch and
		// origin URL. Automatically loaded by reading the .git directory if not specified.

Add omitempty tag to bundle git details (#372)
2023-05-01 12:34:12 +00:00

## Changes
Add omit empty tag to git details. Otherwise this field becomes a required field in the config json schema

## Tests
Tested by regenerating the json schema and checking that the git field is now optional

		Git Git `json:"git,omitempty"`
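
The branch check described in #578, together with the `Force` field ("Force-override Git branch validation"), can be sketched roughly as follows. This is only an illustration under stated assumptions: `validateGitBranch` is a hypothetical name, an empty configured branch is taken to mean "auto-detect, nothing to validate", and `force` is assumed to skip the check entirely. The error text follows the message quoted in the commit; the real implementation may differ on all of these points.

```go
package main

import "fmt"

// validateGitBranch is a hypothetical helper, not the CLI's actual function.
// It compares the branch set in the bundle configuration with the branch
// that is actually checked out.
func validateGitBranch(configured, actual string, force bool) error {
	// Assumption: no configured branch means the branch is auto-detected,
	// so there is nothing to compare against.
	if configured == "" {
		return nil
	}
	// Assumption: force skips the check altogether.
	if force || configured == actual {
		return nil
	}
	return fmt.Errorf(
		"not on the right Git branch:\n  expected according to configuration: %s\n  actual: %s",
		configured, actual,
	)
}

func main() {
	// Mismatch without force: an error is returned.
	fmt.Println(validateGitBranch("main", "my-feature-branch", false))
	// Same mismatch with force: the check is skipped.
	fmt.Println(validateGitBranch("main", "my-feature-branch", true))
	// Matching branches pass.
	fmt.Println(validateGitBranch("main", "main", false))
}
```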
Add development runs (#522)
2023-07-12 06:51:54 +00:00

This implements the "development run" functionality that we desire for DABs in the workspace / IDE.

## bundle.yml changes
In bundle.yml, there should be a "dev" environment that is marked as `mode: development`:

```
environments:
  dev:
    default: true
    mode: development # future accepted values might include pull_request, production
```

Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets:

* All assets will get '[dev]' in their name and will get a 'dev' tag
* All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list)
* All deployed assets will be ephemeral (future work, we need some form of garbage collection)
* Pipelines will be marked as 'development: true'
* Jobs can run on development compute through the `--compute` parameter in the CLI
* Jobs get their schedule / triggers paused
* Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress)

Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use).

## CLI changes
To run a single job called "shark_sighting" on existing compute, use the following commands:

```
$ databricks bundle deploy --compute 0617-201942-9yd9g8ix
$ databricks bundle run shark_sighting
```

which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used.

The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running:

```
$ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix
$ bundle run --deploy shark_sightings
```

The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute.

One more thing I added: `run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves).

```
$ bundle run --deploy shark_sightings --no-wait
```

Renamed `environments` to `targets` in bundle configuration (#670)
2023-08-17 15:22:32 +00:00

		// Determines the mode of the target.

Add development runs (#522)
2023-07-12 06:51:54 +00:00

		// For example, 'mode: development' can be used for deployments for
		// development purposes.

Renamed `environments` to `targets` in bundle configuration (#670)
2023-08-17 15:22:32 +00:00

		// Annotated readonly as this should be set at the target level.

Add development runs (#522)
2023-07-12 06:51:54 +00:00

		Mode Mode `json:"mode,omitempty" bundle:"readonly"`

		// Overrides the compute used for jobs and other supported assets.
		ComputeID string `json:"compute_id,omitempty"`

Skeleton for configuration loading and mutation (#92)
2022-11-18 09:57:31 +00:00

	}