databricks-cli/bundle/config/environment.go

package config

type Mode string

// Environment defines overrides for a single environment.
// This structure is recursively merged into the root configuration.
type Environment struct {
	// Default marks that this environment must be used if one isn't specified
	// by the user (through environment variable or command line argument).
	Default bool `json:"default,omitempty"`

	// Determines the mode of the environment.
	// For example, 'mode: development' can be used for deployments for
	// development purposes.
	Mode Mode `json:"mode,omitempty"`

	// Overrides the compute used for jobs and other supported assets.
	ComputeID string `json:"compute_id,omitempty"`

	Bundle *Bundle `json:"bundle,omitempty"`

	Workspace *Workspace `json:"workspace,omitempty"`

	Artifacts map[string]*Artifact `json:"artifacts,omitempty"`

	Resources *Resources `json:"resources,omitempty"`

	// Override default values for defined variables
	// Does not permit defining new variables or redefining existing ones
	// in the scope of an environment
	Variables map[string]string `json:"variables,omitempty"`

	Git Git `json:"git,omitempty"`
}

const (
	// Development mode: deployments done purely for running things in development.
	// Any deployed resources will be marked as "dev" and might be hidden or cleaned up.
	Development Mode = "development"

	// Production mode: deployments done for production purposes.
	// Any deployed resources will not be changed but this mode will enable
	// various strictness checks to make sure that a deployment is correctly setup
	// for production purposes.
	Production Mode = "production"
)
Skeleton for configuration loading and mutation (#92) Load a tree of configuration files anchored at `bundle.yml` into the `config.Root` struct. All mutations (from setting defaults to merging files) are observable through the `mutator.Mutator` interface. 2022-11-18 09:57:31 +00:00			`package config`

Add development runs (#522) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: debug`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is`run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ``` 2023-07-12 06:51:54 +00:00			`type Mode string`

Skeleton for configuration loading and mutation (#92) Load a tree of configuration files anchored at `bundle.yml` into the `config.Root` struct. All mutations (from setting defaults to merging files) are observable through the `mutator.Mutator` interface. 2022-11-18 09:57:31 +00:00			`// Environment defines overrides for a single environment.`
			`// This structure is recursively merged into the root configuration.`
			`type Environment struct {`
Add "default" flag to environment block (#142) If the environment is not set through command line argument or environment variable, the bundle loads either 1) the only environment, 2) the only environment with the default flag set. 2022-12-15 20:28:14 +00:00			`// Default marks that this environment must be used if one isn't specified`
			`// by the user (through environment variable or command line argument).`
			Default bool `json:"default,omitempty"`

Add development runs (#522) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: debug`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is`run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ``` 2023-07-12 06:51:54 +00:00			`// Determines the mode of the environment.`
			`// For example, 'mode: development' can be used for deployments for`
			`// development purposes.`
			Mode Mode `json:"mode,omitempty"`

			`// Overrides the compute used for jobs and other supported assets.`
			ComputeID string `json:"compute_id,omitempty"`

Skeleton for configuration loading and mutation (#92) Load a tree of configuration files anchored at `bundle.yml` into the `config.Root` struct. All mutations (from setting defaults to merging files) are observable through the `mutator.Mutator` interface. 2022-11-18 09:57:31 +00:00			Bundle *Bundle `json:"bundle,omitempty"`

			Workspace *Workspace `json:"workspace,omitempty"`

Model code artifacts (#107) This adds: * Top level "artifacts" configuration key * Support for notebooks (does language detection and upload) * Merge of per-environment artifacts (or artifact overrides) into top level 2022-11-30 13:15:22 +00:00			Artifacts map[string]*Artifact `json:"artifacts,omitempty"`

Skeleton for configuration loading and mutation (#92) Load a tree of configuration files anchored at `bundle.yml` into the `config.Root` struct. All mutations (from setting defaults to merging files) are observable through the `mutator.Mutator` interface. 2022-11-18 09:57:31 +00:00			Resources *Resources `json:"resources,omitempty"`
Add config environment support for variable overriding (#383) ## Changes Allows to override default value for a variable definition from the environment block in a bundle config. See bundle.yml for example usage ## Tests Unit tests --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> 2023-05-15 12:07:18 +00:00
			`// Override default values for defined variables`
			`// Does not permit defining new variables or redefining existing ones`
			`// in the scope of an environment`
			Variables map[string]string `json:"variables,omitempty"`
Add validation for Git settings in bundles (#578) ## Changes This checks whether the Git settings are consistent with the actual Git state of a source directory. (This PR adds to https://github.com/databricks/cli/pull/577.) Previously, we would silently let users configure their Git branch to e.g. `main` and deploy with that metadata even if they were actually on a different branch. With these changes, the following config would result in an error when deployed from any other branch than `main`: ``` bundle: name: example workspace: git: branch: main environments: ... ``` > not on the right Git branch: > expected according to configuration: main > actual: my-feature-branch It's not very useful to set the same branch for all environments, though. For development, it's better to just let the CLI auto-detect the right branch. Therefore, it's now possible to set the branch just for a single environment: ``` bundle: name: example 2 environments: development: default: true production: # production can only be deployed from the 'main' branch git: branch: main ``` Adding to that, the `mode: production` option actually checks that users explicitly set the Git branch as seen above. Setting that branch helps avoid mistakes, where someone accidentally deploys to production from the wrong branch. (I could see us offering an escape hatch for that in the future.) # Testing Manual testing to validate the experience and error messages. Automated unit tests. --------- Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com> 2023-07-30 12:44:33 +00:00
			Git Git `json:"git,omitempty"`
Skeleton for configuration loading and mutation (#92) Load a tree of configuration files anchored at `bundle.yml` into the `config.Root` struct. All mutations (from setting defaults to merging files) are observable through the `mutator.Mutator` interface. 2022-11-18 09:57:31 +00:00			`}`
Add development runs (#522) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: debug`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is`run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ``` 2023-07-12 06:51:54 +00:00
			`const (`
Extend deployment mode support (#577) ## Changes This adds `mode: production` option. This mode doesn't do any transformations but verifies that an environment is configured correctly for production: ``` environments: prod: mode: production # paths should not be scoped to a user (unless a service principal is used) root_path: /Shared/non_user_path/... # run_as and permissions should be set at the resource level (or at the top level when that is implemented) run_as: user_name: Alice permissions: - level: CAN_MANAGE user_name: Alice ``` Additionally, this extends the existing `mode: development` option, * now prefixing deployed assets with `[dev your.user]` instead of just `[dev`] * validating that development deployments _are_ scoped to a user ## Related https://github.com/databricks/cli/pull/578/files (in draft) ## Tests Manual testing to validate the experience, error messages, and functionality with all resource types. Automated unit tests. --------- Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com> 2023-07-30 07:19:49 +00:00			`// Development mode: deployments done purely for running things in development.`
			`// Any deployed resources will be marked as "dev" and might be hidden or cleaned up.`
Add development runs (#522) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: debug`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is`run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ``` 2023-07-12 06:51:54 +00:00			`Development Mode = "development"`
Extend deployment mode support (#577) ## Changes This adds `mode: production` option. This mode doesn't do any transformations but verifies that an environment is configured correctly for production: ``` environments: prod: mode: production # paths should not be scoped to a user (unless a service principal is used) root_path: /Shared/non_user_path/... # run_as and permissions should be set at the resource level (or at the top level when that is implemented) run_as: user_name: Alice permissions: - level: CAN_MANAGE user_name: Alice ``` Additionally, this extends the existing `mode: development` option, * now prefixing deployed assets with `[dev your.user]` instead of just `[dev`] * validating that development deployments _are_ scoped to a user ## Related https://github.com/databricks/cli/pull/578/files (in draft) ## Tests Manual testing to validate the experience, error messages, and functionality with all resource types. Automated unit tests. --------- Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com> 2023-07-30 07:19:49 +00:00
			`// Production mode: deployments done for production purposes.`
			`// Any deployed resources will not be changed but this mode will enable`
			`// various strictness checks to make sure that a deployment is correctly setup`
			`// for production purposes.`
			`Production Mode = "production"`
Add development runs (#522) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: debug`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is`run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ``` 2023-07-12 06:51:54 +00:00			`)`