## Changes
This PR adds support for UC Schemas to DABs. This allows users to define
schemas for tables and other assets their pipelines/workflows create as
part of the DAB, thus managing the life-cycle in the DAB.
The first version has a couple of intentional limitations:
1. The owner of the schema will be the deployment user. Changing the
owner of the schema is not allowed (yet). `run_as` will not be
restricted for DABs containing UC schemas. Let's limit the scope of
run_as to the compute identity used instead of ownership of data assets
like UC schemas.
2. API fields that are present in the update API but not the create API.
For example: enabling predictive optimization is not supported in the
create schema API and thus is not available in DABs at the moment.
## Tests
Manually and integration test. Manually verified the following work:
1. Development mode adds a "dev_" prefix.
2. Modified status is correctly computed in the `bundle summary`
command.
3. Grants work as expected, for assigning privileges.
4. Variable interpolation works for the schema ID.
## Changes
This cherry-picks from #1490 to address an issue that came up in #1511.
The function `dyn.SetByPath` requires intermediate values to be present.
If they are not, it returns an error that it cannot index a map. This is
not an issue on main, where the intermediate maps are always created,
even if they are not present in the dynamic configuration tree. As of
#1511, we'll no longer populate empty maps for empty structs if they are
not explicitly set (i.e., a non-nil pointer). This change writes a bool
pointer to avoid this issue altogether.
## Tests
Unit tests pass.
## Changes
This change adds support for Lakehouse monitoring in bundles.
The associated resource type name is "quality monitor".
## Testing
Unit tests.
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>
## Changes
This changes `databricks bundle deploy` so that it skips the lock
acquisition/release step for a `mode: development` target:
* This saves about 2 seconds (measured over 100 runs on a quiet/busy
workspace).
* This helps avoid the `deploy lock acquired by lennart@company.com at
2024-02-28 15:48:38.40603 +0100 CET. Use --force-lock to override` error
* Risk: this may cause deployment conflicts, but since dev mode
deployments are always scoped to a user, that risk should be minimal
Update after discussion:
* This behavior can now be disabled via a setting.
* Docs PR: https://github.com/databricks/docs/pull/15873
## Measurements
### 100 deployments of the "python_default" project to an empty
workspace
_Before this branch:_
p50 time: 11.479 seconds
p90 time: 11.757 seconds
_After this branch:_
p50 time: 9.386 seconds
p90 time: 9.599 seconds
### 100 deployments of the "python_default" project to a busy (staging)
workspace
_Before this branch:_
* p50 time: 13.335 seconds
* p90 time: 15.295 seconds
_After this branch:_
* p50 time: 11.397 seconds
* p90 time: 11.743 seconds
### Typical duration of deployment steps
* Acquiring Deployment Lock: 1.096 seconds
* Deployment Preparations and Operations: 1.477 seconds
* Uploading Artifacts: 1.26 seconds
* Finalizing Deployment: 9.699 seconds
* Releasing Deployment Lock: 1.198 seconds
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Andrew Nester <andrew.nester.dev@gmail.com>
## Changes
This diagnostics type allows us to capture multiple warnings as well as
errors in the return value. This is a preparation for returning
additional warnings from mutators in case we detect non-fatal problems.
* All return statements that previously returned an error now return
`diag.FromErr`
* All return statements that previously returned `fmt.Errorf` now return
`diag.Errorf`
* All `err != nil` checks now use `diags.HasError()` or `diags.Error()`
## Tests
* Existing tests pass.
* I confirmed no call site under `./bundle` or `./cmd/bundle` uses
`errors.Is` on the return value from mutators. This is relevant because
we cannot wrap errors with `%w` when calling `diag.Errorf` (like
`fmt.Errorf`; context in https://github.com/golang/go/issues/47641).
## Changes
This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).
Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).
The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.
Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).
Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.
To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).
## Tests
* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
## Changes
This PR changes the default and `mode: production` recommendation to
target `/Users` for deployment. Previously, we used `/Shared`, but
because of a lack of POSIX-like permissions in WorkspaceFS this meant
that files inside would be readable and writable by other users in the
workspace.
Detailed change:
* `default-python` no longer uses a path that starts with `/Shared`
* `mode: production` no longer requires a path that starts with
`/Shared`
## Related PRs
Docs: https://github.com/databricks/docs/pull/14585
Examples: https://github.com/databricks/bundle-examples/pull/17
## Tests
* Manual tests
* Template unit tests (with an extra check to avoid /Shared)
## Changes
Some test call sites called directly into the mutator's `Apply` function
instead of `bundle.Apply`. Calling into `bundle.Apply` is preferred
because that's where we can run pre/post logic common across all
mutators.
## Tests
Pass.
## Changes
All calls to apply a mutator must go through `bundle.Apply`. This
conflicts with the existing use of the variable `bundle`. This change
un-aliases the variable from the package name by renaming all variables
to `b`.
## Tests
Pass.
## Changes
This PR:
1. Renames `FilesPath` -> `FilePath` and `ArtifactsPath` ->
`ArtifactPath` in the bundle and metadata configuration to make them
consistant with the json tags.
2. Fixes development / production mode error messages to point to
`file_path` and `artifact_path`
## Tests
Existing unit tests. This is a strightforward renaming of the fields.
Partly mitigates #859. It's still not clear to me if there is an actual
use case or if users are trying to use "development" mode jobs for
production, but making this overridable is reasonable.
Beyond this fix I think we could do something in the Jobs schedule UI,
but it would help to better understand the use case (or actual reason of
confusion). I expect we should hint customers to move away from dev mode
rather than unpause.
## Changes
The jobs backend propagates job tags to the underlying cloud provider's
resources. As such, they need to match the constraints a cloud provider
places on tag values. The display name can contain anything. With this
change, we modify the tag value to equal the short name as used in the
name prefix.
Additionally, we leverage tag normalization as introduced in #819 to
make sure characters that aren't accepted are removed before using the
value as a tag value.
This is a new stab at #810 and should completely eliminate this class of
problems.
## Tests
Tests pass.
## Changes
This pull request extends the templating support in preparation of a
new, default template (WIP, https://github.com/databricks/cli/pull/686):
* builtin templates that can be initialized using e.g. `databricks
bundle init default-python`
* builtin templates are embedded into the executable using go's `embed`
functionality, making sure they're co-versioned with the CLI
* new helpers to get the workspace name, current user name, etc. help
craft a complete template
* (not enabled yet) when the user types `databricks bundle init` they
can interactively select the `default-python` template
And makes two tangentially related changes:
* IsServicePrincipal now uses the "users" API rather than the
"principals" API, since the latter is too slow for our purposes.
* mode: prod no longer requires the 'target.prod.git' setting. It's hard
to set that from a template. (Pieter is planning an overhaul of warnings
support; this would be one of the first warnings we show.)
The actual `default-python` template is maintained in a separate PR:
https://github.com/databricks/cli/pull/686
## Tests
Unit tests, manual testing
## Changes
Renamed Environments to Targets in bundle.yml.
The change is backward-compatible and customers can continue to use
`environments` in the time being.
## Tests
Added tests which checks that both `environments` and `targets` sections
in bundle.yml works correctly