## Changes
Added support for complex variables
Now it's possible to add and use complex variables as shown below
```
bundle:
name: complex-variables
resources:
jobs:
my_job:
job_clusters:
- job_cluster_key: key
new_cluster: ${var.cluster}
tasks:
- task_key: test
job_cluster_key: key
variables:
cluster:
description: "A cluster definition"
type: complex
default:
spark_version: "13.2.x-scala2.11"
node_type_id: "Standard_DS3_v2"
num_workers: 2
spark_conf:
spark.speculation: true
spark.databricks.delta.retentionDurationCheck.enabled: false
```
Fixes#1298
- [x] Support for complex variables
- [x] Allow variable overrides (with shortcut) in targets
- [x] Don't allow to provide complex variables via flag or env variable
- [x] Fail validation if complex value is used but not `type: complex`
provided
- [x] Support using variables inside complex variables
## Tests
Added unit tests
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
This reverts commit dac5f09556 (#1523).
Retaining the location for nil values means equality checks no longer
pass.
We need #1520 to be merged first.
## Tests
Integration test `TestAccPythonWheelTaskDeployAndRunWithWrapper`.
## Changes
There are four different treatments location metadata can receive in the
`convert.FromTyped` method.
1. Location metadata is **retained** for maps, structs and slices if the
value is **not nil**
2. Location metadata is **lost** for maps, structs and slices if the
value is **is nil**
3. Location metadata is **retained** if a scalar type (eg. bool, string
etc) does not change.
4. Location metadata is **lost** if the value for a scalar type changes.
This PR ensures that location metadata is not lost in any case; that is,
it's always preserved.
For (2), this serves as a bug fix so that location information is not
lost on conversion to and from typed for nil values of complex types
(struct, slices, and maps).
For (4) this is a change in semantics. For primitive values modified in
a `typed` mutator, any references to `.Location()` for computed
primitive fields will now return associated YAML location metadata (if
any) instead of an empty location.
While arguable, these semantics are OK since:
1. Situations like these will be rare.
2. Knowing the YAML location (if any) is better than not knowing the
location at all. These locations are typically visible to the user in
errors and warnings.
## Tests
Unit tests
## Changes
With https://github.com/databricks/cli/pull/1507 and
https://github.com/databricks/cli/pull/1511 we are clarifying the
semantics associated with `dyn.InvalidValue` and `dyn.NilValue`. An
invalid value is the default zero value and is used to signals the
complete absence of the value.
A nil value, on the other hand, is a valid value for a piece of
configuration and signals explicitly setting a key to nil in the
configuration tree. In keeping with that theme, this PR returns
`dyn.InvalidValue` instead of `dyn.NilValue` at error sites. This change
is not expected to have a material change in behaviour and is being done
to set the right convention since we have well-defined semantics
associated with both `NilValue` and `InvalidValue`.
## Tests
Unit tests and integration tests pass. Also manually scanned the changes
and the associated call sites to verify the `NilValue` value itself was
not being relied upon.
## Changes
When a configuration defines:
```yaml
run_as:
```
It first showed up as `run_as -> nil` in the dynamic configuration only
to later be converted to `run_as -> {}` while going through typed
conversion. We were using the presence of a key to initialize an empty
value. This is incorrect and it should have remained a nil value.
This conversion was happening in `convert.FromTyped` where any struct
always returned a map value. Instead, it should only return a map value
in any one of these cases: 1) the struct has elements, 2) the struct was
originally a map in the dynamic configuration, or 3) the struct was
initialized to a non-empty pointer value.
Stacked on top of #1516 and #1518.
## Tests
* Unit tests pass.
* Integration tests pass.
* Manually ran through bundle CRUD with a bundle without resources.
## Changes
This came up in integration testing for #1511. One of the tests
converted a `map[string]any` to a dynamic value and encountered a `nil`
and errored out. We can safely return a nil in this case.
## Tests
Unit test passes.
## Changes
Previously, the functions `Get` and `Index` returned `dyn.NilValue` to
indicate that a map key or sequence index wasn't found. This is a valid
value, so we need to differentiate between actual absence and a real
`dyn.NilValue`. We do this with the zero value of a `dyn.Value` (also
captured in the constant `dyn.InvalidValue`).
## Tests
* Unit tests.
* Renamed `Get` and `Index` to find and update all call sites.
## Changes
This change adds support for Lakehouse monitoring in bundles.
The associated resource type name is "quality monitor".
## Testing
Unit tests.
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>
## Changes
This PR also fixes empty values variable overrides using the --var flag.
Now, using `--var="my_variable="` will set the value of `my_variable` to
the empty string instead of ignoring the flag altogether.
## Tests
The change using a unit test. Manually verified the `--var` flag works
now.
## Changes
Add `merge.Override` transform. It allows the override one `dyn.Value`
with another, preserving source locations for parts of the sub-tree
where nothing has changed. This is different from merging, where values
are concatenated.
`OverrideVisitor` is visiting the changes during the override process
and allows to control of what changes are allowed or update the
effective value.
The primary use case is Python code updating bundle configuration.
During override, we update locations only for changed values. This
allows us to keep track of locations where values were initially defined
and used for error reporting. For instance, merging:
```yaml
resources: # location=left.yaml:0
jobs: # location=left.yaml:1
job_0: # location=left.yaml:2
name: "job_0" # location=left.yaml:3
```
with
```yaml
resources: # location=right.yaml:0
jobs: # location=right.yaml:1
job_0: # location=right.yaml:2
name: "job_0" # location=right.yaml:3
description: job 0 # location=right.yaml:4
job_1: # location=right.yaml:5
name: "job_1" # location=right.yaml:5
```
produces
```yaml
resources: # location=left.yaml:0
jobs: # location=left.yaml:1
job_0: # location=left.yaml:2
name: "job_0" # location=left.yaml:3
description: job 0 # location=right.yaml:4
job_1: # location=right.yaml:5
name: "job_1" # location=right.yaml:5
```
## Tests
Unit tests
## Changes
We currently issue a warning if an integer is used where a floating
point number is expected. But if they are convertible, we should convert
and not issue a warning. This change fixes normalization if they are
convertible between each other. We still produce a warning if the type
conversion leads to a loss in precision.
## Tests
Unit tests pass.
## Changes
In 0.217.0 we started to emit warning on unknown fields in YAML
configuration but wrongly considered YAML anchor blocks as unknown
field.
This PR fixes this by skipping normalising of YAML blocks.
## Tests
Added regression tests
## Changes
Errors in normalization mean hard failure as of #1319.
We currently allow malformed configurations and ignore the malformed
fields and should continue to do so.
## Tests
* Tests pass.
* No calls to `diag.Errorf` from `libs/dyn`
## Changes
Variable substitution works as if the variable reference is literally
replaced with its contents.
The following fields should be interpreted in the same way regardless of
where the variable is defined:
```yaml
foo: ${var.some_path}
bar: "./${var.some_path}"
```
Before this change, `foo` would inherit the location information of the
variable definition. After this change, it uses the location information
of the variable reference, making the behavior for `foo` and `bar`
identical.
Fixes#1330.
## Tests
The new test passes only with the fix.
## Changes
This adds context to warnings and errors. For example:
* Summary: `unknown field bar`
* Location: `foo.yml:6:10`
* Path: `.targets.dev.workspace`
## Tests
Unit tests.
## Changes
It's not necessary to error out if a configuration field is present but
not set.
For example, the following would error out, but after this change only
produces a warning:
```yaml
workspace:
# This is a string field, but if not specified, it ends up being a null.
host:
```
## Tests
Updated the unit tests to match the new behavior.
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
Before this change maps were stored as a regular Go map with string
keys. This didn't let us capture metadata (location information) for map
keys.
To address this, this change replaces the use of the regular Go map with
a dedicated type for a dynamic map. This type stores the `dyn.Value` for
both the key and the value. It uses a map to still allow O(1) lookups
and redirects those into a slice.
## Tests
* All existing unit tests pass (some with minor modifications due to
interface change).
* Equality assertions with `assert.Equal` no longer worked because the
new `dyn.Mapping` persists the order in which keys are set and is
therefore susceptible to map ordering issues. To fix this, I added a
`dynassert` package that forwards all assertions to `testify/assert` but
intercepts equality for `dyn.Value` arguments.
## Changes
While working on #1273, I found that calls to `Append` on a
`dyn.Pattern` were mutating the original slice. This is expected because
appending to a slice will mutate in place if the capacity of the
original slice is large enough. This change updates the `Append` call on
the `dyn.Path` as well to return a newly allocated slice to avoid
inadvertently mutating the originals.
We have existing call sites in the `dyn` package that mutate a
`dyn.Path` (e.g. walk or visit) and these are modified to continue to do
this with a direct call to `append`. Callbacks that use the `dyn.Path`
argument outside of the callback need to make a copy to ensure it isn't
mutated (this is no different from existing semantics).
The `Join` function wasn't used and is removed as part of this change.
## Tests
Unit tests.
## Changes
This change addresses the path resolution behavior in resource
definitions. Previously, all paths were resolved relative to where the
resource was first defined, which could lead to confusion and errors
when paths were specified in different directories. The new behavior is
to resolve paths relative to where they are defined, making it more
intuitive.
However, to avoid breaking existing configurations, compatibility with
the old behavior is maintained.
## Tests
* Existing unit tests for path translation pass.
* Additional test to cover both the nominal and the fallback behavior.
## Changes
We now keep location metadata associated with every configuration value.
When expanding globs for pipeline libraries, this annotation was erased
because of the conversion to/from the typed structure. This change
modifies the expansion mutator to work with `dyn.Value` and retain the
location of the value that holds the glob pattern.
## Tests
Unit tests pass.
## Changes
The new `dyn.Pattern` type represents a path pattern that can match one
or more paths in a configuration tree. Every `dyn.Path` can be converted
to a `dyn.Pattern` that matches only a single path.
To accommodate this change, the visit function needed to be modified to
take a `dyn.Pattern` suffix. Every component in the pattern implements
an interface to work with the visit function. This function can recurse
on the visit function for one or more elements of the value being
visited. For patterns derived from a `dyn.Path`, it will work as it did
before and select the matching element. For the new pattern components
(e.g. `dyn.AnyKey` or `dyn.AnyIndex`), it recurses on all the elements
in the container.
## Tests
Unit tests. Confirmed full coverage for the new code.
## Changes
The `dyn.Path` argument wasn't tested and could regress. Spotted this
while working on related code. Follow up to #1260.
## Tests
Unit tests.
## Changes
This removes the need for the `allowMissingKeyInMap` option to the
private `visit` function and ensures that the body of the visit function
doesn't add or remove values of the configuration it traverses.
This in turn prepares for visiting a path pattern that yields more than
one callback, which doesn't match well with the now-removed option.
## Tests
Unit tests pass and fully cover the inlined code.
## Changes
This change means the callback supplied to `dyn.Foreach` can introspect
the path of the value it is being called for. It also prepares for
allowing visiting path patterns where the exact path is not known
upfront.
## Tests
Unit tests.
## Changes
This change enables the use of bundle variables for boolean, integer,
and floating point fields.
## Tests
* Unit tests.
* I ran a manual test to confirm parameterizing the number of workers in
a cluster definition works.
## Changes
This builds on #1098 and uses the `dyn.Value` representation of the
bundle configuration to generate the Terraform JSON definition of
resources in the bundle.
The existing code (in `BundleToTerraform`) was not great and in an
effort to slightly improve this, I added a package `tfdyn` that includes
dedicated files for each resource type. Every resource type has its own
conversion type that takes the `dyn.Value` of the bundle-side resource
and converts it into Terraform resources (e.g. a job and optionally its
permissions).
Because we now use a `dyn.Value` as input, we can represent and emit
zero-values that have so far been omitted. For example, setting
`num_workers: 0` in your bundle configuration now propagates all the way
to the Terraform JSON definition.
## Tests
* Unit tests for every converter. I reused the test inputs from
`convert_test.go`.
* Equivalence tests in every existing test case checks that the
resulting JSON is identical.
* I manually compared the TF JSON file generated by the CLI from the
main branch and from this PR on all of our bundles and bundle examples
(internal and external) and found the output doesn't change (with the
exception of the odd zero-value being included by the version in this
PR).
## Changes
This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).
Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).
The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.
Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).
Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.
To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).
## Tests
* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
## Changes
When resolving a value returned by the lookup function, the code would
call into `resolveRef` with the key that `resolveKey` was called with.
In doing so, it would cache the _new_ ref under that key.
We fix this by caching ref resolution only at the top level and relying
on lookup caching to avoid duplicate work.
This came up while testing #1098.
## Tests
Unit test.
## Changes
This is a follow-up to #1211 prompted by the addition of a recursive
type in the Go SDK v0.31.0 (`jobs.ForEachTask`).
When populating missing fields with their zero values we must not
inadvertently recurse into a recursive type.
## Tests
New unit test fails with a stack overflow if the fix if the check is
disabled.
## Changes
This feature supports variable lookups in a `dyn.Value` that are present
in the type but haven't been initialized with a value.
For example: `${bundle.git.origin_url}` is present in the `dyn.Value`
only if it was assigned a value. If it wasn't assigned a value it should
resolve to the empty string. This normalization option, when set,
ensures that all fields that are represented in the specified type are
present in the return value.
This change is in support of #1098.
## Tests
Added unit test.
These fields (key and values) needs to be double quoted in order for
yaml loader to read, parse and unmarshal it into Go struct correctly
because these fields are `map[string]string` type.
## Tests
Added regression unit and E2E tests
## Changes
Before this change, any error in a subtree would cause the entire
subtree to be dropped from the output.
This is not ideal when debugging, so instead we drop only the values
that cannot be normalized. Note that this doesn't change behavior if the
caller is properly checking the returned diagnostics for errors.
Note: this includes a change to use `dyn.InvalidValue` as opposed to
`dyn.NilValue` when returning errors.
## Tests
Added unit tests for the case where nested struct, map, or slice
elements contain an error.
## Changes
Not doing this means that the output struct is not a true representation
of the `dyn.Value` and unrepresentable state (e.g. unexported fields)
can be carried over across `convert.ToTyped` calls.
## Tests
Unit tests.
## Changes
This was an issue in cases where the typed structure contains a non-nil
pointer to an empty struct. After conversion to a `dyn.Value` and back
to the typed structure, the pointer became nil.
## Tests
Unit tests.
## Changes
References to keys that themselves are also variable references were
shortcircuited in the previous approach. This meant that certain fields
were resolved even if the lookup function would have instructed to skip
resolution.
To fix this we separate the memoization of resolved variable references
from the memoization of lookups. Now, every variable reference is passed
through the lookup function.
## Tests
Before this change, the new test failed with:
```
=== RUN TestResolveWithSkipEverything
[...]/libs/dyn/dynvar/resolve_test.go:208:
Error Trace: [...]/libs/dyn/dynvar/resolve_test.go:208
Error: Not equal:
expected: "${d} ${c} ${c} ${d}"
actual : "${b} ${a} ${a} ${b}"
Diff:
--- Expected
+++ Actual
@@ -1 +1 @@
-${d} ${c} ${c} ${d}
+${b} ${a} ${a} ${b}
Test: TestResolveWithSkipEverything
```
## Changes
This function could panic when either side of the comparison is a nil or
empty slice. This logic is triggered when comparing the input value to
the output value when calling `dyn.Map`.
## Tests
Unit tests.
## Changes
In the dynamic configuration, the nil value (dyn.NilValue) denotes a
value that should not be serialized, ie a value being nil is the same as
it not existing in the first place.
This is not true for zero values in maps and slices. This PR fixes the
conversion from typed values to dyn.Value, to treat zero values in maps
and slices as zero and not nil.
## Tests
Unit tests
## Changes
This is the `dyn` counterpart to the `bundle/config/interpolation`
package.
It relies on the paths in `${foo.bar}` being valid `dyn.Path` instances.
It leverages `dyn.Walk` to get a complete picture of all variable
references and uses `dyn.Get` to retrieve values pointed to by variable
references.
Depends on #1142.
## Tests
Unit test coverage. I tried to mirror the tests from
`bundle/config/interpolation` and added new ones where applicable (for
example to test type retention of referenced values).
## Changes
This change adds the following functions:
* `dyn.Get(value, "foo.bar") -> (dyn.Value, error)`
* `dyn.Set(value, "foo.bar", newValue) -> (dyn.Value, error)`
* `dyn.Map(value, "foo.bar", func) -> (dyn.Value, error)`
And equivalent functions that take a previously constructed `dyn.Path`:
* `dyn.GetByPath(value, dyn.Path) -> (dyn.Value, error)`
* `dyn.SetByPath(value, dyn.Path, newValue) -> (dyn.Value, error)`
* `dyn.MapByPath(value, dyn.Path, func) -> (dyn.Value, error)`
Changes made by the "set" and "map" functions are never reflected in the
input argument; they return new `dyn.Value` instances for all nodes in
the path leading up to the changed value.
## Tests
New unit tests cover all critical paths.
## Changes
Now it's possible to generate bundle configuration for existing job.
For now it only supports jobs with notebook tasks.
It will download notebooks referenced in the job tasks and generate
bundle YAML config for this job which can be included in larger bundle.
## Tests
Running command manually
Example of generated config
```
resources:
jobs:
job_128737545467921:
name: Notebook job
format: MULTI_TASK
tasks:
- task_key: as_notebook
existing_cluster_id: 0704-xxxxxx-yyyyyyy
notebook_task:
base_parameters:
bundle_root: /Users/andrew.nester@databricks.com/.bundle/job_with_module_imports/development/files
notebook_path: ./entry_notebook.py
source: WORKSPACE
run_if: ALL_SUCCESS
max_concurrent_runs: 1
```
## Tests
Manual (on our last 100 jobs) + added end-to-end test
```
--- PASS: TestAccGenerateFromExistingJobAndDeploy (50.91s)
PASS
coverage: 61.5% of statements in ./...
ok github.com/databricks/cli/internal/bundle 51.209s coverage: 61.5% of
statements in ./...
```
## Changes
The nil value is a real valid value that we need to represent. To
accommodate this we introduced `dyn.KindInvalid` as the zero-value for
`dyn.Kind` (see #904), but did not yet update the comments on
`dyn.NilValue` or add tests for `kind.go`.
This also moves `KindNil` to be last in the definition order (least
likely to care about it).
## Tests
Tests pass.
## Changes
The file `value.go` had a couple `AsZZZ` and `MustZZZ` functions.
This change backfills missing versions and moves all of them to a
separate file.
## Tests
Tests pass; full coverage.
## Changes
The name "dynamic value", or "dyn" for short, is more descriptive than
the opaque "config". Also, it conveniently does not alias with other
packages in the repository, or (popular ones) elsewhere.
(discussed with @andrewnester)
## Tests
n/a