Commit Graph

19 Commits

Author SHA1 Message Date
Pieter Noordhuis a2a4948047
Allow use of variables references in primitive non-string fields (#1219)
## Changes

This change enables the use of bundle variables for boolean, integer,
and floating point fields.

## Tests

* Unit tests.
* I ran a manual test to confirm parameterizing the number of workers in
a cluster definition works.
2024-02-19 10:44:51 +00:00
Pieter Noordhuis f70ec359dc
Use `dyn.Value` as input to generating Terraform JSON (#1218)
## Changes

This builds on #1098 and uses the `dyn.Value` representation of the
bundle configuration to generate the Terraform JSON definition of
resources in the bundle.

The existing code (in `BundleToTerraform`) was not great and in an
effort to slightly improve this, I added a package `tfdyn` that includes
dedicated files for each resource type. Every resource type has its own
conversion type that takes the `dyn.Value` of the bundle-side resource
and converts it into Terraform resources (e.g. a job and optionally its
permissions).

Because we now use a `dyn.Value` as input, we can represent and emit
zero-values that have so far been omitted. For example, setting
`num_workers: 0` in your bundle configuration now propagates all the way
to the Terraform JSON definition.

## Tests

* Unit tests for every converter. I reused the test inputs from
`convert_test.go`.
* Equivalence tests in every existing test case checks that the
resulting JSON is identical.
* I manually compared the TF JSON file generated by the CLI from the
main branch and from this PR on all of our bundles and bundle examples
(internal and external) and found the output doesn't change (with the
exception of the odd zero-value being included by the version in this
PR).
2024-02-16 20:54:38 +00:00
Pieter Noordhuis 87dd46a3f8
Use dynamic configuration model in bundles (#1098)
## Changes

This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).

Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).

The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.

Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).

Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.

To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).

## Tests

* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
2024-02-16 19:41:58 +00:00
Pieter Noordhuis 5f59572cb3
Fix issue where interpolating a new ref would rewrite unrelated fields (#1217)
## Changes

When resolving a value returned by the lookup function, the code would
call into `resolveRef` with the key that `resolveKey` was called with.
In doing so, it would cache the _new_ ref under that key.

We fix this by caching ref resolution only at the top level and relying
on lookup caching to avoid duplicate work.

This came up while testing #1098.

## Tests

Unit test.
2024-02-16 16:19:40 +00:00
Pieter Noordhuis ea8daf1f97
Avoid infinite recursion when normalizing a recursive type (#1213)
## Changes

This is a follow-up to #1211 prompted by the addition of a recursive
type in the Go SDK v0.31.0 (`jobs.ForEachTask`).

When populating missing fields with their zero values we must not
inadvertently recurse into a recursive type.

## Tests

New unit test fails with a stack overflow if the fix if the check is
disabled.
2024-02-16 12:56:02 +00:00
Pieter Noordhuis 18166f5b47
Add option to include fields present in the type but not in the value (#1211)
## Changes

This feature supports variable lookups in a `dyn.Value` that are present
in the type but haven't been initialized with a value.

For example: `${bundle.git.origin_url}` is present in the `dyn.Value`
only if it was assigned a value. If it wasn't assigned a value it should
resolve to the empty string. This normalization option, when set,
ensures that all fields that are represented in the specified type are
present in the return value.

This change is in support of #1098.

## Tests

Added unit test.
2024-02-15 15:16:40 +00:00
Andrew Nester e474948a4b
Generate correct YAML if custom_tags or spark_conf is used for pipeline or job cluster configuration (#1210)
These fields (key and values) needs to be double quoted in order for
yaml loader to read, parse and unmarshal it into Go struct correctly
because these fields are `map[string]string` type.

## Tests
Added regression unit and E2E tests
2024-02-15 15:03:19 +00:00
Pieter Noordhuis aa0c715930
Retain partially valid structs in `convert.Normalize` (#1203)
## Changes

Before this change, any error in a subtree would cause the entire
subtree to be dropped from the output.

This is not ideal when debugging, so instead we drop only the values
that cannot be normalized. Note that this doesn't change behavior if the
caller is properly checking the returned diagnostics for errors.

Note: this includes a change to use `dyn.InvalidValue` as opposed to
`dyn.NilValue` when returning errors.

## Tests

Added unit tests for the case where nested struct, map, or slice
elements contain an error.
2024-02-13 14:12:19 +00:00
Pieter Noordhuis 0b5fdcc346
Zero destination struct in `convert.ToTyped` (#1178)
## Changes

Not doing this means that the output struct is not a true representation
of the `dyn.Value` and unrepresentable state (e.g. unexported fields)
can be carried over across `convert.ToTyped` calls.

## Tests

Unit tests.
2024-02-07 09:25:53 +00:00
Pieter Noordhuis dcb9c85201
Empty struct should yield empty map in `convert.FromTyped` (#1177)
## Changes

This was an issue in cases where the typed structure contains a non-nil
pointer to an empty struct. After conversion to a `dyn.Value` and back
to the typed structure, the pointer became nil.

## Tests

Unit tests.
2024-02-07 09:25:07 +00:00
Pieter Noordhuis f54e790a3b
Ensure every variable reference is passed to lookup function (#1176)
## Changes

References to keys that themselves are also variable references were
shortcircuited in the previous approach. This meant that certain fields
were resolved even if the lookup function would have instructed to skip
resolution.

To fix this we separate the memoization of resolved variable references
from the memoization of lookups. Now, every variable reference is passed
through the lookup function.

## Tests

Before this change, the new test failed with:
```
=== RUN   TestResolveWithSkipEverything
    [...]/libs/dyn/dynvar/resolve_test.go:208: 
        	Error Trace:	[...]/libs/dyn/dynvar/resolve_test.go:208
        	Error:      	Not equal: 
        	            	expected: "${d} ${c} ${c} ${d}"
        	            	actual  : "${b} ${a} ${a} ${b}"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-${d} ${c} ${c} ${d}
        	            	+${b} ${a} ${a} ${b}
        	Test:       	TestResolveWithSkipEverything
```
2024-02-06 15:01:49 +00:00
Pieter Noordhuis 20e45b87ae
Harden `dyn.Value` equality check (#1173)
## Changes

This function could panic when either side of the comparison is a nil or
empty slice. This logic is triggered when comparing the input value to
the output value when calling `dyn.Map`.

## Tests

Unit tests.
2024-02-05 16:54:41 +00:00
shreyas-goenka 6beda4405e
Fix dynamic representation of zero values in maps and slices (#1154)
## Changes
In the dynamic configuration, the nil value (dyn.NilValue) denotes a
value that should not be serialized, ie a value being nil is the same as
it not existing in the first place.

This is not true for zero values in maps and slices. This PR fixes the
conversion from typed values to dyn.Value, to treat zero values in maps
and slices as zero and not nil.

## Tests
Unit tests
2024-01-31 14:25:13 +00:00
Pieter Noordhuis 14abcb3ad7
Add `dynvar` package for variable resolution with a `dyn.Value` tree (#1143)
## Changes

This is the `dyn` counterpart to the `bundle/config/interpolation`
package.

It relies on the paths in `${foo.bar}` being valid `dyn.Path` instances.
It leverages `dyn.Walk` to get a complete picture of all variable
references and uses `dyn.Get` to retrieve values pointed to by variable
references.

Depends on #1142.

## Tests

Unit test coverage. I tried to mirror the tests from
`bundle/config/interpolation` and added new ones where applicable (for
example to test type retention of referenced values).
2024-01-24 18:49:06 +00:00
Pieter Noordhuis ff6e0354b9
Add functionality to visit values in `dyn.Value` tree (#1142)
## Changes

This change adds the following functions:
* `dyn.Get(value, "foo.bar") -> (dyn.Value, error)`
* `dyn.Set(value, "foo.bar", newValue) -> (dyn.Value, error)`
* `dyn.Map(value, "foo.bar", func) -> (dyn.Value, error)`

And equivalent functions that take a previously constructed `dyn.Path`:
* `dyn.GetByPath(value, dyn.Path) -> (dyn.Value, error)`
* `dyn.SetByPath(value, dyn.Path, newValue) -> (dyn.Value, error)`
* `dyn.MapByPath(value, dyn.Path, func) -> (dyn.Value, error)`

Changes made by the "set" and "map" functions are never reflected in the
input argument; they return new `dyn.Value` instances for all nodes in
the path leading up to the changed value.

## Tests

New unit tests cover all critical paths.
2024-01-24 18:38:46 +00:00
Andrew Nester 70fe0e36ef
Added `databricks bundle generate job` command (#1043)
## Changes
Now it's possible to generate bundle configuration for existing job.
For now it only supports jobs with notebook tasks.

It will download notebooks referenced in the job tasks and generate
bundle YAML config for this job which can be included in larger bundle.

## Tests
Running command manually

Example of generated config
```
resources:
  jobs:
    job_128737545467921:
      name: Notebook job
      format: MULTI_TASK
      tasks:
        - task_key: as_notebook
          existing_cluster_id: 0704-xxxxxx-yyyyyyy
          notebook_task:
            base_parameters:
              bundle_root: /Users/andrew.nester@databricks.com/.bundle/job_with_module_imports/development/files
            notebook_path: ./entry_notebook.py
            source: WORKSPACE
          run_if: ALL_SUCCESS
      max_concurrent_runs: 1
 ```

## Tests
Manual (on our last 100 jobs) + added end-to-end test

```
--- PASS: TestAccGenerateFromExistingJobAndDeploy (50.91s)
PASS
coverage: 61.5% of statements in ./...
ok github.com/databricks/cli/internal/bundle 51.209s coverage: 61.5% of
statements in ./...
```
2024-01-17 14:26:33 +00:00
Pieter Noordhuis d8a64e6617
Define constant for the invalid `dyn.Value` (#1101)
## Changes

The nil value is a real valid value that we need to represent. To
accommodate this we introduced `dyn.KindInvalid` as the zero-value for
`dyn.Kind` (see #904), but did not yet update the comments on
`dyn.NilValue` or add tests for `kind.go`.

This also moves `KindNil` to be last in the definition order (least
likely to care about it).

## Tests

Tests pass.
2024-01-05 13:02:04 +00:00
Pieter Noordhuis bae220d1bc
Consolidate functions to convert `dyn.Value` to native types (#1100)
## Changes

The file `value.go` had a couple `AsZZZ` and `MustZZZ` functions.
This change backfills missing versions and moves all of them to a
separate file.

## Tests

Tests pass; full coverage.
2024-01-05 12:06:12 +00:00
Pieter Noordhuis 938eb1600c
Rename libs/config -> libs/dyn (#1086)
## Changes

The name "dynamic value", or "dyn" for short, is more descriptive than
the opaque "config". Also, it conveniently does not alias with other
packages in the repository, or (popular ones) elsewhere.

(discussed with @andrewnester)

## Tests

n/a
2023-12-22 13:20:45 +00:00