Commit Graph

72 Commits

Author SHA1 Message Date
Andrew Nester 2c4f3e1fd0
addressed feedback 2024-12-10 17:11:41 +01:00
Andrew Nester f5781e2707
Merge branch 'main' into feature/apps 2024-12-10 16:11:20 +01:00
Denis Bilenko 1b2be1b2cb
Add error checking in tests and enable errcheck there (#1980)
## Changes
Fix all errcheck-found issues in tests and test helpers. Mostly this
done by adding require.NoError(t, err), sometimes panic() where t object
is not available).

Initial change is obtained with aider+claude, then manually reviewed and
cleaned up.

## Tests
Existing tests.
2024-12-09 13:56:41 +01:00
Pieter Noordhuis 6e754d4f34
Rewrite 'interface{} -> any' (#1959)
## Changes

The `any` alias for `interface{}` has been around since Go 1.18.

Now that we're using golangci-lint (#1953), we can lint on it.

Existing commits can be updated with:
```
gofmt -w -r 'interface{} -> any' .
```

## Tests

n/a
2024-12-05 15:37:24 +00:00
Andrew Nester d0d875b0db
Merge branch 'main' into feature/apps 2024-12-05 15:43:50 +01:00
Andrew Nester 97a1456038
addressed feedback 2024-12-05 15:40:34 +01:00
shreyas-goenka 2847533e1e
Add DABs support for Unity Catalog volumes (#1762)
## Changes

This PR adds support for UC volumes to DABs.

### Can I use a UC volume managed by DABs in `artifact_path`?

Yes, but we require the volume to exist before being referenced in
`artifact_path`. Otherwise you'll see an error that the volume does not
exist. For this case, this PR also adds a warning if we detect that the
UC volume is defined in the DAB itself, which informs the user to deploy
the UC volume in a separate deployment first before using it in
`artifact_path`.

We cannot create the UC volume and then upload the artifacts to it in
the same `bundle deploy` because `bundle deploy` always uploads the
artifacts to `artifact_path` before materializing any resources defined
in the bundle. Supporting this in a single deployment requires us to
migrate away from our dependency on the Databricks Terraform provider to
manage the CRUD lifecycle of DABs resources.

### Why do we not support `preset.name_prefix` for UC volumes?

UC volumes will not have a `dev_shreyas_goenka` prefix added in `mode:
development`. Configuring `presets.name_prefix` will be a no-op for UC
volumes. We have decided not to support prefixing for UC resources. This
is because:
1. UC provides its own namespace hierarchy that is independent of DABs.
2. Users can always manually use `${workspace.current_user.short_name}`
to configure the prefixes manually.

Customers often manually set up a UC hierarchy for dev and prod,
including a schema or catalog per developer. Thus, it's often
unnecessary for us to add prefixing in `mode: development` by default
for UC resources.

In retrospect, supporting prefixing for UC schemas and registered models
was a mistake and will be removed in a future release of DABs.

## Tests
Unit, integration test, and manually. 

### Manual Testing cases:
 1. UC volume does not exist:
```
➜  bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/my_volume that is configured in the artifact_path: Not Found
```

2. UC Volume does not exist, but is defined in the DAB
```
➜  bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/managed_by_dab that is configured in the artifact_path: Not Found

Warning: You might be using a UC volume in your artifact_path that is managed by this bundle but which has not been deployed yet. Please deploy the UC volume in a separate bundle deploy before using it in the artifact_path.
  at resources.volumes.bar
  in databricks.yml:24:7

```

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-12-02 21:18:07 +00:00
Pieter Noordhuis 1db384018c
Make `TableName` field part of quality monitor schema (#1903)
## Changes

This field was special-cased in #1307 because it's not part of the JSON
payload in the SDK struct.

This approach, while pragmatic, meant it didn't show up in the JSON
schema. While debugging an issue with quality monitors in #1900, I
couldn't figure out why I was getting schema errors on this field, or
how it was passed through to the TF representation. This commit removes
the special case and makes it behave like everything else.

## Tests

* Unit tests pass.
* Confirmed that the updated schema failed validation before this
change.
2024-11-14 17:39:38 +00:00
Pieter Noordhuis fa25b92ba1
Add `libs/dyn/jsonsaver` (#1862)
## Changes

This package can be used to marshal a `dyn.Value` as JSON and retain the
ordering of keys in a mapping. Unlike the default behavior of
`json.Marshal,` the output does not encode HTML characters.

Otherwise, this is no different from using `JSON.Marshal` with
`v.AsAny().`

## Tests

Unit tests.
2024-10-29 15:32:33 +00:00
Pieter Noordhuis dedec58e41
Add behavioral tests for examples from the YAML spec (#1835)
## Changes

I took the examples from https://yaml.org/spec/1.2.2.

The required modifications to the loader are:
* Correctly parse floating point infinities and NaN
* Correctly parse octal numbers per the YAML 1.2 spec
* Treat "null" keys in a map as valid

## Tests

Existing and new unit tests pass.
2024-10-17 13:13:30 +00:00
Pieter Noordhuis e4d039a1aa
Handle normalization of `dyn.KindTime` into an any type (#1836)
## Changes

The issue reported in #1828 illustrates how using a YAML timestamp-like
value (a date in this case) causes an issue during conversion to and
from the typed configuration tree.

We use the `AsAny()` function on the `dyn.Value` when normalizing for
the `any` type. We only use the `any` type for variable values, because
they can assume every type. The `AsAny()` function returns a `time.Time`
for the time value during conversion **to** the typed configuration
tree. Upon conversion **from** the typed configuration tree back into
the dynamic configuration tree, we cannot distinguish a `time.Time`
struct from any other struct.

To address this, we use the underlying string value of the time value
when we normalize for the `any` type.

Fixes #1828.

## Tests

Existing unit tests pass
2024-10-17 10:00:40 +00:00
Andrew Nester f0e2981596
Added JSON input validation for CLI commands (#1771)
## Changes
Added JSON input validation for CLI commands. Now when invalid JSON
passed as a payload to CLI commands, CLI performs input normalisation
and detects if there are any mismatches such as incorrect types, unknown
fields and etc.

This diagnostic information is printed in standard error output and does
not block command execution, so the change is backward compatible.

Fixes #1769 #1764 #1625 #1560


## Tests
Added unit tests

```
andrew.nester@HFW9Y94129 ~ % databricks jobs create --json '{"seeti}'
Error: error decoding JSON at (inline):1:2: unexpected EOF


andrew.nester@HFW9Y94129 ~ % databricks jobs create --json '{"seeti": true}'
Warning: unknown field: seeti
  in (inline):1:9

Error: Job settings must be specified.
```

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-10-11 14:39:53 +00:00
Andrew Nester 66307134c1
Fixed generated YAML missing 'default' for empty values (#1765)
## Changes
Fixed generated YAML missing 'default' for empty values

## Tests
Added unit test
2024-09-11 09:49:58 +00:00
shreyas-goenka 28b39cd3f7
Make bundle JSON schema modular with `$defs` (#1700)
## Changes
This PR makes sweeping changes to the way we generate and test the
bundle JSON schema. The main benefits are:

1. More modular JSON schema. Every definition in the schema now is one
level deep and points to references instead of inlining the entire
schema for a field. This unblocks PyDABs from taking a dependency on the
JSON schema.

2. Generate the JSON schema during CLI code generation. Directly stream
it instead of computing it at runtime whenever a user calls `databricks
bundle schema`. This is nice because we no longer need to embed a
partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()`
method to every struct in the Databricks Go SDK and remove the
dependency on the OpenAPI spec altogether. It'll become more important
once we decouple Go SDK structs and methods from the underlying APIs.

3. Add enum values for Go SDK fields in the JSON schema. Better
autocompletion and validation for these fields. As a follow-up, we can
add enum values for non-Go SDK enums as well (created internal ticket to
track).

4. Use "packageName.structName" as a key to read JSON schemas from the
OpenAPI spec for Go SDK structs. Before, we would use an unrolled
presentation of the JSON schema (stored in `bundle_descriptions.json`),
which was complex to parse and include in the final JSON schema output.
This also means loading values from the OpenAPI spec for `target` schema
works automatically and no longer needs custom code.
5. Support recursive types (eg: `for_each_task`). With us now using
$refs everywhere it's trivial to support.
6. Using complex variables would be invalid according to the schema
generated before this PR. Now that bug is fixed. In the future adding
more custom rules will be easier as well due to the single level nature
of the JSON schema.


Since this is a complete change of approach in how we generate the JSON
schema, there are a few (very minor) regressions worth calling out.
1. We'll lose a few custom descriptions for non Go SDK structs that were
a part of `bundle_descriptions.json`. Support for those can be added in
the future as a followup.
2. Since now the final JSON schema is a static artefact, we lose some
lead time for the signal that JSON schema integration tests are failing.
It's okay though since we have a lot of coverage via the existing unit
tests.

## Tests
Unit tests. End to end tests are being added in this PR:
https://github.com/databricks/cli/pull/1726

Previous unit tests were all deleted because they were bloated. Effort
was made to make the new unit tests provide (almost) equivalent
coverage.
2024-09-10 13:55:18 +00:00
Pieter Noordhuis ceefa80d72
Pass copy of `dyn.Path` to callback function (#1747)
## Changes

Some call sites hold on to the `dyn.Path` provided to them by the
callback. It must therefore never be mutated after the callback returns,
or these mutations leak out into unknown scope.

This change means it is no longer possible for this failure mode to
happen.

## Tests

Unit test.
2024-09-05 11:05:16 +00:00
Pieter Noordhuis 0f4891f0fe
Add `dyn.Time` to box a timestamp with its original string value (#1732)
## Changes

If not explicitly quoted, the YAML loader interprets a value like
`2024-08-29` as a timestamp. Such a value is usually intended to be a
string instead. Our normalization logic was not able to turn a time
value back into the original string.

This change boxes the time value to include its original string
representation. Normalization of one of these values into a string can
now use the original input value.

## Tests

Unit tests in `libs/dyn/convert`.
2024-08-29 13:02:34 +00:00
shreyas-goenka 7ae80de351
Stop tracking file path locations in bundle resources (#1673)
## Changes
Since locations are already tracked in the dynamic value tree, we no
longer need to track it at the resource/artifact level. This PR:
1. Removes use of `paths.Paths`. Uses dyn.Location instead.
2. Refactors the validation of resources not being empty valued to be
generic across all resource types.
  
## Tests
Existing unit tests.
2024-08-13 12:50:15 +00:00
shreyas-goenka a52b188e99
Use dynamic walking to validate unique resource keys (#1614)
## Changes
This PR:
1. Uses dynamic walking (via the `dyn.MapByPattern` func) to validate no
two resources have the same resource key. The allows us to remove this
validation at merge time.
2. Modifies `dyn.Mapping` to always return a sorted slice of pairs. This
makes traversal functions like `dyn.Walk` or `dyn.MapByPattern`
deterministic.

## Tests
Unit tests. Also manually.
2024-07-29 13:04:02 +00:00
shreyas-goenka 37b9df96e6
Support multiple paths for diagnostics (#1616)
## Changes
Some diagnostics can have multiple paths associated with them. For
instance, ensuring that unique resource keys are used across all
resources. This PR extends `diag.Diagnostic` to accept multiple paths.

This PR is symmetrical to
https://github.com/databricks/cli/pull/1610/files

## Tests
Unit tests
2024-07-25 15:16:27 +00:00
shreyas-goenka 4bf88b4209
Support multiple locations for diagnostics (#1610)
## Changes
This PR changes `diag.Diagnostics` to allow including multiple locations
associated with the diagnostic message. The diagnostics that now return
multiple locations with this PR are:
1. Warning for unknown keys in config.
2. Use of experimental.run_as
3. Accidental sync.exludes that exclude all files.

## Tests
Existing unit tests pass. New unit test case to assert on error message
when multiple locations are included.

Example output:
```
➜  bundle-playground-2 ~/cli2/cli/cli bundle validate              
Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC.
  at experimental.use_legacy_run_as
  in resources.yml:10:22
     databricks.yml:13:22

Name: fix run_if
Target: default
Workspace:
  User: shreyas.goenka@databricks.com
  Path: /Users/shreyas.goenka@databricks.com/.bundle/fix run_if/default

Found 1 warning
```
2024-07-23 17:20:11 +00:00
shreyas-goenka 8ed9964482
Track multiple locations associated with a `dyn.Value` (#1510)
## Changes
This PR changes the location metadata associated with a `dyn.Value` to a
slice of locations. This will allow us to keep track of location
metadata across merges and overrides.

The convention is to treat the first location in the slice as the
primary location. Also, the semantics are the same as before if there's
only one location associated with a value, that is:
1. For complex values (maps, sequences) the location of the v1 is
primary in Merge(v1, v2)
2. For primitive values the location of v2 is primary in Merge(v1, v2)

## Tests
Modifying existing merge unit tests. Other existing unit tests and
integration tests pass.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-07-16 11:27:27 +00:00
Gleb Kanterov b9e3c98723
PythonMutator: support omitempty in PyDABs (#1513)
## Changes
PyDABs output can omit empty sequences/mappings because we don't track
them as optional. There is no semantic difference between empty and
missing, which makes omitting correct. CLI detects that we falsely
modify input resources by deleting all empty collections.

To handle that, we extend `dyn.Override` to allow visitors to ignore
certain deletes. If we see that an empty sequence or mapping is deleted,
we revert such delete.

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
2024-07-03 07:22:03 +00:00
Andrew Nester 3d2f7622bc
Fixed bundle not loading when empty variable is defined (#1552)
## Changes
Fixes #1544

## Tests
Added regression test
2024-07-02 12:40:39 +00:00
Pieter Noordhuis da603c6ead
Ignore `dyn.NilValue` when traversing value from `dyn.Map` (#1547)
## Changes

The map function ignores cases where either a key in a map is not
present or an index in a sequence is out of bounds. As of recently, we
retain nil values as valid values in a configuration tree. As such, it
makes sense to also ignore cases where a map or sequence is expected but
nil is found. This is semantically no different from an empty map where
a key is not found.

Without this fix, all calls to `dyn.Map` would need to be updated with
nil-checks at every path component.

Related PRs:
* #1507
* #1511

## Tests

Unit tests pass.
2024-07-01 13:00:31 +00:00
shreyas-goenka 4d8eba04cd
Compare `.Kind()` instead of direct equality checks on a `dyn.Value` (#1520)
## Changes

This PR makes two changes:

1. In https://github.com/databricks/cli/pull/1510 we'll be adding
multiple associated location metadata with a dyn.Value. The Go compiler
does not allow comparing structs if they contain slice values
(presumably due to multiple possible definitions for equality). In
anticipation for adding a `[]dyn.Location` type field to `dyn.Value`
this PR removes all direct comparisons of `dyn.Value` and instead relies
on the kind.

2. Retain location metadata for values in convert.FromTyped. The change
diff is exactly the same as https://github.com/databricks/cli/pull/1523.
It's been combined with this PR because they both depend on each other
to prevent test failures (forming a test failure deadlock).

Go patch used:
```
@@
var x expression
@@
-x == dyn.InvalidValue
+x.Kind() == dyn.KindInvalid

@@
var x expression
@@
-x != dyn.InvalidValue
+x.Kind() != dyn.KindInvalid

@@
var x expression
@@
-x == dyn.NilValue
+x.Kind() == dyn.KindNil

@@
var x expression
@@
-x != dyn.NilValue
+x.Kind() != dyn.KindNil
```
 

## Tests
Unit tests and integration tests pass.
2024-06-27 13:28:19 +00:00
Gleb Kanterov dba6164a4c
merge.Override: Fix handling of dyn.NilValue (#1530)
## Changes
Fix handling of `dyn.NilValue` in `merge.Override` in case `dyn.Value`
has location

## Tests
Unit tests
2024-06-27 09:47:58 +00:00
Andrew Nester 5f42791609
Added support for complex variables (#1467)
## Changes
Added support for complex variables

Now it's possible to add and use complex variables as shown below

```
bundle:
  name: complex-variables

resources:
  jobs:
    my_job:
      job_clusters:
        - job_cluster_key: key
          new_cluster: ${var.cluster}
      tasks:
      - task_key: test
        job_cluster_key: key

variables:
  cluster:
    description: "A cluster definition"
    type: complex
    default:
      spark_version: "13.2.x-scala2.11"
      node_type_id: "Standard_DS3_v2"
      num_workers: 2
      spark_conf:
        spark.speculation: true
        spark.databricks.delta.retentionDurationCheck.enabled: false
```

Fixes #1298

- [x] Support for complex variables
- [x] Allow variable overrides (with shortcut) in targets
- [x] Don't allow to provide complex variables via flag or env variable
- [x] Fail validation if complex value is used but not `type: complex`
provided
- [x] Support using variables inside complex variables 

## Tests
Added unit tests

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-06-26 10:25:32 +00:00
Pieter Noordhuis 482d83cba8
Revert "Retain location metadata for values in `convert.FromTyped`" (#1528)
## Changes

This reverts commit dac5f09556 (#1523).

Retaining the location for nil values means equality checks no longer
pass.

We need #1520 to be merged first.

## Tests

Integration test `TestAccPythonWheelTaskDeployAndRunWithWrapper`.
2024-06-26 09:26:40 +00:00
shreyas-goenka dac5f09556
Retain location metadata for values in `convert.FromTyped` (#1523)
## Changes

There are four different treatments location metadata can receive in the
`convert.FromTyped` method.

1. Location metadata is **retained** for maps, structs and slices if the
value is **not nil**
2. Location metadata is **lost** for maps, structs and slices if the
value is **is nil**
3. Location metadata is **retained** if a scalar type (eg. bool, string
etc) does not change.
4. Location metadata is **lost** if the value for a scalar type changes.

This PR ensures that location metadata is not lost in any case; that is,
it's always preserved.

For (2), this serves as a bug fix so that location information is not
lost on conversion to and from typed for nil values of complex types
(struct, slices, and maps).

For (4) this is a change in semantics. For primitive values modified in
a `typed` mutator, any references to `.Location()` for computed
primitive fields will now return associated YAML location metadata (if
any) instead of an empty location.

While arguable, these semantics are OK since:
1. Situations like these will be rare.
2. Knowing the YAML location (if any) is better than not knowing the
location at all. These locations are typically visible to the user in
errors and warnings.

## Tests

Unit tests
2024-06-25 13:40:21 +00:00
shreyas-goenka 068c7cfc2d
Return `dyn.InvalidValue` instead of `dyn.NilValue` when errors happen (#1514)
## Changes
With https://github.com/databricks/cli/pull/1507 and
https://github.com/databricks/cli/pull/1511 we are clarifying the
semantics associated with `dyn.InvalidValue` and `dyn.NilValue`. An
invalid value is the default zero value and is used to signals the
complete absence of the value.

A nil value, on the other hand, is a valid value for a piece of
configuration and signals explicitly setting a key to nil in the
configuration tree. In keeping with that theme, this PR returns
`dyn.InvalidValue` instead of `dyn.NilValue` at error sites. This change
is not expected to have a material change in behaviour and is being done
to set the right convention since we have well-defined semantics
associated with both `NilValue` and `InvalidValue`.

## Tests
Unit tests and integration tests pass. Also manually scanned the changes
and the associated call sites to verify the `NilValue` value itself was
not being relied upon.
2024-06-21 14:22:42 +00:00
Pieter Noordhuis 446a9d0c52
Properly deal with nil values in `convert.FromTyped` (#1511)
## Changes

When a configuration defines:
```yaml
run_as:
```

It first showed up as `run_as -> nil` in the dynamic configuration only
to later be converted to `run_as -> {}` while going through typed
conversion. We were using the presence of a key to initialize an empty
value. This is incorrect and it should have remained a nil value.

This conversion was happening in `convert.FromTyped` where any struct
always returned a map value. Instead, it should only return a map value
in any one of these cases: 1) the struct has elements, 2) the struct was
originally a map in the dynamic configuration, or 3) the struct was
initialized to a non-empty pointer value.

Stacked on top of #1516 and #1518.

## Tests

* Unit tests pass.
* Integration tests pass.
* Manually ran through bundle CRUD with a bundle without resources.
2024-06-21 13:43:21 +00:00
Pieter Noordhuis 87bc583819
Allow the any type to be set to nil in `convert.FromTyped` (#1518)
## Changes

This came up in integration testing for #1511. One of the tests
converted a `map[string]any` to a dynamic value and encountered a `nil`
and errored out. We can safely return a nil in this case.

## Tests

Unit test passes.
2024-06-21 11:19:48 +00:00
Pieter Noordhuis b2c03ea54c
Use `dyn.InvalidValue` to indicate absence (#1507)
## Changes

Previously, the functions `Get` and `Index` returned `dyn.NilValue` to
indicate that a map key or sequence index wasn't found. This is a valid
value, so we need to differentiate between actual absence and a real
`dyn.NilValue`. We do this with the zero value of a `dyn.Value` (also
captured in the constant `dyn.InvalidValue`).

## Tests

* Unit tests.
* Renamed `Get` and `Index` to find and update all call sites.
2024-06-19 15:24:57 +00:00
Aravind Segu a33d0c8bf9
Add support for Lakehouse monitoring in bundles (#1307)
## Changes

This change adds support for Lakehouse monitoring in bundles.

The associated resource type name is "quality monitor".

## Testing

Unit tests.

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>
2024-05-31 09:42:25 +00:00
shreyas-goenka c5032644a0
Fix conversion of zero valued scalar pointers to a dynamic value (#1433)
## Changes
This PR also fixes empty values variable overrides using the --var flag.
Now, using `--var="my_variable="` will set the value of `my_variable` to
the empty string instead of ignoring the flag altogether.

## Tests
The change using a unit test. Manually verified the `--var` flag works
now.
2024-05-21 11:53:00 +00:00
Gleb Kanterov 09aa3cb9e9
Add more tests for `merge.Override` (#1439)
## Changes
Add test coverage to ensure we respect return value and error

## Tests
Unit tests
2024-05-21 06:48:42 +00:00
Gleb Kanterov 04e56aa472
Add `merge.Override` transform (#1428)
## Changes
Add `merge.Override` transform. It allows the override one `dyn.Value`
with another, preserving source locations for parts of the sub-tree
where nothing has changed. This is different from merging, where values
are concatenated.

`OverrideVisitor` is visiting the changes during the override process
and allows to control of what changes are allowed or update the
effective value.

The primary use case is Python code updating bundle configuration.

During override, we update locations only for changed values. This
allows us to keep track of locations where values were initially defined
and used for error reporting. For instance, merging:

```yaml
resources:               # location=left.yaml:0
  jobs:                  # location=left.yaml:1
    job_0:               # location=left.yaml:2
      name: "job_0"      # location=left.yaml:3
```

with

```yaml
resources:               # location=right.yaml:0
  jobs:                  # location=right.yaml:1
    job_0:               # location=right.yaml:2
      name: "job_0"      # location=right.yaml:3
      description: job 0 # location=right.yaml:4
    job_1:               # location=right.yaml:5
      name: "job_1"      # location=right.yaml:5
```

produces

```yaml
resources:               # location=left.yaml:0
  jobs:                  # location=left.yaml:1
    job_0:               # location=left.yaml:2
      name: "job_0"      # location=left.yaml:3
      description: job 0 # location=right.yaml:4
    job_1:               # location=right.yaml:5
      name: "job_1"      # location=right.yaml:5
```

## Tests
Unit tests
2024-05-17 09:34:39 +00:00
shreyas-goenka 6fd581d173
Allow variable references in non-string fields in the JSON schema (#1398)
## Tests
Verified manually.

Before:
<img width="373" alt="Screenshot 2024-04-24 at 7 18 44 PM"
src="https://github.com/databricks/cli/assets/88374338/b4aef51f-0c16-4589-9d47-cdec9ab91158">

After:
<img width="364" alt="Screenshot 2024-04-24 at 7 18 31 PM"
src="https://github.com/databricks/cli/assets/88374338/3d8e412e-77ee-4641-943d-f99eab26ba02">
<img width="356" alt="Screenshot 2024-04-24 at 7 16 54 PM"
src="https://github.com/databricks/cli/assets/88374338/2aed369a-3c6a-4754-9c76-0969423f319e">

Manually verified the schema diff is sane. Example:
```
<                               "type": "boolean",
<                               "description": "If inference tables are enabled or not. NOTE: If you have already disabled payload logging once, you cannot enable again."
---
>                               "description": "If inference tables are enabled or not. NOTE: If you have already disabled payload logging once, you cannot enable again.",
>                               "anyOf": [
>                                 {
>                                   "type": "boolean"
>                                 },
>                                 {
>                                   "type": "string",
>                                   "pattern": "\\$\\{([a-zA-Z]+([-_]?[a-zA-Z0-9]+)*(\\.[a-zA-Z]+([-_]?[a-zA-Z0-9]+)*)*)\\}"
>                                 }
>                               ]
```
2024-04-25 11:20:45 +00:00
Pieter Noordhuis 77d6820075
Convert between integer and float in normalization (#1371)
## Changes

We currently issue a warning if an integer is used where a floating
point number is expected. But if they are convertible, we should convert
and not issue a warning. This change fixes normalization if they are
convertible between each other. We still produce a warning if the type
conversion leads to a loss in precision.

## Tests

Unit tests pass.
2024-04-17 08:58:07 +00:00
Andrew Nester d914a1b1e2
Do not emit warning on YAML anchor blocks (#1354)
## Changes
In 0.217.0 we started to emit warning on unknown fields in YAML
configuration but wrongly considered YAML anchor blocks as unknown
field.

This PR fixes this by skipping normalising of YAML blocks.

## Tests
Added regression tests
2024-04-10 09:55:02 +00:00
Pieter Noordhuis b4e2645942
Make normalization return warnings instead of errors (#1334)
## Changes

Errors in normalization mean hard failure as of #1319.

We currently allow malformed configurations and ignore the malformed
fields and should continue to do so.

## Tests

* Tests pass.
* No calls to `diag.Errorf` from `libs/dyn`
2024-04-03 11:14:23 +00:00
Pieter Noordhuis a95b1c7dcf
Retain location information of variable reference (#1333)
## Changes

Variable substitution works as if the variable reference is literally
replaced with its contents.

The following fields should be interpreted in the same way regardless of
where the variable is defined:
```yaml
foo: ${var.some_path}
bar: "./${var.some_path}"
```

Before this change, `foo` would inherit the location information of the
variable definition. After this change, it uses the location information
of the variable reference, making the behavior for `foo` and `bar`
identical.

Fixes #1330.

## Tests

The new test passes only with the fix.
2024-04-03 10:40:29 +00:00
Pieter Noordhuis c1963ec0df
Include `dyn.Path` in normalization warnings and errors (#1332)
## Changes

This adds context to warnings and errors. For example:

* Summary: `unknown field bar`
* Location: `foo.yml:6:10`
* Path: `.targets.dev.workspace`

## Tests

Unit tests.
2024-04-03 08:56:46 +00:00
Pieter Noordhuis dca81a40f4
Return warning for nil primitive types during normalization (#1329)
## Changes

It's not necessary to error out if a configuration field is present but
not set.

For example, the following would error out, but after this change only
produces a warning:
```yaml
workspace:
  # This is a string field, but if not specified, it ends up being a null.
  host:
```

## Tests

Updated the unit tests to match the new behavior.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-04-02 12:17:29 +00:00
Pieter Noordhuis 26094f01a0
Define `dyn.Mapping` to represent maps (#1301)
## Changes

Before this change maps were stored as a regular Go map with string
keys. This didn't let us capture metadata (location information) for map
keys.

To address this, this change replaces the use of the regular Go map with
a dedicated type for a dynamic map. This type stores the `dyn.Value` for
both the key and the value. It uses a map to still allow O(1) lookups
and redirects those into a slice.

## Tests

* All existing unit tests pass (some with minor modifications due to
interface change).
* Equality assertions with `assert.Equal` no longer worked because the
new `dyn.Mapping` persists the order in which keys are set and is
therefore susceptible to map ordering issues. To fix this, I added a
`dynassert` package that forwards all assertions to `testify/assert` but
intercepts equality for `dyn.Value` arguments.
2024-03-25 11:01:09 +00:00
Pieter Noordhuis 8255c9d9fb
Make `Append` function to `dyn.Path` return independent slice (#1295)
## Changes

While working on #1273, I found that calls to `Append` on a
`dyn.Pattern` were mutating the original slice. This is expected because
appending to a slice will mutate in place if the capacity of the
original slice is large enough. This change updates the `Append` call on
the `dyn.Path` as well to return a newly allocated slice to avoid
inadvertently mutating the originals.

We have existing call sites in the `dyn` package that mutate a
`dyn.Path` (e.g. walk or visit) and these are modified to continue to do
this with a direct call to `append`. Callbacks that use the `dyn.Path`
argument outside of the callback need to make a copy to ensure it isn't
mutated (this is no different from existing semantics).

The `Join` function wasn't used and is removed as part of this change.

## Tests

Unit tests.
2024-03-19 09:49:26 +00:00
Pieter Noordhuis 7c4b34945c
Rewrite relative paths using `dyn.Location` of the underlying value (#1273)
## Changes

This change addresses the path resolution behavior in resource
definitions. Previously, all paths were resolved relative to where the
resource was first defined, which could lead to confusion and errors
when paths were specified in different directories. The new behavior is
to resolve paths relative to where they are defined, making it more
intuitive.

However, to avoid breaking existing configurations, compatibility with
the old behavior is maintained.

## Tests

* Existing unit tests for path translation pass.
* Additional test to cover both the nominal and the fallback behavior.
2024-03-18 16:23:39 +00:00
Pieter Noordhuis 4a9a12af19
Retain location annotation when expanding globs for pipeline libraries (#1274)
## Changes

We now keep location metadata associated with every configuration value.
When expanding globs for pipeline libraries, this annotation was erased
because of the conversion to/from the typed structure. This change
modifies the expansion mutator to work with `dyn.Value` and retain the
location of the value that holds the glob pattern.

## Tests

Unit tests pass.
2024-03-11 21:59:36 +00:00
Pieter Noordhuis 2453cd49d9
Add `dyn.MapByPattern` to map a function to values with matching paths (#1266)
## Changes

The new `dyn.Pattern` type represents a path pattern that can match one
or more paths in a configuration tree. Every `dyn.Path` can be converted
to a `dyn.Pattern` that matches only a single path.

To accommodate this change, the visit function needed to be modified to
take a `dyn.Pattern` suffix. Every component in the pattern implements
an interface to work with the visit function. This function can recurse
on the visit function for one or more elements of the value being
visited. For patterns derived from a `dyn.Path`, it will work as it did
before and select the matching element. For the new pattern components
(e.g. `dyn.AnyKey` or `dyn.AnyIndex`), it recurses on all the elements
in the container.

## Tests

Unit tests. Confirmed full coverage for the new code.
2024-03-08 14:33:01 +00:00
Pieter Noordhuis c950826ac1
Add assertions for the `dyn.Path` argument to the visit callback (#1265)
## Changes

The `dyn.Path` argument wasn't tested and could regress. Spotted this
while working on related code. Follow up to #1260.

## Tests

Unit tests.
2024-03-08 10:48:40 +00:00