## Changes
We now keep location metadata associated with every configuration value.
When expanding globs for pipeline libraries, this annotation was erased
because of the conversion to/from the typed structure. This change
modifies the expansion mutator to work with `dyn.Value` and retain the
location of the value that holds the glob pattern.
## Tests
Unit tests pass.
## Changes
The databricks terraform provider does not allow changing permission of
the current user. Instead, the current identity is implictly set to be
the owner of all resources on the platform side.
This PR introduces a mutator to filter permissions from the bundle
configuration at deploy time, allowing users to define permissions for
their own identities in their bundle config.
This would allow configurations like, allowing both alice and bob to
collaborate on the same DAB:
```
permissions:
level: CAN_MANAGE
user_name: alice
level: CAN_MANAGE
user_name: bob
```
This PR is a reincarnation of
https://github.com/databricks/cli/pull/1145. The earlier attempt had to
be reverted due to metadata loss converting to and from the dynamic
configuration representation (reverted here:
https://github.com/databricks/cli/pull/1179)
## Tests
Unit test and manually
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.15.0 to
0.16.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="766dc5df63"><code>766dc5d</code></a>
modfile: use new go version string format in WorkFile.add error</li>
<li>See full diff in <a
href="https://github.com/golang/mod/compare/v0.15.0...v0.16.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/mod&package-manager=go_modules&previous-version=0.15.0&new-version=0.16.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [golang.org/x/oauth2](https://github.com/golang/oauth2) from
0.17.0 to 0.18.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="85231f99d6"><code>85231f9</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="34a7afaa85"><code>34a7afa</code></a>
google/externalaccount: add Config.UniverseDomain</li>
<li><a
href="95bec95381"><code>95bec95</code></a>
google/externalaccount: moves externalaccount package out of internal
and exp...</li>
<li>See full diff in <a
href="https://github.com/golang/oauth2/compare/v0.17.0...v0.18.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/oauth2&package-manager=go_modules&previous-version=0.17.0&new-version=0.18.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Changes
The new `dyn.Pattern` type represents a path pattern that can match one
or more paths in a configuration tree. Every `dyn.Path` can be converted
to a `dyn.Pattern` that matches only a single path.
To accommodate this change, the visit function needed to be modified to
take a `dyn.Pattern` suffix. Every component in the pattern implements
an interface to work with the visit function. This function can recurse
on the visit function for one or more elements of the value being
visited. For patterns derived from a `dyn.Path`, it will work as it did
before and select the matching element. For the new pattern components
(e.g. `dyn.AnyKey` or `dyn.AnyIndex`), it recurses on all the elements
in the container.
## Tests
Unit tests. Confirmed full coverage for the new code.
## Changes
The `dyn.Path` argument wasn't tested and could regress. Spotted this
while working on related code. Follow up to #1260.
## Tests
Unit tests.
## Changes
This removes the need for the `allowMissingKeyInMap` option to the
private `visit` function and ensures that the body of the visit function
doesn't add or remove values of the configuration it traverses.
This in turn prepares for visiting a path pattern that yields more than
one callback, which doesn't match well with the now-removed option.
## Tests
Unit tests pass and fully cover the inlined code.
## Changes
This change means the callback supplied to `dyn.Foreach` can introspect
the path of the value it is being called for. It also prepares for
allowing visiting path patterns where the exact path is not known
upfront.
## Tests
Unit tests.
## Changes
With the current template, we can't execute the Python file and the jobs
notebook using DBConnect from VSCode because we import `from pyspark.sql
import SparkSession`, which doesn't support Databricks unified auth.
This PR fixes this by passing spark into the library code and by
explicitly instantiating a spark session where the spark global is not
available.
Other changes:
* add auto-reload to notebooks
* add DLT typings for code completion
Check if `bundle.tf.json` doesn't exist and create it before executing
`terraform init` (inside `terraform.Load`)
Fixes a problem when during `terraform.Load` it fails with:
```
Error: Failed to load plugin schemas
Error while loading schemas for plugin components: Failed to obtain provider
schema: Could not load the schema for provider
registry.terraform.io/databricks/databricks: failed to instantiate provider
"registry.terraform.io/databricks/databricks" to obtain schema: unavailable
provider "registry.terraform.io/databricks/databricks"..
```
## Changes
Upgrade Terraform provider to 1.37.0
Currently we're using 1.36.2 version which uses Go SDK 0.30 which does
not have U2M enabled for all clouds.
Upgrading to 1.37.0 allows TF provider (and thus DABs) to use U2M
Fixes#1231
## Changes
Fixes an issue when `compute_id` is defined in the bundle config,
correctly replaced in `validate` command but not used in `deploy`
command
## Tests
Manually
## Changes
Currently, when the CLI run a list API call (like list jobs), it uses
the `List*All` methods from the SDK, which list all resources in the
collection. This is very slow for large collections: if you need to list
all jobs from a workspace that has 10,000+ jobs, you'll be waiting for
at least 100 RPCs to complete before seeing any output.
Instead of using List*All() methods, the SDK recently added an iterator
data structure that allows traversing the collection without needing to
completely list it first. New pages are fetched lazily if the next
requested item belongs to the next page. Using the List() methods that
return these iterators, the CLI can proactively print out some of the
response before the complete collection has been fetched.
This involves a pretty major rewrite of the rendering logic in `cmdio`.
The idea there is to define custom rendering logic based on the type of
the provided resource. There are three renderer interfaces:
1. textRenderer: supports printing something in a textual format (i.e.
not JSON, and not templated).
2. jsonRenderer: supports printing something in a pretty-printed JSON
format.
3. templateRenderer: supports printing something using a text template.
There are also three renderer implementations:
1. readerRenderer: supports printing a reader. This only implements the
textRenderer interface.
2. iteratorRenderer: supports printing a `listing.Iterator` from the Go
SDK. This implements jsonRenderer and templateRenderer, buffering 20
resources at a time before writing them to the output.
3. defaultRenderer: supports printing arbitrary resources (the previous
implementation).
Callers will either use `cmdio.Render()` for rendering individual
resources or `io.Reader` or `cmdio.RenderIterator()` for rendering an
iterator. This separate method is needed to safely be able to match on
the type of the iterator, since Go does not allow runtime type matches
on generic types with an existential type parameter.
One other change that needs to happen is to split the templates used for
text representation of list resources into a header template and a row
template. The template is now executed multiple times for List API
calls, but the header should only be printed once. To support this, I
have added `headerTemplate` to `cmdIO`, and I have also changed
`RenderWithTemplate` to include a `headerTemplate` parameter everywhere.
## Tests
- [x] Unit tests for text rendering logic
- [x] Unit test for reflection-based iterator construction.
---------
Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
## Changes
```
shreyas.goenka@THW32HFW6T cli % databricks fs -h
Commands to do file system operations on DBFS and UC Volumes.
Usage:
databricks fs [command]
Available Commands:
cat Show file content.
cp Copy files and directories.
ls Lists files.
mkdir Make directories.
rm Remove files and directories.
```
This PR adds support for UC Volumes to the fs commands. The fs commands
for UC volumes work the same as they currently do for DBFS. This is
ensured by running the same test matrix we across both DBFS and UC
Volumes versions of the fs commands.
## Tests
Support for UC volumes is tested by running the same tests as we did
originally for DBFS commands. The tests require a `main` catalog to
exist in the workspace, which does in our test workspaces environments
which have the `TEST_METASTORE_ID` environment variable set.
For the Files API filer, we do the same by running mostly common tests
to ensure the filers for "local", "wsfs", "dbfs" and "files API" are
consistent.
The tests are also made to all run in parallel to reduce the time taken.
To ensure the separation of the tests, each test creates its own UC
schema (for UC volumes tests) or DBFS directories (for DBFS tests).
## Changes
This adds a `default-sql` template!
In this latest revision, I've hidden the new template from the list so
we can merge it, iterate over it, and properly release the template at
the right time.
- [x] WorkspaceFS support for .sql files is in prod
- [x] SQL extension is preconfigured based on extension settings (if
possible)
- [ ] Streaming tables support is either ungated or the template
provides instructions about signup
- _Mitigation for now: this template is hidden from the list of
templates._
- [x] Support non-UC workspaces
## Tests
- [x] Unit tests
- [x] Manual testing
- [x] More manual testing
- [x] Reviewer testing
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
## Changes
This change enables the use of bundle variables for boolean, integer,
and floating point fields.
## Tests
* Unit tests.
* I ran a manual test to confirm parameterizing the number of workers in
a cluster definition works.
## Changes
This adds a new dbt-sql template. This work requires the new WorkspaceFS
support for dbt tasks.
In this latest revision, I've hidden the new template from the list so
we can merge it, iterate over it, and propertly release the template at
the right time.
Blockers:
- [x] WorkspaceFS support for dbt projects is in prod
- [x] Move dbt files into a subdirectory
- [ ] Wait until the next (>1.7.4) release of the dbt plugin which will
have major improvements!
- _Rather than wait, this template is hidden from the list of
templates._
- [x] SQL extension is preconfigured based on extension settings (if
possible)
- MV / streaming tables:
- [x] Add to template
- [x] Fix https://github.com/databricks/dbt-databricks/issues/535 (to be
released with in 1.7.4)
- [x] Merge https://github.com/databricks/dbt-databricks/pull/338 (to be
released with in 1.7.4)
- [ ] Fix "too many 503 errors" issue
(https://github.com/databricks/dbt-databricks/issues/570, internal
tracker: ES-1009215, ES-1014138)
- [x] Support ANSI mode in the template
- [ ] Streaming tables support is either ungated or the template
provides instructions about signup
- _Mitigation for now: this template is hidden from the list of
templates._
- [x] Support non-workspace-admin deployment
- [x] Make sure `data_security_mode: SINGLE_USER` works on non-UC
workspaces (it's required to be explicitly specified on UC workspaces
with single-node clusters)
- [x] Support non-UC workspaces
## Tests
- [x] Unit tests
- [x] Manual testing
- [x] More manual testing
- [ ] Reviewer manual testing
- _I'd like to do a small bug bash post-merging._
- [x] Unit tests
## Changes
This builds on #1098 and uses the `dyn.Value` representation of the
bundle configuration to generate the Terraform JSON definition of
resources in the bundle.
The existing code (in `BundleToTerraform`) was not great and in an
effort to slightly improve this, I added a package `tfdyn` that includes
dedicated files for each resource type. Every resource type has its own
conversion type that takes the `dyn.Value` of the bundle-side resource
and converts it into Terraform resources (e.g. a job and optionally its
permissions).
Because we now use a `dyn.Value` as input, we can represent and emit
zero-values that have so far been omitted. For example, setting
`num_workers: 0` in your bundle configuration now propagates all the way
to the Terraform JSON definition.
## Tests
* Unit tests for every converter. I reused the test inputs from
`convert_test.go`.
* Equivalence tests in every existing test case checks that the
resulting JSON is identical.
* I manually compared the TF JSON file generated by the CLI from the
main branch and from this PR on all of our bundles and bundle examples
(internal and external) and found the output doesn't change (with the
exception of the odd zero-value being included by the version in this
PR).
## Changes
This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).
Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).
The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.
Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).
Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.
To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).
## Tests
* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
## Changes
When resolving a value returned by the lookup function, the code would
call into `resolveRef` with the key that `resolveKey` was called with.
In doing so, it would cache the _new_ ref under that key.
We fix this by caching ref resolution only at the top level and relying
on lookup caching to avoid duplicate work.
This came up while testing #1098.
## Tests
Unit test.
## Changes
This is a follow-up to #1211 prompted by the addition of a recursive
type in the Go SDK v0.31.0 (`jobs.ForEachTask`).
When populating missing fields with their zero values we must not
inadvertently recurse into a recursive type.
## Tests
New unit test fails with a stack overflow if the fix if the check is
disabled.
## Changes
We plan to use the any-equivalent of a `dyn.Value` such that we can use
variable references for non-string fields (e.g.
`${databricks_job.some_job.id}` where an integer is expected), as well
as properly emit zero values for primitive types (e.g. 0 for integers or
false for booleans).
This change is in preparation for the above.
## Tests
Unit tests.
## Changes
* Update `go.mod` with latest dependencies
* Update `go.mod` to require Go 1.21 to match root `go.mod`
* Regenerate structs for Terraform provider v1.36.2
## Tests
n/a
## Changes
This feature supports variable lookups in a `dyn.Value` that are present
in the type but haven't been initialized with a value.
For example: `${bundle.git.origin_url}` is present in the `dyn.Value`
only if it was assigned a value. If it wasn't assigned a value it should
resolve to the empty string. This normalization option, when set,
ensures that all fields that are represented in the specified type are
present in the return value.
This change is in support of #1098.
## Tests
Added unit test.
These fields (key and values) needs to be double quoted in order for
yaml loader to read, parse and unmarshal it into Go struct correctly
because these fields are `map[string]string` type.
## Tests
Added regression unit and E2E tests
## Changes
Added `bundle deployment bind` and `unbind` command.
This command allows to bind bundle-defined resources to existing
resources in Databricks workspace so they become DABs-managed.
## Tests
Manually + added E2E test
## Changes
The OpenAPI spec used to generate the CLI doesn't match the version used
for the SDK version that the CLI currently depends on. This PR
regenerates the CLI based on the same version of the OpenAPI spec used
by the SDK on v0.30.1.
## Tests
<!-- How is this tested? -->
## Changes
Bundle schema generation does not support recursive API fields. This PR
skips generation for for_each_task until we add proper support for
recursive types in the bundle schema.
## Tests
Manually. This fixes the generation of the CLI and the bundle schema
command works as expected, with the sub-schema for `for_each_task` being
set to null in the output.
```
"for_each_task": null,
```