## Changes
The integration test for `fs ls` tried to produce completions for the
temporary directory without a trailing slash. The command will list the
parent directory to see if there are more directories with the same
prefix that are also valid completions. The test intends to test
completions for files inside the temporary directory, so we need that
trailing slash.
This test was hanging because the parent directory of the temporary DBFS
directory has more than 1k entries (on our integration testing
workspaces) and the line buffer in the test runner has a capacity of 1k
lines. To avoid the same hang in the future, this change modifies the
test runner to panic if the line buffer is full.
## Tests
Confirmed the integration test no longer hangs.
## Changes
This PR adds autocomplete for cat, cp, ls, mkdir and rm.
The new completer can do completion for any `Filer`. The command
completion for the `sync` command can be moved to use this general
completer as a follow-up.
## Tests
- Tested manually against a workspace
- Unit tests
## Changes
This didn't work as expected because the generic build mutator called
into the type-specific build mutator in the middle of the function. This
invalidated the `config.Artifact` pointer that was being mutated later
on, effectively hiding these mutations from its caller.
To fix this, I turned glob expansion into its own mutator. It now works
as expected, _and_ produces better errors if the glob patterns are
invalid or do not match files.
## Tests
Unit tests.
Manual verification:
```
% databricks bundle deploy
Building sbt_example...
Error: target/scala-2.12/sbt-e[xam22ple*.jar: syntax error in pattern
at artifacts.sbt_example.files[1].source
in databricks.yml:15:17
```
## Changes
This change enables overriding the default value of job parameters in
target overrides.
This is the same approach we already take for job clusters and job
tasks.
Closes#1620.
## Tests
Mutator unit tests and lightweight end-to-end tests.
## Changes
In https://github.com/databricks/cli/pull/1618 we introduced prepare
step in which Python wheel folder was cleaned. Now it was cleaned
everytime instead of only when there is a build command how it is used
to work.
This PR fixes it by only cleaning up dist folder when there is a build
command for wheels.
Fixes#1638
## Tests
Added regression test
Bumps [golang.org/x/oauth2](https://github.com/golang/oauth2) from
0.21.0 to 0.22.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="6d8340f1c5"><code>6d8340f</code></a>
LICENSE: update per Google Legal</li>
<li>See full diff in <a
href="https://github.com/golang/oauth2/compare/v0.21.0...v0.22.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/oauth2&package-manager=go_modules&previous-version=0.21.0&new-version=0.22.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [golang.org/x/mod](https://github.com/golang/mod) from 0.19.0 to
0.20.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="bc151c4e8c"><code>bc151c4</code></a>
README: fix link to x/tools</li>
<li><a
href="d1f873e3c1"><code>d1f873e</code></a>
modfile: fix Cleanup clobbering Line reference</li>
<li><a
href="b56a28f8bd"><code>b56a28f</code></a>
modfile: Add support for tool lines</li>
<li><a
href="79169e9d36"><code>79169e9</code></a>
LICENSE: update per Google Legal</li>
<li>See full diff in <a
href="https://github.com/golang/mod/compare/v0.19.0...v0.20.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/mod&package-manager=go_modules&previous-version=0.19.0&new-version=0.20.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [golang.org/x/sync](https://github.com/golang/sync) from 0.7.0 to
0.8.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="411f99ef12"><code>411f99e</code></a>
LICENSE: update per Google Legal</li>
<li>See full diff in <a
href="https://github.com/golang/sync/compare/v0.7.0...v0.8.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/sync&package-manager=go_modules&previous-version=0.7.0&new-version=0.8.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Changes
A new Service Control Policy has removed the `ec2.RunInstances`
permission from our service principal for our AWS integration tests.
This PR switches over to using the instance pool which does not require
creating new clusters.
## Tests
The integration tests pass now.
# Changes
With https://github.com/databricks/cli/pull/1413 we started to compute
and partially print the plan if it contained deletion of UC schemas.
This PR uses the precomputed plan to avoid double planning when actually
doing the terraform plan.
This fixes a performance regression introduced in
https://github.com/databricks/cli/pull/1413.
# Tests
Tested manually.
1. Verified bundle deployment still works and deploys resources.
2. Verified that the precomputed plan is indeed being used by attaching
a debugger and removing the plan file right before the terraform apply
process is spawned and asserting that terraform apply fails because the
plan is not found.
## Changes
This PR adds support for UC Schemas to DABs. This allows users to define
schemas for tables and other assets their pipelines/workflows create as
part of the DAB, thus managing the life-cycle in the DAB.
The first version has a couple of intentional limitations:
1. The owner of the schema will be the deployment user. Changing the
owner of the schema is not allowed (yet). `run_as` will not be
restricted for DABs containing UC schemas. Let's limit the scope of
run_as to the compute identity used instead of ownership of data assets
like UC schemas.
2. API fields that are present in the update API but not the create API.
For example: enabling predictive optimization is not supported in the
create schema API and thus is not available in DABs at the moment.
## Tests
Manually and integration test. Manually verified the following work:
1. Development mode adds a "dev_" prefix.
2. Modified status is correctly computed in the `bundle summary`
command.
3. Grants work as expected, for assigning privileges.
4. Variable interpolation works for the schema ID.
## Changes
Add upgrade and upgrade eager flags to pip install call for Databricks
labs projects. See [this
documentation](https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-U)
for more information about the flags.
Resolves#1634
## Tests
- [x] Manually
## Changes
This PR:
1. Uses dynamic walking (via the `dyn.MapByPattern` func) to validate no
two resources have the same resource key. The allows us to remove this
validation at merge time.
2. Modifies `dyn.Mapping` to always return a sorted slice of pairs. This
makes traversal functions like `dyn.Walk` or `dyn.MapByPattern`
deterministic.
## Tests
Unit tests. Also manually.
## Changes
Some diagnostics can have multiple paths associated with them. For
instance, ensuring that unique resource keys are used across all
resources. This PR extends `diag.Diagnostic` to accept multiple paths.
This PR is symmetrical to
https://github.com/databricks/cli/pull/1610/files
## Tests
Unit tests
The install script might require the up-to-date Python dependencies,
explained in more detail in the referenced issue below
Fixes#1623
## Tests
! Need support with testing !
## Changes
Right now we ask users for two confirmations when destroying a bundle.
One to destroy the resources and one to delete the files. This PR
consolidates the two prompts into one.
## Tests
Manually
Destroying a bundle with no resources:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
All files and directories at the following location will be deleted: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default
Would you like to proceed? [y/n]: y
No resources to destroy
Updating deployment state...
Deleting files...
Destroy complete!
```
Destroying a bundle with no remote state:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
No active deployment found to destroy!
```
When a user cancells a deployment:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
The following resources will be deleted:
delete job job_1
delete job job_2
delete pipeline foo
All files and directories at the following location will be deleted: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default
Would you like to proceed? [y/n]: n
Destroy cancelled!
```
When a user destroys resources:
```
➜ bundle-playground git:(master) ✗ cli bundle destroy
The following resources will be deleted:
delete job job_1
delete job job_2
delete pipeline foo
All files and directories at the following location will be deleted: /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default
Would you like to proceed? [y/n]: y
Updating deployment state...
Deleting files...
Destroy complete!
```
## Changes
Now prepare stage which does cleanup is execute once before every build,
so artifacts built into the same folder are correctly kept
Fixes workaround 2 from this issue #1602
## Tests
Added unit test
## Changes
This PR changes `diag.Diagnostics` to allow including multiple locations
associated with the diagnostic message. The diagnostics that now return
multiple locations with this PR are:
1. Warning for unknown keys in config.
2. Use of experimental.run_as
3. Accidental sync.exludes that exclude all files.
## Tests
Existing unit tests pass. New unit test case to assert on error message
when multiple locations are included.
Example output:
```
➜ bundle-playground-2 ~/cli2/cli/cli bundle validate
Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC.
at experimental.use_legacy_run_as
in resources.yml:10:22
databricks.yml:13:22
Name: fix run_if
Target: default
Workspace:
User: shreyas.goenka@databricks.com
Path: /Users/shreyas.goenka@databricks.com/.bundle/fix run_if/default
Found 1 warning
```
## Changes
Add support for google/uuid.New() to DAB templates.
This is needed to generate UUIDs in downstream templates like MLOps
Stacks.
## Tests
Unit tests.
## Changes
By default, construct a read/write instance. If constructed in read-only
mode, the underlying filer is wrapped in a readahead cache.
## Tests
* Filer integration tests pass.
* Manual test that caching is enabled when running on WSFS.
## Changes
DABs deployments should be isolated if `root_path` and workspace host
are different. This PR fixes a bug where local terraform state gets
piggybacked if the same cwd is used to deploy two isolated deployments
for the same bundle target. This can happen if:
1. A user switches to a different identity on the same machine.
2. The workspace host URL the bundle/target points to is changed.
3. A user changes the `root_path` while doing bundle development.
To solve this problem we rely on the lineage field available in the
terraform state, which is a uuid identifying unique terraform
deployments. There's a 1:1 mapping between a terraform deployment and a
bundle deployment.
For more details on how lineage works in terraform, see:
https://developer.hashicorp.com/terraform/language/state/backends#manual-state-pull-push
## Tests
Manually verified that changing the identity no longer results in the
incorrect terraform state being used. Also, new unit tests are added.
## Changes
The reason this readahead cache exists is that we frequently need to
recursively find all files in the bundle root directory, especially for
sync include and exclude processing. By caching the response for every
file/directory and frontloading the latency cost of these calls, we
significantly improve performance and eliminate redundant operations.
## Tests
* [ ] Working on unit tests
## Changes
This PR adds cli to the user agent sent downstream to the databricks
terraform provider when invoked via DABs.
## Tests
Unit tests. Based on the comment here
(10fe02075f/bundle/config/mutator/verify_cli_version_test.go (L113))
we don't need to set the version to make the test assertion work
correctly. This is likely because we use `go test` to run the tests
while the CLI is compiled and the version is set via `goreleaser`.
## Changes
This PR fixes a performance bug that led downloaded files (e.g. with
`databricks fs cp dbfs:/Volumes/.../somefile .`) to be buffered in
memory before being written.
Results from profiling the download of a ~100MB file:
Before:
```
Type: alloc_space
Showing nodes accounting for 374.02MB, 98.50% of 379.74MB total
```
After:
```
Type: alloc_space
Showing nodes accounting for 3748.67kB, 100% of 3748.67kB total
```
Note that this fix is temporary. A longer term solution should be to use
the API provided by the Go SDK rather than making an HTTP request
directly from the CLI.
fix#1575
## Tests
Verified that the CLI properly download the file when doing the
profiling.
## Changes
This PR changes the location metadata associated with a `dyn.Value` to a
slice of locations. This will allow us to keep track of location
metadata across merges and overrides.
The convention is to treat the first location in the slice as the
primary location. Also, the semantics are the same as before if there's
only one location associated with a value, that is:
1. For complex values (maps, sequences) the location of the v1 is
primary in Merge(v1, v2)
2. For primitive values the location of v2 is primary in Merge(v1, v2)
## Tests
Modifying existing merge unit tests. Other existing unit tests and
integration tests pass.
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
We need a mechanism to invalidate the locally cached deployment state if
a user uses the same working directory to deploy to multiple distinct
deployments (separate targets, root_paths or even hosts).
This PR just adds the UUID to the deployment state in preparation for
invalidating this cache. The actual invalidation will follow up at a
later date (tracked in internal backlog).
## Tests
Unit test. Manually checked the deployment state is actually being
written.
## Changes
This change allows to specify UC volumes path as an artifact paths so
all artifacts (JARs, wheels) are uploaded to UC Volumes.
Example configuration is here:
```
bundle:
name: jar-bundle
workspace:
host: https://foo.com
artifact_path: /Volumes/main/default/foobar
artifacts:
my_java_code:
path: ./sample-java
build: "javac PrintArgs.java && jar cvfm PrintArgs.jar META-INF/MANIFEST.MF PrintArgs.class"
files:
- source: ./sample-java/PrintArgs.jar
resources:
jobs:
jar_job:
name: "Test Spark Jar Job"
tasks:
- task_key: TestSparkJarTask
new_cluster:
num_workers: 1
spark_version: "14.3.x-scala2.12"
node_type_id: "i3.xlarge"
spark_jar_task:
main_class_name: PrintArgs
libraries:
- jar: ./sample-java/PrintArgs.jar
```
## Tests
Manually + added E2E test for Java jobs
E2E test is temporarily skipped until auth related issues for UC for
tests are resolved
## Changes
Print diagnostics in 'bundle deploy' similar to 'bundle validate'. This
way if a bundle has any errors or warnings, they are going to be easy to
notice.
NB: due to how we render errors, there is one extra trailing new line in
output, preserved in examples below
## Example: No errors or warnings
```
% databricks bundle deploy
Building default...
Deploying resources...
Updating deployment state...
Deployment complete!
```
## Example: Error on load
```
% databricks bundle deploy
Error: Databricks CLI version constraint not satisfied. Required: >= 1337.0.0, current: 0.0.0-dev
```
## Example: Warning on load
```
% databricks bundle deploy
Building default...
Deploying resources...
Updating deployment state...
Deployment complete!
Warning: unknown field: foo
in databricks.yml:6:1
```
## Example: Error + warning on load
```
% databricks bundle deploy
Warning: unknown field: foo
in databricks.yml:6:1
Error: something went wrong
```
## Example: Warning on load + error in init
```
% databricks bundle deploy
Warning: unknown field: foo
in databricks.yml:6:1
Error: Failed to xxx
in yyy.yml
Detailed explanation
in multiple lines
```
## Tests
Tested manually
## Changes
Add regression tests for https://github.com/databricks/cli/issues/1563
We test 2 code paths:
- if there is an error, we can print to stderr
- if there is a valid output, we can print to stdout
We should also consider adding black-box tests that will run the CLI
binary as a black box and inspect its output to stderr/stdout.
## Tests
Unit tests
## Changes
If we're using a `vfs.Path` backed by a workspace filesystem filer, we
have access to the `workspace.ObjectInfo` value for every file. By
providing access to this value we can use it directly and avoid reading
the first line of the underlying file.
A follow-up change will implement the interface defined in this change
for the workspace filesystem filer.
## Tests
Unit tests.