- Change GitRepositoryInfo.WorktreeRoot to string
- Restore original warnings logs for when branch and commit could not be read
- Add new logs for when repository could not be read. However, if we can figure out git root then use it, even if parsing later fails.
Since there is no .git directory in Workspace file system, we need to make
an API call to fetch git checkout status (root of the repo, current branch, etc).
(api/2.0/workspace/get-status?return_git_info=true).
Refactor Repository to accept repository root rather than calculate it.
This helps, because Repository is currently created in multiple places and
finding the repository root is expensive.
## Changes
This PR adds support for UC volumes to DABs.
### Can I use a UC volume managed by DABs in `artifact_path`?
Yes, but we require the volume to exist before being referenced in
`artifact_path`. Otherwise you'll see an error that the volume does not
exist. For this case, this PR also adds a warning if we detect that the
UC volume is defined in the DAB itself, which informs the user to deploy
the UC volume in a separate deployment first before using it in
`artifact_path`.
We cannot create the UC volume and then upload the artifacts to it in
the same `bundle deploy` because `bundle deploy` always uploads the
artifacts to `artifact_path` before materializing any resources defined
in the bundle. Supporting this in a single deployment requires us to
migrate away from our dependency on the Databricks Terraform provider to
manage the CRUD lifecycle of DABs resources.
### Why do we not support `preset.name_prefix` for UC volumes?
UC volumes will not have a `dev_shreyas_goenka` prefix added in `mode:
development`. Configuring `presets.name_prefix` will be a no-op for UC
volumes. We have decided not to support prefixing for UC resources. This
is because:
1. UC provides its own namespace hierarchy that is independent of DABs.
2. Users can always manually use `${workspace.current_user.short_name}`
to configure the prefixes manually.
Customers often manually set up a UC hierarchy for dev and prod,
including a schema or catalog per developer. Thus, it's often
unnecessary for us to add prefixing in `mode: development` by default
for UC resources.
In retrospect, supporting prefixing for UC schemas and registered models
was a mistake and will be removed in a future release of DABs.
## Tests
Unit, integration test, and manually.
### Manual Testing cases:
1. UC volume does not exist:
```
➜ bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/my_volume that is configured in the artifact_path: Not Found
```
2. UC Volume does not exist, but is defined in the DAB
```
➜ bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/managed_by_dab that is configured in the artifact_path: Not Found
Warning: You might be using a UC volume in your artifact_path that is managed by this bundle but which has not been deployed yet. Please deploy the UC volume in a separate bundle deploy before using it in the artifact_path.
at resources.volumes.bar
in databricks.yml:24:7
```
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
This PR adds the `bundle_uuid` helper function that'll return a stable
identifier for the bundle for the duration of the `bundle init` command.
This is also the UUID that'll be set in the telemetry event sent during
`databricks bundle init` and would be used to correlate revenue from
bundle init with resource deployments.
Template authors should add the uuid field to their `databricks.yml`
file they generate:
```
bundle:
# A stable identified for your DAB project. We use this UUID in the Databricks backend
# to correlate and identify multiple deployments of the same DAB project.
uuid: {{ bundle_uuid }}
```
## Tests
Unit test
This will require API call when run inside a workspace, which will
require workspace client (we don't have one at the current point). We
want to keep Load phase quick, since it's common across all commands.
## Changes
This PR introduces use of new `isNil` method. It allows to ensure we
filter out all improperly defined resources in `bundle summary` command.
This includes deleted resources or resources with incorrect
configuration such as only defining key of the resource and nothing
else.
Fixes#1919, #1913
## Tests
Added regression unit test case
## Changes
The built-in template contains a reference to `${bundle.environment}`.
This property has been deprecated in favor of `${bundle.target}` a long
time ago (#670), so we should no longer emit it. The environment field
will continue to be usable until we cut a new major version in some far
away future.
## Tests
* Unit tests
* The test `TestInterpolationWithTarget` still covers correct
interpolation of `${bundle.environment}`
Bumps
[github.com/Masterminds/semver/v3](https://github.com/Masterminds/semver)
from 3.3.0 to 3.3.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/Masterminds/semver/releases">github.com/Masterminds/semver/v3's
releases</a>.</em></p>
<blockquote>
<h2>v3.3.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix for allowing some version that were invalid by <a
href="https://github.com/mattfarina"><code>@mattfarina</code></a> in <a
href="https://redirect.github.com/Masterminds/semver/pull/253">Masterminds/semver#253</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/Masterminds/semver/compare/v3.3.0...v3.3.1">https://github.com/Masterminds/semver/compare/v3.3.0...v3.3.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Masterminds/semver/blob/master/CHANGELOG.md">github.com/Masterminds/semver/v3's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1558ca3488"><code>1558ca3</code></a>
Merge pull request <a
href="https://redirect.github.com/Masterminds/semver/issues/253">#253</a>
from mattfarina/fix-bad-versions</li>
<li><a
href="252dd61dd3"><code>252dd61</code></a>
Fix for allowing some version that were invalid</li>
<li>See full diff in <a
href="https://github.com/Masterminds/semver/compare/v3.3.0...v3.3.1">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/Masterminds/semver/v3&package-manager=go_modules&previous-version=3.3.0&new-version=3.3.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [golang.org/x/term](https://github.com/golang/term) from 0.25.0 to
0.26.0.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b725e362a8"><code>b725e36</code></a>
go.mod: update golang.org/x dependencies</li>
<li><a
href="54df7da90d"><code>54df7da</code></a>
README: don't recommend go get</li>
<li>See full diff in <a
href="https://github.com/golang/term/compare/v0.25.0...v0.26.0">compare
view</a></li>
</ul>
</details>
<br />
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=golang.org/x/term&package-manager=go_modules&previous-version=0.25.0&new-version=0.26.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Changes
This PR adds a warning validating that the configuration for a single
node cluster is valid for interactive, job, job-task, and pipeline
clusters.
Note: We skip the validation if a cluster policy is configured because
the policy is likely to configure `spark_conf` / `custom_tags` itself.
Note: Terrform originally only had validation for interactive, job, and
job-task clusters. This PR adding the validation for pipeline clusters
as well is new.
This PR follows the same logic as we used to have in Terraform. The
validation was removed from Terraform because we had no way to demote
the error to a warning:
https://github.com/databricks/terraform-provider-databricks/pull/4222
### Background
Single-node clusters require `spark_conf` and `custom_tags` to be
correctly set in the cluster definition for them to function optimally.
The cluster will be created even if incorrectly configured, but its
performance will not be great.
For example, if both `spark_conf` and `custom_tags` are not set and
`num_workers` is 0, then only the driver process will be launched on the
cluster compute instance thus leading to sub-optimal utilization of
available compute resources and no parallelization across worker
processes when processing a spark query.
### Issue
This PR addresses some issues reported in
https://github.com/databricks/cli/issues/1546
## Tests
Unit tests and manually.
Example output of the warning:
```
➜ bundle-playground git:(master) ✗ cli bundle validate
Warning: Single node cluster is not correctly configured
at resources.pipelines.bar.clusters[0]
in databricks.yml:29:11
num_workers should be 0 only for single-node clusters. To create a
valid single node cluster please ensure that the following properties
are correctly set in the cluster specification:
spark_conf:
spark.databricks.cluster.profile: singleNode
spark.master: local[*]
custom_tags:
ResourceClass: SingleNode
Name: foobar
Target: default
Workspace:
User: shreyas.goenka@databricks.com
Path: /Workspace/Users/shreyas.goenka@databricks.com/.bundle/foobar/default
Found 1 warning
```
## Changes
Users can configure the bundle to not synchronize any files with:
```yaml
sync:
paths: []
```
If it is explicitly configured as an empty list, the validate command
must not warn about not having any files to synchronize. The warning
exists to alert users who are unintentionally not synchronizing any
files (they might have a `.gitignore` pattern that matches everything).
Closes#1663.
## Tests
* New unit test.
## Changes
The ML production team modified mlops-stack to use `mode: development`
for their development target here:
https://github.com/databricks/mlops-stacks/pull/174
This PR makes the integration test assertion agnostic of the prefix to
make it pass again.
## Tests
The test passes now
## Changes
The full workspace path for a notebook does not contain the notebook's
extension. If a user converts that file path to a relative path (like
`/Workspace/bundle_root/bar/nb` -> `./bar/nb`), they can be confused as
to why the new file path does not work.
The changes in this PR nudge them to add the appropriate file extension
(e.g., `./bar/nb.py` or `./bar/nb.ipynb`).
One common way users can end up in this scenario is by using the view
job as YAML functionality in the Databricks UI.
## Tests
Unit test and manually.
```
(.venv) ➜ bundle-playground git:(master) ✗ cli bundle validate
Error: notebook ./foo not found. Local notebook references are expected
to contain one of the following file extensions: [.py, .r, .scala, .sql, .ipynb]
```
## Changes
While looking into adding variable lookups for notification destinations
([API][API]), I found the codegen approach for different classes of
variable lookups a bit complex. The template had a custom field override
(for service principals), the package had an override for the cluster
lookup, and it didn't produce tests.
The notification destinations API uses a default page size of 20 for
listing. I want to use a larger page size to limit the number of API
calls, so that would imply another customization on the template or a
manual override.
This code being rather mechanical, I used copilot to produce all
instances of the resolvers and their tests (after writing one of them
manually).
[api]: https://docs.databricks.com/api/workspace/notificationdestinations
## Tests
* Unit tests pass
* Manual confirmation that lookups of warehouses still work
## Changes
Integration tests using these fixtures could have been flaky when run in
parallel using the same user's identity. They would also possibly have
piggybacked state from previous runs.
This PR adds a UUID to the root_path to force independent bundle
deployments for every test run.
I have checked that all bundles in `internal/bundle/bundles` have
`root_path` namespaced to a UUID.
## Tests
Self testing.
## Changes
Update filenames used by bundle generate to use '.resource-type.yml'
Similar to [Add sub-extension to resource files in built-in templates by
shreyas-goenka · Pull Request #1777 ·
databricks/cli](https://github.com/databricks/cli/pull/1777)
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
Added integration test to deploy bundle to /Shared root path
## Tests
```
--- PASS: TestAccDeployBasicToSharedWorkspace (24.58s)
PASS
coverage: 31.2% of statements in ./...
ok github.com/databricks/cli/internal/bundle 25.572s coverage: 31.2% of statements in ./...
```
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
## Changes
This change adds a preset for source-linked deployments. It is enabled
by default for targets in `development` mode **if** the Databricks CLI
is running from the `/Workspace` directory on DBR. It does not have an
effect when running the CLI anywhere else.
Key highlights:
1. Files in this mode won't be uploaded to workspace
2. Created resources will use references to source files instead of
their workspace copies
## Tests
1. Apply preset unit test covering conditional logic
2. High-level process target mode unit test for testing integration
between mutators
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
When running the CLI on Databricks Runtime (DBR), use the
extension-aware filer to write an instantiated template if the instance
path is located in the workspace filesystem.
Notebooks cannot be written through the workspace filesystem's FUSE
mount. As a result, this is the only method for initializing templates
that contain notebooks when running the CLI on DBR and writing to the
workspace filesystem.
Depends on #1910 and #1911.
Supersedes #1744.
## Tests
* Manually confirmed I can initialize a template with notebooks when
running the CLI from the web terminal.
## Changes
Prior to this change, the output directory was part of the `renderer`
type and passed down to every `file` it produced. Every file knew its
absolute destination path. This is incompatible with the use of a filer,
where all operations are automatically anchored to some base path.
To make this compatible, this change updates:
* the `file` type to only know its own path relative to the instantiation root,
* the `renderer` type to no longer require or pass along the output directory,
* the `persistToDisk` function to take a context and filer argument,
* the `filer.WriteMode` to represent permission bits
## Tests
* Existing tests pass.
* Manually confirmed template initialization works as expected.
## Changes
While working on the v2 of #1744, I found that:
* Template initialization first copies built-in templates to a temporary
directory before initializing them
* Reading a template's contents goes through a `filer.Filer` but is
hardcoded to a local one
This change updates the interface for reading templates to be `fs.FS`.
This is compatible with the `embed.FS` type for the built-in templates,
so they no longer have to be copied to a temporary directory before
being used.
The alternative is to use a `filer.Filer` throughout, but this would
have required even more plumbing, and we don't need to _read_ templates,
including notebooks, from the workspace filesystem (yet?).
As part of making `template.Materialize` take an `fs.FS` argument, the
logic to match a given argument to a particular built-in template in the
`init` command has moved to sit next to its implementation.
## Tests
Existing tests pass.
## Changes
The workspace extensions filer should not read or stat a notebook called
`foo` if the user calls `.Stat(ctx, "foo")`.
Instead, the filer should return a file not found error. This is because
the contract for the workspace extensions filer is to only work for
notebooks when the file path / name includes the extension (example:
`foo.ipynb` or `foo.sql` instead of just `foo`)
## Tests
Integration tests.
## Changes
We had a number of copies of test helpers for `io/fs` in the repository.
This change consolidates all of them to use the `libs/fakefs` package.
## Tests
Unit tests pass.
## Changes
This field was special-cased in #1307 because it's not part of the JSON
payload in the SDK struct.
This approach, while pragmatic, meant it didn't show up in the JSON
schema. While debugging an issue with quality monitors in #1900, I
couldn't figure out why I was getting schema errors on this field, or
how it was passed through to the TF representation. This commit removes
the special case and makes it behave like everything else.
## Tests
* Unit tests pass.
* Confirmed that the updated schema failed validation before this
change.