## Summary of changes
This PR introduces three new abstractions:
1. `Resolver`: Resolves which reader and writer to use for a template.
2. `Writer`: Writes a template project to disk. Prompts the user if
necessary.
3. `Reader`: Reads a template specification from disk, built into the
CLI or from GitHub.
Introducing these abstractions helps decouple reading a template from
writing it. When I tried adding telemetry for the `bundle init` command,
I noticed that the code in `cmd/init.go` was getting convoluted and hard
to test. A future change could have accidentally logged PII when a user
initialised a custom template.
Hedging against that risk is important here because we use a generic
untyped `map<string, string>` representation in the backend to log
telemetry for the `databricks bundle init`. Otherwise, we risk
accidentally breaking our compliance with our centralization
requirements.
### Details
After this PR there are two classes of templates that can be
initialized:
1. A `databricks` template: This could be a builtin template or a
template outside the CLI like mlops-stacks, which is still owned and
managed by Databricks. These templates log their telemetry arguments and
template name.
2. A `custom` template: These are templates created by and managed by
the end user. In these templates we do not log the template name and
args. Instead a generic placeholder string of "custom" is logged in our
telemetry system.
NOTE: The functionality of the `databricks bundle init` command remains
the same after this PR. Only the internal abstractions used are changed.
## Tests
New unit tests. Existing golden and unit tests. Also a fair bit of
manual testing.
## Changes
Add experimental-jobs-as-code template allowing defining jobs using
Python instead of YAML through the `databricks-bundles` PyPI package.
## Tests
Manually and acceptance tests.
## Changes
Setting Verbose=false on testcli.Runner disables all logging related to
running process (stdout, stderr, error, args).
I'm using this in #2184 where I'm using testcli runner to run acceptance
tests and seeing all output is not useful.
## Tests
Manually inspecting test output in #2184
## Changes
- Fix incorrect use Errorf on literal string. This resulted in garbage
output in tests diagnostics where % was replaced by "(MISSING)".
- Enable linter on testingT.Errorf.
Note, the autofix by the linter is wrong, it proposes `t.Errorf("%s",
string)` but it should be `t.Error(string)`. That can corrected manually
though.
## Tests
Linter was tested manually by reverting the fix on Errorf.
## Changes
Include a materialized copy of built-in templates as reference output.
This updates the output comparison logic to work against an output
directory. The `doComparison` function now always works on real files.
It can now tell apart non-existing files and empty files (e.g., the
`.gitkeep` files in templates).
## Changes
This includes a change to the defaults for the output directory flags of
the "generate" commands. These defaults included the expanded working
directory. This can be omitted because it is implied.
## Changes
The assertions on the output made are now captured in the `output.*`
files. These don't capture intent like actual assertions do, but we
still have regular test coverage in the path translation tests under
`bundle/config/mutator`.
## Tests
Tests pass.
## Changes
This came up in #2122 where relative library paths showed up with
backslashes on Windows. It's hard to run acceptance tests where paths
may be in either form. This change updates path translation logic to
always use forward slash-separated paths, including for absolute paths.
## Tests
* Unit tests pass.
* Confirmed that code where library paths are used uses the `filepath`
package for path manipulation. The functions in this package always
normalize their inputs to be platform-native paths.
* Confirmed that code that uses absolute paths works with forward
slash-separated paths on Windows.
## Changes
The materialized templates included in #2146 include Python code that we
require to be formatted. Instead of running ruff as part of the
testcase, we can enforce that all Python code in the repository is
formatted. It won't be possible to have a passing acceptance test for
template initialization with unformatted code.
## Changes
- Instead of doing 2 passes on variable resolution, do a loop until
there are no more updates (or we reach count 100).
- Stacked on top of #2163 which is a regression test for this:
acceptance/bundle/variables/complex-transitive-deep
## Tests
Existing tests, new regression tests.
These tests already passed before, added for completeness:
- acceptance/bundle/variables/cycle
- acceptance/bundle/variables/complex-cross-ref
## Changes
Fixes https://github.com/databricks/cli/issues/1977.
This PR modifies the bundle configuration to capture the dependency that
a UC Volume or a DLT pipeline might have on a UC schema at deployment
time. It does so by replacing the schema name with a reference of the
form `${resources.schemas.foo.name}`.
For example:
The following UC Volume definition depends on the UC schema with the
name `schema_name`. This mutator converts this configuration
from:
```
resources:
volumes:
bar:
catalog_name: catalog_name
name: volume_name
schema_name: schema_name
schemas:
foo:
catalog_name: catalog_name
name: schema_name
```
to:
```
resources:
volumes:
bar:
catalog_name: catalog_name
name: volume_name
schema_name: ${resources.schemas.foo.name}`
schemas:
foo:
catalog_name: catalog_name
name: schema_name
```
## Tests
Unit tests and manually.
Follow up to #2157. That PR repeated variable resolution. This test
still does not resolve fully but would resolve with 3 passes. This is
slightly different from complex-transitive-deeper - this test does not
show any errors, the issue is purely not enough passes.
## Changes
When users create one or more Databricks apps in their bundle it can
lead to initial bundle deployment being slower because apps need to wait
until their compute is fully provisioned and started.
This PR adds a message to warn about it. This message will be removed
when `no_compute` option becomes available in TF provider and used in
DABs (https://github.com/databricks/cli/pull/2144)
## Changes
Pipeline backend requires `target` to be always specified.
In order to do this we will create an unique schema as part of
`TestBundlePipelineRecreateWithoutAutoApprove` test which will be used
in pipelines
## Tests
```
helpers_test.go:148: stderr: Destroy complete!
--- PASS: TestBundlePipelineRecreateWithoutAutoApprove (415.39s)
PASS
coverage: [no statements]
ok github.com/databricks/cli/integration/bundle 416.141s coverage: [no statements]
```
## Changes
- Remove ResolveVariableReferencesInComplexVariables - it blocked
complex-within-complex for no good reason.
- Repeat regular resolution twice, it helps with a couple test cases we
have.
There may be a case for running it 3 times or more in a loop, but there
is no test case for that, so this PR is simple incremental improvement.
## Tests
Existing acceptance tests. Previously all unit tests for complex
variables were converted to acceptance tests, to capture this change and
ensure nothing breaks.
## Changes
It covers both https://$DATABRICKS_HOST and http://$DATABRICKS_HOST so
the test output does not change between local and the cloud.
## Tests
Existing tests using golden files (acceptance and integration) catch
this and were updated.
## Changes
When regenerating CLI with a new Go SDK
https://github.com/databricks/cli/pull/2126 I've noticed that some
parameters such as `no_compute` for apps are not added as flags for the
CLI commands.
This happened because we ignored all other top level fields if there's a
request body object field.
This PR relies on new AllFields method from Genkit which returns fields
from both request object and request body object.
## Changes
If before running an app, the app was stopped with an active deployment,
then Apps backend start it and does the auto-deploy of the last active
deployment. In such cases StartApp API won't return any active or
pending deployments in its response but doing the deploy immediately
after the start might result in the error `Cannot deploy app *** as
there is an active deployment in progress`.
From DABs side, we have to do a new deployment on every `bundle run`
(command which start the app and does deployment) because local files in
bundle might have been changed and users expect to have the app running
with new code.
Thus this PR works around the error by catching “deployment in progress”
error, getting any active / pending deployments, waits for them to
finish and start the new deployment again. If 2nd attempts fails, the
whole command fails.
## Tests
Added unit test.
## Changes
Replacement was split between the type `ReplacementContext` and the
`ReplaceOutput` function. The latter also ran a couple of regular
expressions. This change consolidates them such that it is up to the
caller to compose the set of replacements to use.
This change is required to accommodate UUID replacement in #2146.
## Changes
When setting up PATH in tests, put desired entry first but keep the rest
as well. Otherwise it fails on Windows
```
D:/a/cli/cli/libs/exec/exec_test.go:108
Error: Received unexpected error:
exit status 0xc0000135
```
Explanation from Claude:
> The error code 0xc0000135 is a Windows error indicating "Unable to
locate DLL"
> When code coverage is enabled, Go instruments the binary with coverage
tracking code, which requires additional DLL dependencies on Windows.
## Tests
Separate draft PR with coverage enabled on CI:
https://github.com/databricks/cli/pull/2141
The current default of building a binary is not frequently used.
The new default is useful after switching branches, stashing/unstashing,
AI changes.
Note "fmt" and "lint" are separate steps because "fmt" can fix
formatting and imports in presence of compilation errors and "lint"
cannot. Without compilation errors, "lint" also does formatting.
This test checks load git details functionality + variable interpolation
there.
The variables are not working there because LoadGitDetails mutator is
running before variable interpolation.
Additionally, correctly replace tmp path that is used for DATABRICKS_TF_EXEC_PATH
## Changes
Use name format "TestAccept/bundle/variables/host" (previously slashes
were reversed on Windows).
## Tests
Manually run "go test ./acceptance -v -run
TestAccept/bundle/variables/host" in Windows VM.
## Changes
To accommodate:
* Add the server URL to the set of output replacements
* Include a call to the permissions API to the dummy server
* Run the main script in a subshell to isolate working directory changes
## Changes
Always include both verbatim and json version of replacement.
This helps when the string in question contains \\ or other chars that
need to be quoted.
Needed for https://github.com/databricks/cli/pull/2118
## Tests
Existing tests.
## Changes
Add two new make commands:
- make acc-cover: runs acceptance tests and outputs
coverage-acceptance.txt
- make acc-showcover: show coverage-acceptance.txt locally in browser
Using the GOCOVERDIR functionality:
https://go.dev/blog/integration-test-coverage
This works, but there are a couple of issues encountered:
- GOCOVERDIR does not play well with regular "go test -cover". Once this
fixed, we can simplify the code and have 'make cover' output coverage
for everything at once. We can also probably get rid of CLI_GOCOVERDIR.
https://github.com/golang/go/issues/66225
- When running tests in parallel to the same directory there is rare
conflict on writing covmeta file. For this reason each tests writes
coverage to their own directory which is then merged together by 'make
acc-cover'.
<!-- Summary of your changes that are easy to understand --
## Tests
Manually running the new make commands.
## Changes
- If CLOUD_ENV variable is set, acceptance will no longer set up server
& override DATABRICKS_HOST/DATABRICKS_TOKEN/HOME env vars.
- I've updated replacements logic in testdiff to use tester /
tester@databricks.com convention.
## Tests
Manually running current acceptance tests against dogfood on my laptop I
get all test pass except for 2 failures.
```
--- FAIL: TestAccept/bundle/variables/env_overrides (0.09s)
--- FAIL: TestAccept/bundle/variables/resolve-builtin (1.30s)
```
## Changes
We perform a check during path translation that the path being
referenced is contained in the bundle's sync root. If it isn't, it's not
a valid remote reference. However, this doesn't apply to paths that are
_always_ local, such as the artifact path. An artifact's build command
is executed in its path. Files created by the artifact build (e.g.
wheels or JARs) don't need to be in the sync root because they have a
dedicated and different upload path into `${workspace.artifact_path}`.
Therefore, this check that a path is contained in the bundle's sync root
doesn't apply to artifact paths. This change modifies the structure of
path translation to allow opting out of this check.
Fixes#1927.
## Tests
* Existing and new tests pass.
* Manually confirmed that building and using a wheel built outside the
sync root path works as expected.
* No acceptance tests because we don't run build as part of validate.
## Changes
As of the clusters API v2.1 the results include system clusters. On
large workspaces this can lead to long load times and include many
irrelevant results. The cluster picker should only show interactive
clusters.
Also see #1754.
## Tests
Manually confirmed the picker runs fast on a large workspace.
## Changes
Now it's possible to configure new `app` resource in bundle and point it
to the custom `source_code_path` location where Databricks App code is
defined.
On `databricks bundle deploy` DABs will create an app. All consecutive
`databricks bundle deploy` execution will update an existing app if
there are any updated
On `databricks bundle run <my_app>` DABs will execute app deployment. If
the app is not started yet, it will start the app first.
### Bundle configuration
```
bundle:
name: apps
variables:
my_job_id:
description: "ID of job to run app"
lookup:
job: "My Job"
databricks_name:
description: "Name for app user"
additional_flags:
description: "Additional flags to run command app"
default: ""
my_app_config:
type: complex
description: "Configuration for my Databricks App"
default:
command:
- flask
- --app
- hello
- run
- ${var.additional_flags}
env:
- name: DATABRICKS_NAME
value: ${var.databricks_name}
resources:
apps:
my_app:
name: "anester-app" # required and has to be unique
description: "My App"
source_code_path: ./app # required and points to location of app code
config: ${var.my_app_config}
resources:
- name: "my-job"
description: "A job for app to be able to run"
job:
id: ${var.my_job_id}
permission: "CAN_MANAGE_RUN"
permissions:
- user_name: "foo@bar.com"
level: "CAN_VIEW"
- service_principal_name: "my_sp"
level: "CAN_MANAGE"
targets:
dev:
variables:
databricks_name: "Andrew (from dev)"
additional_flags: --debug
prod:
variables:
databricks_name: "Andrew (from prod)"
```
### Execution
1. `databricks bundle deploy -t dev`
2. `databricks bundle run my_app -t dev`
**If app is started**
```
✓ Getting the status of the app my-app
✓ App is in RUNNING state
✓ Preparing source code for new app deployment.
✓ Deployment is pending
✓ Starting app with command: flask --app hello run --debug
✓ App started successfully
You can access the app at <app-url>
```
**If app is not started**
```
✓ Getting the status of the app my-app
✓ App is in UNAVAILABLE state
✓ Starting the app my-app
✓ App is starting...
....
✓ App is starting...
✓ App is started!
✓ Preparing source code for new app deployment.
✓ Downloading source code from /Workspace/Users/...
✓ Starting app with command: flask --app hello run --debug
✓ App started successfully
You can access the app at <app-url>
```
## Tests
Added unit and config tests + manual test.
```
--- PASS: TestAccDeployBundleWithApp (404.59s)
PASS
coverage: 36.8% of statements in ./...
ok github.com/databricks/cli/internal/bundle 405.035s coverage: 36.8% of statements in ./...
```