## Changes
This will generate bundle YAML configuration for Git-based jobs but
won't download any related files, since those are in the Git repo.
Fixes #1423
## Tests
Added unit test
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
`CreatePipeline` is a more complete structure (a superset of
`PipelineSpec`) that enables support for additional fields such as
`run_as` and `allow_duplicate_names` in DABs configuration. Note: the TF
provider must also support these fields for them to work correctly.
## Tests
Existing tests pass, and no fields are removed from the JSON schema.
## Changes
This PR fails the acceptance test when an unknown endpoint (i.e. not
stubbed) is used. We want to ensure that all API endpoints used in an
acceptance test are stubbed and do not otherwise silently fail with a
404.
The failure output includes a configuration snippet that developers can
copy-paste into `test.toml` to stub the missing API endpoint. It'll look
something like:
```
[[Server]]
Pattern = "<method> <path>"
Response.Body = '''
<response body here>
'''
Response.StatusCode = <response status-code here>
```
## Tests
Manually:
output.txt when an endpoint is not found:
```
>>> [CLI] jobs create --json {"name":"abc"}
Error: No stub found for pattern: POST /api/2.1/jobs/create
```
How this renders in the test logs:
```
--- FAIL: TestAccept/workspace/jobs/create (0.03s)
server.go:46:
----------------------------------------
No stub found for pattern: POST /api/2.1/jobs/create
To stub a response for this request, you can add
the following to test.toml:
[[Server]]
Pattern = "POST /api/2.1/jobs/create"
Response.Body = '''
<response body here>
'''
Response.StatusCode = <response status-code here>
----------------------------------------
```
Manually checked that the debug mode still works.
## Changes
- Print warnings and errors by default.
- Fix ErrAlreadyPrinted not to be logged at Error level.
- Format log messages as "Warn: message" instead of "WARN" to make them
more readable and in line with the rest of the output.
- Only print attributes (pid, mutator, etc) and time when the overall
level is debug (so --debug output has not changed much).
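A minimal sketch of the formatting change using a custom `log/slog` handler; the CLI's actual logger differs, and the names here are illustrative:
```
package main

import (
	"context"
	"fmt"
	"log/slog"
	"os"
	"strings"
)

// prefixHandler renders records as "Warn: message" / "Error: message",
// omitting timestamps and attributes (this sketch only covers the
// default, non-debug path).
type prefixHandler struct {
	level slog.Level
}

func (h *prefixHandler) Enabled(_ context.Context, l slog.Level) bool {
	return l >= h.level
}

func (h *prefixHandler) Handle(_ context.Context, r slog.Record) error {
	// "WARN" -> "Warn", "ERROR" -> "Error"
	name := r.Level.String()
	name = name[:1] + strings.ToLower(name[1:])
	_, err := fmt.Fprintf(os.Stderr, "%s: %s\n", name, r.Message)
	return err
}

// Attributes and groups are dropped in this sketch.
func (h *prefixHandler) WithAttrs(_ []slog.Attr) slog.Handler { return h }
func (h *prefixHandler) WithGroup(_ string) slog.Handler      { return h }

func main() {
	slog.SetDefault(slog.New(&prefixHandler{level: slog.LevelWarn}))
	slog.Warn("work dir is not a git repository") // Warn: work dir is not a git repository
}
```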
## Tests
- Existing acceptance tests show how warning messages appear in various
test cases.
- Added new test for `--debug` output.
- Add a sort_lines.py helper to avoid depending on 'sort', which is
locale-sensitive.
## Changes
Extend testserver for bundle deployment:
- Allocate a new workspace per test case to isolate test cases from each
other
- Support jobs get/list/create
- Support creation and listing of workspace files
## Tests
Using existing acceptance tests
## Changes
1. Allow `any` examples in the JSON schema types, since the OpenAPI spec
contains many of them.
2. Fix an issue with missing override annotations when re-generating the
schema.
## Tests
## Changes
HTTP headers like the User-Agent are an important part of our internal
ETL pipelines. This PR adds the ability to validate the headers used in
an HTTP request as part of our acceptance tests.
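Conceptually, this boils down to the test server recording request headers so the test can assert on them afterwards; a minimal sketch, not the repo's actual testserver code:
```
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

func main() {
	// Record the User-Agent of every request the client makes.
	var userAgents []string
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		userAgents = append(userAgents, r.Header.Get("User-Agent"))
		fmt.Fprintln(w, "{}")
	}))
	defer srv.Close()

	http.Get(srv.URL + "/api/2.1/jobs/list")
	fmt.Println(userAgents) // a real test would assert on these values
}
```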
## Tests
Modifying existing test.
## Changes
When returning a non-`200` status code, the Databricks APIs return a
response body of the format:
```
{
  "error_code": "Error code",
  "message": "Human-readable error message."
}
```
This PR adds the ability to stub non-200 status codes in the test
server, allowing us to mock API errors from Databricks.
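Combined with the stub format shown earlier, an error response for a hypothetical endpoint might be stubbed like this in `test.toml` (the body values are illustrative):
```
[[Server]]
Pattern = "GET /api/2.1/jobs/get"
Response.Body = '''
{
  "error_code": "RESOURCE_DOES_NOT_EXIST",
  "message": "Job 123 does not exist."
}
'''
Response.StatusCode = 404
```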
## Tests
New test
$VARNAME is what we use for environment variables, so it's good to keep
these separate. Some people use envsubst for homemade variable
interpolation; separation is good there as well.
## Changes
With this PR, any acceptance tests that define custom server stubs in
`test.toml` will automatically record all HTTP requests made and assert
on them.
Builds on top of https://github.com/databricks/cli/pull/2226
## Tests
Modifying existing acceptance test.
## Changes
This PR registers the `server.Close()` function to be run during test
cleanup in the server initialization function. This ensures that all
test servers are closed as soon as the test they are scoped to finishes.
Motivated by https://github.com/databricks/cli/pull/2255/files, where a
regression was introduced: we did not close the test server.
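The pattern, as a minimal sketch (helper name hypothetical):
```
package testutil

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// startServer ties the server's lifetime to the test: t.Cleanup runs
// srv.Close when the test it is scoped to finishes.
func startServer(t *testing.T, h http.Handler) *httptest.Server {
	srv := httptest.NewServer(h)
	t.Cleanup(srv.Close)
	return srv
}
```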
## Tests
N/A
## Changes
Added support for double underscore variable references.
Previously we made this restriction stricter for no particular reason;
the TF provider supports multiple underscores, so DABs should as well.
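A small sketch of the relaxed matching; the actual pattern in the repo may differ:
```
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// \w+ places no restriction on consecutive underscores.
	ref := regexp.MustCompile(`\$\{var\.(\w+)\}`)
	fmt.Println(ref.FindStringSubmatch("${var.my__job__id}")) // matches
}
```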
Fixes #1753
## Tests
Added acceptance and integration tests
## Changes
These types correspond to the telemetry protobufs defined in universe.
## Tests
No tests are needed since this PR only adds the type bindings.
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Noticed this when working on
https://github.com/databricks/cli/pull/2221. `<` is a special HTML
character that is encoded during text replacement when using
`AssertEqualTexts`.
## Tests
N/A
## Changes
- Replace development version with $DEV_VERSION
- Update experimental-jobs-as-code to make use of it.
## Tests
- Existing tests.
- Using this in https://github.com/databricks/cli/pull/2213
## Changes
- Remove DetectInterpreters from the DetectExecutable call: python3 or
python should always be on the PATH. We don't need to detect
non-standard situations like python3.10 being present while python3 is
not.
- Moved DetectInterpreters to cmd/labs, where it is still used.
This is a follow-up to https://github.com/databricks/cli/pull/2034
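The simplified lookup amounts to roughly this (a sketch):
```
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Rely on the standard PATH lookup: try python3 first, fall back to
	// python, instead of probing versioned names like python3.10.
	py, err := exec.LookPath("python3")
	if err != nil {
		py, err = exec.LookPath("python")
	}
	fmt.Println(py, err)
}
```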
## Tests
Existing tests.
## Changes
- When file comparisons fail in an acceptance test, print the contents
of all applied replacements. Do it once per test.
- Remove duplicate entries in the replacement list.
## Tests
Manually: change the out files of an existing test and you'll get this
printed once, after the first failed assertion:
```
acceptance_test.go:307: Available replacements:
REPL /Users/denis\.bilenko/work/cli/acceptance/build/databricks => $$CLI
REPL /private/var/folders/5y/9kkdnjw91p11vsqwk0cvmk200000gp/T/TestAccept598522733/001 => $$TMPHOME
...
```
## Changes
When adding a path, a few things need to be taken care of:
- symlink expansion
- forward/backward slashes, so that tests can do sed 's/\\\\/\//g' to
make them pass on Windows (see
acceptance/bundle/syncroot/dotdot-git/script)
The SetPath() function takes care of both. This PR uses SetPath() on all
paths consistently.
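A sketch of what SetPath() does; the signature here is assumed:
```
package testutil

import "path/filepath"

// setPath expands symlinks and registers both separator styles, so a
// replacement matches regardless of how the path is rendered in output
// on Windows.
func setPath(repls map[string]string, path, replacement string) {
	if resolved, err := filepath.EvalSymlinks(path); err == nil {
		path = resolved
	}
	repls[path] = replacement                   // native separators
	repls[filepath.ToSlash(path)] = replacement // forward slashes
}
```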
## Tests
Existing tests.
- Move acceptance/bundle/sync-paths-dotdot test to
acceptance/bundle/syncroot/dotdot-notgit
- Add new test acceptance/bundle/syncroot/dotdot-git
Fix the replacer to work with this test and on Windows:
- Make PATH work on Windows by using EvalSymlinks.
- Make the concatenated path match within JSON by stripping quotes.
## Changes
- Add a new method Clone() on ReplacementContext
- Use it when passing common replacements to test cases.
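A sketch of the method; type and field names are assumed, not the repo's actual definitions:
```
package testdiff

type Replacement struct {
	Old, New string
}

type ReplacementContext struct {
	Repls []Replacement
}

// Clone deep-copies the replacement slice so that replacements added by
// one test case don't leak into the shared context used by other cases.
func (r ReplacementContext) Clone() ReplacementContext {
	out := ReplacementContext{Repls: make([]Replacement, len(r.Repls))}
	copy(out.Repls, r.Repls)
	return out
}
```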
## Tests
Manually. I have a different branch where this bug manifested and this
change helped.
## Summary of changes
This PR introduces three new abstractions:
1. `Resolver`: Resolves which reader and writer to use for a template.
2. `Writer`: Writes a template project to disk. Prompts the user if
necessary.
3. `Reader`: Reads a template specification from disk, built into the
CLI or from GitHub.
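The split might look roughly like this; the method sets below are assumptions, not the actual signatures:
```
package template

import (
	"context"
	"io/fs"
)

type Reader interface {
	// FS returns the template's file tree, whether it comes from the
	// embedded filesystem, a local path, or a cloned GitHub repository.
	FS(ctx context.Context) (fs.FS, error)
}

type Writer interface {
	// Materialize renders the template to the output directory,
	// prompting the user for any missing input values.
	Materialize(ctx context.Context, r Reader) error
}

// Resolver picks the Reader/Writer pair for a given template reference.
type Resolver func(ref string) (Reader, Writer, error)
```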
Introducing these abstractions helps decouple reading a template from
writing it. When I tried adding telemetry for the `bundle init` command,
I noticed that the code in `cmd/init.go` was getting convoluted and hard
to test. A future change could have accidentally logged PII when a user
initialised a custom template.
Hedging against that risk is important here because we use a generic
untyped `map<string, string>` representation in the backend to log
telemetry for `databricks bundle init`. Otherwise, we risk accidentally
breaking compliance with our centralization requirements.
### Details
After this PR there are two classes of templates that can be
initialized:
1. A `databricks` template: This could be a builtin template or a
template outside the CLI like mlops-stacks, which is still owned and
managed by Databricks. These templates log their telemetry arguments and
template name.
2. A `custom` template: These are templates created and managed by the
end user. For these templates we do not log the template name and args;
instead, a generic placeholder string of "custom" is logged in our
telemetry system.
NOTE: The functionality of the `databricks bundle init` command remains
the same after this PR. Only the internal abstractions used are changed.
## Tests
New unit tests. Existing golden and unit tests. Also a fair bit of
manual testing.
## Changes
Add an experimental-jobs-as-code template that allows defining jobs in
Python instead of YAML, via the `databricks-bundles` PyPI package.
## Tests
Manually and acceptance tests.
## Changes
- Fix incorrect use of Errorf on a literal string. This resulted in
garbage output in test diagnostics, where % was replaced by "(MISSING)".
- Enable the linter on testingT.Errorf.
Note: the autofix proposed by the linter is wrong; it suggests
`t.Errorf("%s", string)` but it should be `t.Error(string)`. That can be
corrected manually, though.
## Tests
Linter was tested manually by reverting the fix on Errorf.
## Changes
Include a materialized copy of built-in templates as reference output.
This updates the output comparison logic to work against an output
directory. The `doComparison` function now always works on real files.
It can now tell apart non-existing files and empty files (e.g., the
`.gitkeep` files in templates).
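The distinction boils down to checking existence separately from content; a sketch, not the actual `doComparison` code:
```
package acceptance

import (
	"errors"
	"io/fs"
	"os"
)

// readFile reports whether the file exists, distinguishing a missing
// file from an empty one (e.g. a template's .gitkeep).
func readFile(path string) (content string, exists bool, err error) {
	b, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return "", false, nil
	}
	if err != nil {
		return "", false, err
	}
	return string(b), true, nil
}
```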
## Changes
The templates materialized in #2146 include Python code that we require
to be formatted. Instead of running ruff as part of the test case, we
can enforce that all Python code in the repository is formatted. This
makes it impossible for an acceptance test for template initialization
to pass with unformatted code.
## Changes
It covers both https://$DATABRICKS_HOST and http://$DATABRICKS_HOST so
the test output does not change between local runs and the cloud.
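A sketch of scheme-agnostic replacement (the host value is illustrative):
```
package main

import (
	"fmt"
	"regexp"
)

func main() {
	host := "adb-1234567890123456.7.azuredatabricks.net"
	// Match either scheme so output from a local test server (http) and
	// a real workspace (https) normalizes to the same token.
	re := regexp.MustCompile(`https?://` + regexp.QuoteMeta(host))
	out := re.ReplaceAllString("url: https://"+host+"/api/2.0", "$$DATABRICKS_HOST")
	fmt.Println(out) // url: $DATABRICKS_HOST/api/2.0
}
```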
## Tests
Existing tests using golden files (acceptance and integration) catch
this and were updated.
## Changes
Replacement was split between the type `ReplacementContext` and the
`ReplaceOutput` function. The latter also ran a couple of regular
expressions. This change consolidates them such that it is up to the
caller to compose the set of replacements to use.
This change is required to accommodate UUID replacement in #2146.
## Changes
When setting up PATH in tests, put the desired entry first but keep the
rest as well. Otherwise, it fails on Windows:
```
D:/a/cli/cli/libs/exec/exec_test.go:108
Error: Received unexpected error:
exit status 0xc0000135
```
Explanation from Claude:
> The error code 0xc0000135 is a Windows error indicating "Unable to
locate DLL"
> When code coverage is enabled, Go instruments the binary with coverage
tracking code, which requires additional DLL dependencies on Windows.
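A hedged sketch of the fix (helper name hypothetical):
```
package testutil

import (
	"os"
	"testing"
)

// withPathEntry prepends dir to PATH for the duration of the test.
// Keeping the original entries is what lets coverage-instrumented
// binaries on Windows still resolve the DLLs they depend on.
func withPathEntry(t *testing.T, dir string) {
	t.Setenv("PATH", dir+string(os.PathListSeparator)+os.Getenv("PATH"))
}
```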
## Tests
Separate draft PR with coverage enabled on CI:
https://github.com/databricks/cli/pull/2141
## Changes
Always include both the verbatim and the JSON version of a replacement.
This helps when the string in question contains \\ or other characters
that need to be quoted.
Needed for https://github.com/databricks/cli/pull/2118
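A sketch of the idea (names assumed):
```
package testdiff

import "encoding/json"

// addReplacement registers both the verbatim string and its
// JSON-encoded form, so a value containing backslashes (e.g. a Windows
// path) is also replaced when it appears inside JSON output.
func addReplacement(repls map[string]string, value, repl string) {
	repls[value] = repl
	if b, err := json.Marshal(value); err == nil {
		repls[string(b[1:len(b)-1])] = repl // strip the surrounding quotes
	}
}
```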
## Tests
Existing tests.
## Changes
As of clusters API v2.1, the results include system clusters. On large
workspaces this can lead to long load times and many irrelevant results.
The cluster picker should only show interactive clusters.
Also see #1754.
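One plausible filter, sketched with a stand-in type rather than the SDK's cluster struct:
```
package main

import "fmt"

type cluster struct {
	Name   string
	Source string // e.g. "UI", "API", "JOB", "PIPELINE"
}

// interactiveOnly keeps clusters created via the UI or API and drops
// system clusters (job- and pipeline-created) from the picker.
func interactiveOnly(clusters []cluster) []cluster {
	var out []cluster
	for _, c := range clusters {
		switch c.Source {
		case "UI", "API":
			out = append(out, c)
		}
	}
	return out
}

func main() {
	fmt.Println(interactiveOnly([]cluster{
		{Name: "dev", Source: "UI"},
		{Name: "nightly", Source: "JOB"},
	})) // [{dev UI}]
}
```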
## Tests
Manually confirmed the picker runs fast on a large workspace.
## Changes
Now it's possible to configure a new `app` resource in a bundle and
point it to a custom `source_code_path` location where the Databricks
App code is defined.
On `databricks bundle deploy`, DABs will create the app. All subsequent
`databricks bundle deploy` executions will update the existing app if
there are any updates.
On `databricks bundle run <my_app>`, DABs will execute the app
deployment. If the app is not started yet, it will start the app first.
### Bundle configuration
```
bundle:
  name: apps

variables:
  my_job_id:
    description: "ID of job to run app"
    lookup:
      job: "My Job"
  databricks_name:
    description: "Name for app user"
  additional_flags:
    description: "Additional flags to run command app"
    default: ""
  my_app_config:
    type: complex
    description: "Configuration for my Databricks App"
    default:
      command:
        - flask
        - --app
        - hello
        - run
        - ${var.additional_flags}
      env:
        - name: DATABRICKS_NAME
          value: ${var.databricks_name}

resources:
  apps:
    my_app:
      name: "anester-app" # required and has to be unique
      description: "My App"
      source_code_path: ./app # required and points to location of app code
      config: ${var.my_app_config}
      resources:
        - name: "my-job"
          description: "A job for app to be able to run"
          job:
            id: ${var.my_job_id}
            permission: "CAN_MANAGE_RUN"
      permissions:
        - user_name: "foo@bar.com"
          level: "CAN_VIEW"
        - service_principal_name: "my_sp"
          level: "CAN_MANAGE"

targets:
  dev:
    variables:
      databricks_name: "Andrew (from dev)"
      additional_flags: --debug
  prod:
    variables:
      databricks_name: "Andrew (from prod)"
```
### Execution
1. `databricks bundle deploy -t dev`
2. `databricks bundle run my_app -t dev`
**If app is started**
```
✓ Getting the status of the app my-app
✓ App is in RUNNING state
✓ Preparing source code for new app deployment.
✓ Deployment is pending
✓ Starting app with command: flask --app hello run --debug
✓ App started successfully
You can access the app at <app-url>
```
**If app is not started**
```
✓ Getting the status of the app my-app
✓ App is in UNAVAILABLE state
✓ Starting the app my-app
✓ App is starting...
....
✓ App is starting...
✓ App is started!
✓ Preparing source code for new app deployment.
✓ Downloading source code from /Workspace/Users/...
✓ Starting app with command: flask --app hello run --debug
✓ App started successfully
You can access the app at <app-url>
```
## Tests
Added unit and config tests + manual test.
```
--- PASS: TestAccDeployBundleWithApp (404.59s)
PASS
coverage: 36.8% of statements in ./...
ok github.com/databricks/cli/internal/bundle 405.035s coverage: 36.8% of statements in ./...
```
## Changes
This updates `mode: production` to allow `root_path` to indicate
uniqueness. Historically, we required `run_as` for this, which isn't
actually very effective for that purpose. `run_as` also had the problem
that it doesn't work for pipelines.
This is a cherry-pick from https://github.com/databricks/cli/pull/1387
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
## Changes
This is useful for tracking telemetry associated with the templates and
can later be useful for functional use cases as well. MLOps Stacks does
the same here: https://github.com/databricks/mlops-stacks/pull/185
## Tests
Existing tests.
It's easier to remember and type, and it's also validated and part of
the help output:
```
% go test ./acceptance -updat 2>&1 | grep updat
flag provided but not defined: -updat
-update
```
## Changes
This used to work because the permission bits for built-in templates
were hardcoded to 0644 for files and 0755 for directories.
As of #1912 (and the PRs it depends on), built-in templates are no
longer pre-materialized to a temporary directory; they are read directly
from the embedded filesystem. This filesystem returns 0444 as the
permission bits for the files it contains, and these bits are carried
over to the destination filesystem.
This change updates template materialization to always set the owner's
write bit. It doesn't really make sense to write read-only files and
expect users to work with these files in a VCS (note: Git only stores
the executable bit).
The regression shipped as part of v0.235.0 and will be fixed as of
v0.238.0.
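The essence of the fix, as a sketch:
```
package template

import (
	"io/fs"
	"os"
)

// writeFile materializes a template file, always adding the owner write
// bit so files copied from the read-only embedded FS (0444) end up
// writable in the destination.
func writeFile(dst string, data []byte, perm fs.FileMode) error {
	return os.WriteFile(dst, data, perm|0o200)
}
```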
## Tests
Unit tests.