## Changes
The two functions `GetShortUserName` and `IsServicePrincipal` are
unrelated to auth or the purpose of the auth package. This change moves
them into their own package and updates `IsServicePrincipal` to take an
`*iam.User` argument instead of a string username.
## Tests
Tests pass.
## Changes
This adds diagnostics for collaborative (production) deployment
scenarios, including:
- Bob deploys a bundle that is normally deployed by Alice, but this
fails because Bob can't write to `/Users/Alice/.bundle`.
- Charlie deploys a bundle that is normally deployed by Alice, but this
fails because he can't create a new pipeline where Alice would be the
owner.
- Alice deploys a bundle where she didn't list herself as one of the
CAN_MANAGE users in permissions. That can work, but is probably a
mistake.
## Tests
Unit tests, manual testing.
## Changes
We want to encourage a pattern of specifying only a single resource in a
YAML file when the `.(resource-type).yml` extension is used (for
example, `.job.yml`). This convention could allow us to bijectively map
a resource YAML file to its corresponding resource in the Databricks
workspace.
This PR:
1. Emits a recommendation diagnostic when we detect this convention is
being violated. We can promote this to a warning when we want to
encourage this pattern more strongly.
2. Visualises the recommendation diagnostics in the `bundle validate`
command.
**NOTE:** While this PR also shows the recommendation for `.yaml` files,
we do not encourage users to use this extension. We only support it here
since it's part of the YAML standard and some existing users might
already be using `.yaml`.
## Tests
Unit tests and manually. Here's what an example output looks like:
```
Recommendation: define a single job in a file with the .job.yml extension.
at resources.jobs.bar
resources.jobs.foo
in foo.job.yml:13:7
foo.job.yml:5:7
The following resources are defined or configured in this file:
- bar (job)
- foo (job)
```
---------
Co-authored-by: Lennart Kats (databricks) <lennart.kats@databricks.com>
## Changes
Due to platform changes, all libraries, notebooks and etc. paths used in
Databricks must be started with either /Workspace or /Volumes prefix.
This PR makes sure that all bundle paths are correctly prefixed.
Note: this change is a breaking change if user previously configured and
used `/Workspace/Workspace` folder in their workspace file system or
having `/Workspace/${workspace.root_path}...` pattern configured
anywhere in their bundle config
Fixes: #1751
AI:
- [x] Scan DABs config and error out on
`/Workspace/${workspace.root_path}...` pattern usage
## Tests
Added unit tests
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Default workspace path for resources with a presence in the workspace
tree.
Note: this path is **not** created automatically (yet). We need this
only for dashboards (so far), so can take care of creation if one or
more dashboards are part of a deployment. This saves an API call for
deployments where this is not necessary.
## Tests
Expanded existing tests.
## Changes
This fixes the user-reported panic in `apply_presets.go`. I'm still
unsure how to reproduce this, since the CLI just reports `ob broken_job
is not defined` when I try to use `bundle deploy` with an empty job.
That said — we may as well be defensive here and I see we have lots of
checks for empty job/cluster/etc. settings scattered throughout our code
base so at least we're somewhat consistent.
## Changes
After introducing the `SyncRootPath` field on the bundle (#1694), the
previous `RootPath` became ambiguous. Does it mean the bundle root path
or the sync root path? This PR renames to field to `BundleRootPath` to
remove the ambiguity.
## Tests
n/a
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
I plan to use this in https://github.com/databricks/cli/pull/1780, to
set the line and column numbers as well for the locations.
gopatch file used:
```
@@
var x expression
var y expression
var z expression
@@
-bundletest.SetLocation(x, y, z)
+bundletest.SetLocation(x, y, []dyn.Location{{File: z}})
```
## Changes
Add JobTaskClusterSpec validate mutator. It catches the case when tasks
don't which cluster to use.
For example, we can get this error with minor modifications to
`default-python` template:
```yaml
tasks:
- task_key: python_file_task
spark_python_task:
python_file: ../src/my_project_10/main.py
```
```
% databricks bundle validate
Error: Missing required cluster or environment settings
at resources.jobs.my_project_10_job.tasks[0]
in resources/my_project_10_job.yml:17:11
Task "print_github_stars" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.
```
We implicitly rely on "one of" validation, which does not exist. Many
bundle fields can't co-exist, for instance, specifying:
`JobTask.{existing_cluster_id,job_cluster_key}`, `Library.{whl,pypi}`,
`JobTask.{notebook_task,python_wheel_task}`, etc.
## Tests
Unit tests
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
## Summary
Use the friendly name of service principals when shortening their name.
This change is helpful for the prefix in development mode. Instead of
adding a prefix like `[dev 1706906c-c0a2-4c25-9f57-3a7aa3cb8123]`, we'll
prefix like `[dev my_principal]`.
## Changes
This PR aliases and overrides the schema associated with the variables
block in `target` to allow for directly specifying a variable value in
the JSON schema (without an levels of nesting). This is needed because
this direct value is resolved by dynamically parsing the configuration
tree.
ca6332a5a4/bundle/config/root.go (L424)
## Tests
Existing unit tests.
## Changes
We added a custom resolver for the cluster to add filtering for the
cluster source when we list all clusters.
Without the filtering listing could take a very long time (5-10 mins)
which leads to lookup timeouts.
## Tests
Existing unit tests passing
## Changes
Some call sites hold on to the `dyn.Path` provided to them by the
callback. It must therefore never be mutated after the callback returns,
or these mutations leak out into unknown scope.
This change means it is no longer possible for this failure mode to
happen.
## Tests
Unit test.
## Changes
Fixes an `Error: no value assigned to required variable <variable>.`
when the main complex variable definition is defined in one file but
target override is defined in separate file which is included in the
main one.
## Tests
Added regression test
## Changes
Explain the error when the `databricks-pydabs` package is not installed
or the Python environment isn't correctly activated.
Example output:
```
Error: python mutator process failed: ".venv/bin/python3 -m databricks.bundles.build --phase load --input .../input.json --output .../output.json --diagnostics .../diagnostics.json: exit status 1", use --debug to enable logging
.../.venv/bin/python3: Error while finding module specification for 'databricks.bundles.build' (ModuleNotFoundError: No module named 'databricks')
Explanation: 'databricks-pydabs' library is not installed in the Python environment.
If using Python wheels, ensure that 'databricks-pydabs' is included in the dependencies,
and that the wheel is installed in the Python environment:
$ .venv/bin/pip install -e .
If using a virtual environment, ensure it is specified as the venv_path property in databricks.yml,
or activate the environment before running CLI commands:
experimental:
pydabs:
venv_path: .venv
```
## Tests
Unit tests
## Changes
Preserve diagnostics if there are any errors or warnings when
PythonMutator normalizes output. If anything goes wrong during
conversion, diagnostics contain the relevant location and path.
## Tests
Unit tests
## Changes
* Provide a more helpful error when using an artifact_path based on
/Volumes
* Allow the use of short_names in /Volumes paths
## Example cases
Example of a valid /Volumes artifact_path:
* `artifact_path:
/Volumes/catalog/schema/${workspace.current_user.short_name}/libs`
Example of an invalid /Volumes path (when using `mode: development`):
* `artifact_path: /Volumes/catalog/schema/libs`
* Resulting error: `artifact_path should contain the current username or
${workspace.current_user.short_name} to ensure uniqueness when using
'mode: development'`
## Changes
This changes makes sure we ignore CLI version check on development
builds of the CLI.
Before:
```
$ cat databricks.yml | grep cli_version
databricks_cli_version: ">= 0.223.1"
$ cli bundle deploy
Error: Databricks CLI version constraint not satisfied. Required: >= 0.223.1, current: 0.0.0-dev+06b169284737
```
after
```
...
$ cli bundle deploy
...
Warning: Ignoring Databricks CLI version constraint for development build. Required: >= 0.223.1, current: 0.0.0-dev+d52d6f08fcd5
```
## Tests
<!-- How is this tested? -->
## Changes
This field allows a user to configure paths to synchronize to the
workspace.
Allowed values are relative paths to files and directories anchored at
the directory where the field is set. If one or more values traverse up
the directory tree (to an ancestor of the bundle root directory), the
CLI will dynamically determine the root path to use to ensure that the
file tree structure remains intact.
For example, given a `databricks.yml` in `my_bundle` that includes:
```yaml
sync:
paths:
- ../common
- .
```
Then upon synchronization, the workspace will look like:
```
.
├── common
│ └── lib.py
└── my_bundle
├── databricks.yml
└── notebook.py
```
If not set behavior remains identical.
## Tests
* Newly added unit tests for the mutators and under `bundle/tests`.
* Manually confirmed a bundle without this configuration works the same.
* Manually confirmed a bundle with this configuration works.
## Changes
In https://github.com/databricks/cli/pull/1490 we regressed and started
using the development mode prefix for UC schemas regardless of the mode
of the bundle target.
This PR fixes the regression and adds a regression test
## Tests
Failing integration tests pass now.
## Changes
While experimenting with DAB I discovered that requirements libraries
are being ignored.
One thing worth mentioning is that `bundle validate` runs successfully,
but `bundle deploy` fails. This PR only covers the second part.
## Tests
<!-- How is this tested? -->
Added a unit test
## Changes
Make `pydabs/venv_path` optional. When not specified, CLI detects the
Python interpreter using `python.DetectExecutable`, the same way as for
`artifacts`. `python.DetectExecutable` works correctly if a virtual
environment is activated or `python3` is available on PATH through other
means.
Extract the venv detection code from PyDABs into `libs/python/detect`.
This code will be used when we implement the `python/venv_path` section
in `databricks.yml`.
## Tests
Unit tests and manually
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
## Changes
This PR addressed post-merge feedback from
https://github.com/databricks/cli/pull/1673.
## Tests
Unit tests, and manually.
```
Error: experiment undefined-experiment is not defined
at resources.experiments.undefined-experiment
in databricks.yml:11:26
Error: job undefined-job is not defined
at resources.jobs.undefined-job
in databricks.yml:6:19
Error: pipeline undefined-pipeline is not defined
at resources.pipelines.undefined-pipeline
in databricks.yml:14:24
Name: undefined-job
Target: default
Found 3 errors
```
## Changes
This adds configurable transformations based on the transformations
currently seen in `mode: development`.
Example databricks.yml showcasing how some transformations:
```
bundle:
name: my_bundle
targets:
dev:
presets:
prefix: "myprefix_" # prefix all resource names with myprefix_
pipelines_development: true # set development to true by default for pipelines
trigger_pause_status: PAUSED # set pause_status to PAUSED by default for all triggers and schedules
jobs_max_concurrent_runs: 10 # set max_concurrent runs to 10 by default for all jobs
tags:
dev: true
```
## Tests
* Existing process_target_mode tests that were adapted to use this new
code
* Unit tests specific for the new mutator
* Unit tests for config loading and merging
* Manual e2e testing