Commit Graph

247 Commits

Author SHA1 Message Date
Andrew Nester ffdbec87cc
Added support for pip options in environment dependencies (#1842)
## Changes
Added support for specifying pip options such as `--extra-index-url` and
etc. in environments dependencies

```
environments:
  - environment_key: Default
    spec:
      client: "1"
      dependencies:
        - --extra-index-url https://foo@bar.com/packages/smth somepackage
        - json==1.0.0
```

## Tests
Added regression test

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-10-21 11:45:39 +00:00
Lennart Kats (databricks) c5043c3d9d
Add `bundle summary` to display URLs for deployed resources (#1731)
## Changes

Adds a textual output to the `databricks bundle summary` command, which
includes URLs of deployed resources.

Example usage:

```
$ databricks bundle summary
Name: my_pipeline
Target: dev
Workspace:
  Host: https://domain.databricks.com
  User: user@databricks.com
  Path: /Users/user@databricks.com/.bundle/my_pipeline/dev
Resources:
  Jobs:
    my_project_job:
      Name: [dev lennart] my_project_job
      URL:  https://domain.databricks.com/jobs/206899209187287?o=6051921418418893
  Pipelines:
    my_project_pipeline:
      Name: [dev lennart] my_project_pipeline
      URL:  https://domain.databricks.com/pipelines/3f849fd5-ba7d-47fa-a34c-c6bf034b4f58?o=6051921418418893
```

Notes:
* The top headers of the output are the same as those from the existing
`bundle validate` command
* URLs are colored light blue in the output
* For resources that haven't been deployed yet, we show `(not deployed)`
in place of the URL

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
2024-10-18 06:45:47 +00:00
Pieter Noordhuis 3270afaff4
Move utility functions dealing with IAM to libs/iamutil (#1820)
## Changes

The two functions `GetShortUserName` and `IsServicePrincipal` are
unrelated to auth or the purpose of the auth package. This change moves
them into their own package and updates `IsServicePrincipal` to take an
`*iam.User` argument instead of a string username.

## Tests

Tests pass.
2024-10-10 13:02:25 +00:00
Lennart Kats (databricks) e885794722
Show actionable errors for collaborative deployment scenarios (#1386)
## Changes

This adds diagnostics for collaborative (production) deployment
scenarios, including:

- Bob deploys a bundle that is normally deployed by Alice, but this
fails because Bob can't write to `/Users/Alice/.bundle`.
- Charlie deploys a bundle that is normally deployed by Alice, but this
fails because he can't create a new pipeline where Alice would be the
owner.
- Alice deploys a bundle where she didn't list herself as one of the
CAN_MANAGE users in permissions. That can work, but is probably a
mistake.

## Tests

Unit tests, manual testing.
2024-10-10 11:18:23 +00:00
shreyas-goenka bca9c2eda4
Add validation for files with a `.(resource-name).yml` extension (#1780)
## Changes
We want to encourage a pattern of specifying only a single resource in a
YAML file when the `.(resource-type).yml` extension is used (for
example, `.job.yml`). This convention could allow us to bijectively map
a resource YAML file to its corresponding resource in the Databricks
workspace.

This PR:
1. Emits a recommendation diagnostic when we detect this convention is
being violated. We can promote this to a warning when we want to
encourage this pattern more strongly.
2. Visualises the recommendation diagnostics in the `bundle validate`
command.

**NOTE:** While this PR also shows the recommendation for `.yaml` files,
we do not encourage users to use this extension. We only support it here
since it's part of the YAML standard and some existing users might
already be using `.yaml`.

## Tests
Unit tests and manually. Here's what an example output looks like:
```
Recommendation: define a single job in a file with the .job.yml extension.
  at resources.jobs.bar
     resources.jobs.foo
  in foo.job.yml:13:7
     foo.job.yml:5:7

The following resources are defined or configured in this file:
  - bar (job)
  - foo (job)
```

---------

Co-authored-by: Lennart Kats (databricks) <lennart.kats@databricks.com>
2024-10-07 09:16:20 +00:00
Andrew Nester a8cff48c0b
Always prepend bundle remote paths with /Workspace (#1724)
## Changes
Due to platform changes, all libraries, notebooks and etc. paths used in
Databricks must be started with either /Workspace or /Volumes prefix.

This PR makes sure that all bundle paths are correctly prefixed.

Note: this change is a breaking change if user previously configured and
used `/Workspace/Workspace` folder in their workspace file system or
having `/Workspace/${workspace.root_path}...` pattern configured
anywhere in their bundle config

Fixes: #1751

AI:
- [x] Scan DABs config and error out on
`/Workspace/${workspace.root_path}...` pattern usage

## Tests
Added unit tests

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-10-02 15:34:00 +00:00
Pieter Noordhuis 80d55f4540
Add resource path field to bundle workspace configuration (#1800)
## Changes

Default workspace path for resources with a presence in the workspace
tree.

Note: this path is **not** created automatically (yet). We need this
only for dashboards (so far), so can take care of creation if one or
more dashboards are part of a deployment. This saves an API call for
deployments where this is not necessary.

## Tests

Expanded existing tests.
2024-10-02 13:55:40 +00:00
Lennart Kats (databricks) da3b4f7c72
Fix panic in `apply_presets.go` (#1796)
## Changes

This fixes the user-reported panic in `apply_presets.go`. I'm still
unsure how to reproduce this, since the CLI just reports `ob broken_job
is not defined` when I try to use `bundle deploy` with an empty job.
That said — we may as well be defensive here and I see we have lots of
checks for empty job/cluster/etc. settings scattered throughout our code
base so at least we're somewhat consistent.
2024-09-29 14:08:10 +00:00
Pieter Noordhuis 1d1aa0a416
Rename `RootPath` -> `BundleRootPath` (#1792)
## Changes

After introducing the `SyncRootPath` field on the bundle (#1694), the
previous `RootPath` became ambiguous. Does it mean the bundle root path
or the sync root path? This PR renames to field to `BundleRootPath` to
remove the ambiguity.

## Tests

n/a

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-09-27 10:03:05 +00:00
Pieter Noordhuis 56cd96cb93
Move trampoline code into trampoline package (#1793)
## Changes

Doing this to make room for PyDABs under `bundle/python`.

## Tests

n/a
2024-09-27 09:32:54 +00:00
Pieter Noordhuis a1dca56abf
Trim trailing whitespace (#1794)
## Changes

Trailing whitespace is trimmed per the VS Code settings for this
repository.

## Tests

n/a
2024-09-27 09:30:39 +00:00
Andrew Nester 66f2ba64a8
Simplified isFullVariableOverrideDef implementation (#1791)
## Changes
Simplified isFullVariableOverrideDef implementation

Follow up on https://github.com/databricks/cli/pull/1787
2024-09-26 12:55:07 +00:00
shreyas-goenka 495040e4cd
Modify SetLocation test utility to take full locations as argument (#1788)
I plan to use this in https://github.com/databricks/cli/pull/1780, to
set the line and column numbers as well for the locations.

gopatch file used:
```
@@
var x expression
var y expression
var z expression
@@
-bundletest.SetLocation(x, y, z)
+bundletest.SetLocation(x, y, []dyn.Location{{File: z}})
```
2024-09-25 16:13:48 +00:00
Andrew Nester b3a3071086
Fixed full variable override detection (#1787)
## Changes
Fixes #1786 

## Tests
All valid override combinations are added as test cases
2024-09-25 12:35:16 +00:00
Gleb Kanterov 3d9decdda9
Add JobTaskClusterSpec validate mutator (#1784)
## Changes
Add JobTaskClusterSpec validate mutator. It catches the case when tasks
don't which cluster to use.

For example, we can get this error with minor modifications to
`default-python` template:

```yaml
      tasks:
        - task_key: python_file_task
          spark_python_task:
            python_file: ../src/my_project_10/main.py
```

```
 % databricks bundle validate
Error: Missing required cluster or environment settings
  at resources.jobs.my_project_10_job.tasks[0]
  in resources/my_project_10_job.yml:17:11

Task "print_github_stars" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.
```

We implicitly rely on "one of" validation, which does not exist. Many
bundle fields can't co-exist, for instance, specifying:
`JobTask.{existing_cluster_id,job_cluster_key}`, `Library.{whl,pypi}`,
`JobTask.{notebook_task,python_wheel_task}`, etc.

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
2024-09-25 11:30:14 +00:00
Gleb Kanterov 490259a14a
Refactor jobs path translation (#1782)
## Changes
Extract package for other modules to transform different kinds of paths
in job resources.

## Tests
Unit tests
2024-09-24 13:51:54 +00:00
Andrew Nester 56ed9bebf3
Added support for creating all-purpose clusters (#1698)
## Changes
Added support for creating all-purpose clusters

Example of configuration

```
bundle:
  name: clusters

resources:
  clusters:
    test_cluster:
      cluster_name: "Test Cluster"
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"

  jobs:
    test_job:
      name: "Test Job"
      tasks:
        - task_key: test_task
          existing_cluster_id: ${resources.clusters.test_cluster.id}
          notebook_task:
            notebook_path: "./src/test.py"

targets:
    development:
      mode: development
      compute_id: ${resources.clusters.test_cluster.id}

```

## Tests
Added unit, config and E2E tests
2024-09-23 10:42:34 +00:00
Andrew Nester bcab6ca37b
Fixed detecting full syntax variable override which includes type field (#1775)
## Changes
Fixes #1773 

## Tests
Confirmed manually
2024-09-18 10:23:07 +00:00
Lennart Kats (databricks) e220f9ddd6
Use the friendly name of service principals when shortening their name (#1770)
## Summary

Use the friendly name of service principals when shortening their name.

This change is helpful for the prefix in development mode. Instead of
adding a prefix like `[dev 1706906c-c0a2-4c25-9f57-3a7aa3cb8123]`, we'll
prefix like `[dev my_principal]`.
2024-09-16 18:35:07 +00:00
Andrew Nester 66307134c1
Fixed generated YAML missing 'default' for empty values (#1765)
## Changes
Fixed generated YAML missing 'default' for empty values

## Tests
Added unit test
2024-09-11 09:49:58 +00:00
shreyas-goenka 5d2c0e3885
Alias variables block in the `Target` struct (#1748)
## Changes
This PR aliases and overrides the schema associated with the variables
block in `target` to allow for directly specifying a variable value in
the JSON schema (without an levels of nesting). This is needed because
this direct value is resolved by dynamically parsing the configuration
tree.

ca6332a5a4/bundle/config/root.go (L424)

## Tests
Existing unit tests.
2024-09-10 14:49:34 +00:00
Andrew Nester 02e83877f4
Added listing cluster filtering for cluster lookups (#1754)
## Changes
We added a custom resolver for the cluster to add filtering for the
cluster source when we list all clusters.

Without the filtering listing could take a very long time (5-10 mins)
which leads to lookup timeouts.

## Tests
Existing unit tests passing
2024-09-06 11:34:57 +00:00
Pieter Noordhuis ceefa80d72
Pass copy of `dyn.Path` to callback function (#1747)
## Changes

Some call sites hold on to the `dyn.Path` provided to them by the
callback. It must therefore never be mutated after the callback returns,
or these mutations leak out into unknown scope.

This change means it is no longer possible for this failure mode to
happen.

## Tests

Unit test.
2024-09-05 11:05:16 +00:00
Andrew Nester 72030844c5
Fixed variable override in target with full variable syntax (#1749)
## Changes
This PR makes sure that both of this override syntax for variables work
correctly
```
targets:
  dev:
    variables:
      cluster1:
        spark_version: "14.2.x-scala2.11"
        node_type_id: "Standard_DS3_v2"
        num_workers: 4
        spark_conf:
          spark.speculation: false
          spark.databricks.delta.retentionDurationCheck.enabled: false
      cluster2:
        default:
          spark_version: "14.2.x-scala2.11"
          node_type_id: "Standard_DS3_v2"
          num_workers: 4
          spark_conf:
            spark.speculation: false
            spark.databricks.delta.retentionDurationCheck.enabled: false
```
## Tests
Added regression test

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-09-04 17:16:40 +00:00
Andrew Nester ca6332a5a4
Fixed complex variables are not being correctly merged from include files (#1746)
## Changes
Fixes an `Error: no value assigned to required variable <variable>.`
when the main complex variable definition is defined in one file but
target override is defined in separate file which is included in the
main one.

## Tests
Added regression test
2024-09-04 11:24:55 +00:00
Gleb Kanterov ed448815b4
PythonMutator: explain missing package error (#1736)
## Changes
Explain the error when the `databricks-pydabs` package is not installed
or the Python environment isn't correctly activated.

Example output:

```
Error: python mutator process failed: ".venv/bin/python3 -m databricks.bundles.build --phase load --input .../input.json --output .../output.json --diagnostics .../diagnostics.json: exit status 1", use --debug to enable logging

.../.venv/bin/python3: Error while finding module specification for 'databricks.bundles.build' (ModuleNotFoundError: No module named 'databricks')

Explanation: 'databricks-pydabs' library is not installed in the Python environment.

If using Python wheels, ensure that 'databricks-pydabs' is included in the dependencies, 
and that the wheel is installed in the Python environment:

  $ .venv/bin/pip install -e .

If using a virtual environment, ensure it is specified as the venv_path property in databricks.yml, 
or activate the environment before running CLI commands:

  experimental:
    pydabs:
      venv_path: .venv
```

## Tests
Unit tests
2024-09-02 09:49:30 +00:00
Andrew Nester 582558cac2
Do not suppress normalisation diagnostics for resolving variables (#1740)
## Changes

Tested on the following bundle configuration

```
bundle:
  name: clusters
  mode: development

variables:
  webhook_notifications:
    description: Webhook URL for notifications
    type: complex
    default:
      on_failure:
        id: 6a6c04c1-389c-4534-95af-b68b62a9dbe6

resources:
  jobs:
    test_job:
      name: "Andrew Nester Test Job"
      tasks:
        - task_key: test_task
          notebook_task:
            notebook_path: "./src/test.py"
          new_cluster:
            num_workers: 2
            node_type_id: "i3.xlarge"
            autoscale:
              min_workers: 2
              max_workers: 7
            spark_version: "12.2.x-scala2.12"
            spark_conf:
              "spark.executor.memory": "2g"
      webhook_notifications: ${var.webhook_notifications}

```

bundle validate output is below

```
andrew.nester@HFW9Y94129 wheel % databricks bundle validate
Warning: expected sequence, found map
  at resources.jobs.test_job.webhook_notifications.on_failure
  in bundle.yml:11:9

Name: clusters
Target: default
Workspace:
  User: andrew.nester@databricks.com
  Path: /Users/andrew.nester@databricks.com/.bundle/clusters/default
```

**Note** that error correctly points to the variable
2024-09-02 09:17:18 +00:00
shreyas-goenka 5d9910c8e0
Make lock optional in the JSON schema (#1738)
Fixes https://github.com/databricks/cli/issues/1561
2024-09-02 08:39:08 +00:00
Gleb Kanterov 70ce802518
PythonMutator: preserve normalize diagnostics (#1735)
## Changes
Preserve diagnostics if there are any errors or warnings when
PythonMutator normalizes output. If anything goes wrong during
conversion, diagnostics contain the relevant location and path.

## Tests
Unit tests
2024-08-30 13:29:00 +00:00
Lennart Kats (databricks) 85459c1963
Improve error handling for /Volumes paths in mode: development (#1716)
## Changes
* Provide a more helpful error when using an artifact_path based on
/Volumes
* Allow the use of short_names in /Volumes paths

## Example cases

Example of a valid /Volumes artifact_path:
* `artifact_path:
/Volumes/catalog/schema/${workspace.current_user.short_name}/libs`

Example of an invalid /Volumes path (when using `mode: development`):
* `artifact_path: /Volumes/catalog/schema/libs`
* Resulting error: `artifact_path should contain the current username or
${workspace.current_user.short_name} to ensure uniqueness when using
'mode: development'`
2024-08-28 12:14:19 +00:00
Lennart Kats (databricks) 84b47745e4
Ignore CLI version check on development builds of the CLI (#1714)
## Changes

This changes makes sure we ignore CLI version check on development
builds of the CLI.

Before:

```
$ cat databricks.yml | grep cli_version
  databricks_cli_version: ">= 0.223.1"
$ cli bundle deploy
Error: Databricks CLI version constraint not satisfied. Required: >= 0.223.1, current: 0.0.0-dev+06b169284737
```

after

```
...
$ cli bundle deploy
...
Warning: Ignoring Databricks CLI version constraint for development build. Required: >= 0.223.1, current: 0.0.0-dev+d52d6f08fcd5
```


## Tests
<!-- How is this tested? -->
2024-08-23 10:13:21 +00:00
Pieter Noordhuis 6e8cd835a3
Add paths field to bundle sync configuration (#1694)
## Changes

This field allows a user to configure paths to synchronize to the
workspace.

Allowed values are relative paths to files and directories anchored at
the directory where the field is set. If one or more values traverse up
the directory tree (to an ancestor of the bundle root directory), the
CLI will dynamically determine the root path to use to ensure that the
file tree structure remains intact.

For example, given a `databricks.yml` in `my_bundle` that includes:

```yaml
sync:
  paths:
    - ../common
    - .
```

Then upon synchronization, the workspace will look like:
```
.
├── common
│   └── lib.py
└── my_bundle
    ├── databricks.yml
    └── notebook.py
```

If not set behavior remains identical.

## Tests

* Newly added unit tests for the mutators and under `bundle/tests`.
* Manually confirmed a bundle without this configuration works the same.
* Manually confirmed a bundle with this configuration works.
2024-08-21 15:33:25 +00:00
shreyas-goenka f5df211320
Fix prefix preset used for UC schemas (#1704)
## Changes
In https://github.com/databricks/cli/pull/1490 we regressed and started
using the development mode prefix for UC schemas regardless of the mode
of the bundle target.

This PR fixes the regression and adds a regression test

## Tests
Failing integration tests pass now.
2024-08-21 12:53:54 +00:00
Witold Czaplewski 192f33bb13
[DAB] Add support for requirements libraries in Job Tasks (#1543)
## Changes
While experimenting with DAB I discovered that requirements libraries
are being ignored.

One thing worth mentioning is that `bundle validate` runs successfully,
but `bundle deploy` fails. This PR only covers the second part.


## Tests
<!-- How is this tested? -->
Added a unit test
2024-08-21 10:03:56 +00:00
Gleb Kanterov 44902fa350
Make `pydabs/venv_path` optional (#1687)
## Changes
Make `pydabs/venv_path` optional. When not specified, CLI detects the
Python interpreter using `python.DetectExecutable`, the same way as for
`artifacts`. `python.DetectExecutable` works correctly if a virtual
environment is activated or `python3` is available on PATH through other
means.

Extract the venv detection code from PyDABs into `libs/python/detect`.
This code will be used when we implement the `python/venv_path` section
in `databricks.yml`.

## Tests
Unit tests and manually

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
2024-08-20 13:26:57 +00:00
shreyas-goenka 242d4b51ed
Report all empty resources present in error diagnostic (#1685)
## Changes
This PR addressed post-merge feedback from
https://github.com/databricks/cli/pull/1673.

## Tests
Unit tests, and manually.
```
Error: experiment undefined-experiment is not defined
  at resources.experiments.undefined-experiment
  in databricks.yml:11:26

Error: job undefined-job is not defined
  at resources.jobs.undefined-job
  in databricks.yml:6:19

Error: pipeline undefined-pipeline is not defined
  at resources.pipelines.undefined-pipeline
  in databricks.yml:14:24

Name: undefined-job
Target: default

Found 3 errors
```
2024-08-20 00:22:00 +00:00
Lennart Kats (databricks) 78d0ac5c6a
Add configurable presets for name prefixes, tags, etc. (#1490)
## Changes

This adds configurable transformations based on the transformations
currently seen in `mode: development`.

Example databricks.yml showcasing how some transformations:

```
bundle:
  name: my_bundle

targets:
  dev:
    presets:
      prefix: "myprefix_"          # prefix all resource names with myprefix_
      pipelines_development: true  # set development to true by default for pipelines
      trigger_pause_status: PAUSED # set pause_status to PAUSED by default for all triggers and schedules
      jobs_max_concurrent_runs: 10 # set max_concurrent runs to 10 by default for all jobs
      tags:
        dev: true
```

## Tests

* Existing process_target_mode tests that were adapted to use this new
code
* Unit tests specific for the new mutator
* Unit tests for config loading and merging
* Manual e2e testing
2024-08-19 18:18:50 +00:00
Lennart Kats (databricks) 07627023f5
Pause continuous pipelines when 'mode: development' is used (#1590)
## Changes

This makes it so that the pipelines `continuous` property is set to
false by default when using `mode: development`.
2024-08-19 16:27:57 +00:00
Pieter Noordhuis 7de7583b37
Make fileset take optional list of paths to list (#1684)
## Changes

Before this change, the fileset library would take a single root path
and list all files in it. To support an allowlist of paths to list (much
like a Git `pathspec` without patterns; see [pathspec](pathspec)), this
change introduces an optional argument to `fileset.New` where the caller
can specify paths to list. If not specified, this argument defaults to
list `.` (i.e. list all files in the root).

The motivation for this change is that we wish to expose this pattern in
bundles. Users should be able to specify which paths to synchronize
instead of always only synchronizing the bundle root directory.

[pathspec]:
https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec

## Tests

New and existing unit tests.
2024-08-19 15:15:14 +00:00
Gleb Kanterov ab4e8099fb
Add `import` option for PyDABs (#1693)
## Changes
Add 'import' option for PyDABs

## Tests
Manually
2024-08-19 13:24:56 +00:00
Andrew Nester 54799a1918
Upgrade Go SDK to 0.44.0 (#1679)
## Changes
Upgrade Go SDK to 0.44.0

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-08-15 13:23:07 +00:00
Andrew Nester 48ff18e5fc
Upload local libraries even if they don't have artifact defined (#1664)
## Changes
Previously for all the libraries referenced in configuration DABs made
sure that there is corresponding artifact section.
But this is not really necessary and flexible, because local libraries
might be built outside of dabs context.
It also created difficult to follow logic in code where we back
referenced libraries to artifacts which was difficult to fllow


This PR does 3 things:
1. Allows all local libraries referenced in DABs config to be uploaded
to remote
2. Simplifies upload and glob references expand logic by doing this in
single place
3. Speed things up by uploading library only once and doing this in
parallel

## Tests
Added unit + integration tests + made sure that change is backward
compatible (no changes in existing tests)

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-08-14 09:03:44 +00:00
shreyas-goenka 7ae80de351
Stop tracking file path locations in bundle resources (#1673)
## Changes
Since locations are already tracked in the dynamic value tree, we no
longer need to track it at the resource/artifact level. This PR:
1. Removes use of `paths.Paths`. Uses dyn.Location instead.
2. Refactors the validation of resources not being empty valued to be
generic across all resource types.
  
## Tests
Existing unit tests.
2024-08-13 12:50:15 +00:00
Pieter Noordhuis f3ffded3bf
Merge job parameters based on their name (#1659)
## Changes

This change enables overriding the default value of job parameters in
target overrides.

This is the same approach we already take for job clusters and job
tasks.

Closes #1620.

## Tests

Mutator unit tests and lightweight end-to-end tests.
2024-08-06 16:12:18 +00:00
Andrew Nester 1fb8e324d5
Added test for negation pattern in sync include exclude section (#1637)
## Changes
Added test for negation pattern in sync include exclude section
2024-07-31 13:42:23 +00:00
shreyas-goenka 89c0af5bdc
Add resource for UC schemas to DABs (#1413)
## Changes
This PR adds support for UC Schemas to DABs. This allows users to define
schemas for tables and other assets their pipelines/workflows create as
part of the DAB, thus managing the life-cycle in the DAB.

The first version has a couple of intentional limitations:
1. The owner of the schema will be the deployment user. Changing the
owner of the schema is not allowed (yet). `run_as` will not be
restricted for DABs containing UC schemas. Let's limit the scope of
run_as to the compute identity used instead of ownership of data assets
like UC schemas.
2. API fields that are present in the update API but not the create API.
For example: enabling predictive optimization is not supported in the
create schema API and thus is not available in DABs at the moment.

## Tests
Manually and integration test. Manually verified the following work:
1. Development mode adds a "dev_" prefix.
2. Modified status is correctly computed in the `bundle summary`
command.
3. Grants work as expected, for assigning privileges.
4. Variable interpolation works for the schema ID.
2024-07-31 12:16:28 +00:00
shreyas-goenka a52b188e99
Use dynamic walking to validate unique resource keys (#1614)
## Changes
This PR:
1. Uses dynamic walking (via the `dyn.MapByPattern` func) to validate no
two resources have the same resource key. The allows us to remove this
validation at merge time.
2. Modifies `dyn.Mapping` to always return a sorted slice of pairs. This
makes traversal functions like `dyn.Walk` or `dyn.MapByPattern`
deterministic.

## Tests
Unit tests. Also manually.
2024-07-29 13:04:02 +00:00
shreyas-goenka 37b9df96e6
Support multiple paths for diagnostics (#1616)
## Changes
Some diagnostics can have multiple paths associated with them. For
instance, ensuring that unique resource keys are used across all
resources. This PR extends `diag.Diagnostic` to accept multiple paths.

This PR is symmetrical to
https://github.com/databricks/cli/pull/1610/files

## Tests
Unit tests
2024-07-25 15:16:27 +00:00
shreyas-goenka 4bf88b4209
Support multiple locations for diagnostics (#1610)
## Changes
This PR changes `diag.Diagnostics` to allow including multiple locations
associated with the diagnostic message. The diagnostics that now return
multiple locations with this PR are:
1. Warning for unknown keys in config.
2. Use of experimental.run_as
3. Accidental sync.exludes that exclude all files.

## Tests
Existing unit tests pass. New unit test case to assert on error message
when multiple locations are included.

Example output:
```
➜  bundle-playground-2 ~/cli2/cli/cli bundle validate              
Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC.
  at experimental.use_legacy_run_as
  in resources.yml:10:22
     databricks.yml:13:22

Name: fix run_if
Target: default
Workspace:
  User: shreyas.goenka@databricks.com
  Path: /Users/shreyas.goenka@databricks.com/.bundle/fix run_if/default

Found 1 warning
```
2024-07-23 17:20:11 +00:00
Pieter Noordhuis 6953a5d5af
Add read-only mode for extension aware workspace filer (#1609)
## Changes

By default, construct a read/write instance. If constructed in read-only
mode, the underlying filer is wrapped in a readahead cache.

## Tests

* Filer integration tests pass.
* Manual test that caching is enabled when running on WSFS.
2024-07-18 14:17:42 +00:00