Commit Graph

94 Commits

Author SHA1 Message Date
Gleb Kanterov aee3910f3d
PythonMutator: register product in user agent extra (#1533)
## Changes
Register user agent product following RFC 9110.

See
https://github.com/databricks/terraform-provider-databricks/pull/3520
for Terraform change.

## Tests
Unit tests
2024-07-01 07:46:37 +00:00
shreyas-goenka 068c7cfc2d
Return `dyn.InvalidValue` instead of `dyn.NilValue` when errors happen (#1514)
## Changes
With https://github.com/databricks/cli/pull/1507 and
https://github.com/databricks/cli/pull/1511 we are clarifying the
semantics associated with `dyn.InvalidValue` and `dyn.NilValue`. An
invalid value is the default zero value and is used to signals the
complete absence of the value.

A nil value, on the other hand, is a valid value for a piece of
configuration and signals explicitly setting a key to nil in the
configuration tree. In keeping with that theme, this PR returns
`dyn.InvalidValue` instead of `dyn.NilValue` at error sites. This change
is not expected to have a material change in behaviour and is being done
to set the right convention since we have well-defined semantics
associated with both `NilValue` and `InvalidValue`.

## Tests
Unit tests and integration tests pass. Also manually scanned the changes
and the associated call sites to verify the `NilValue` value itself was
not being relied upon.
2024-06-21 14:22:42 +00:00
Pieter Noordhuis 446a9d0c52
Properly deal with nil values in `convert.FromTyped` (#1511)
## Changes

When a configuration defines:
```yaml
run_as:
```

It first showed up as `run_as -> nil` in the dynamic configuration only
to later be converted to `run_as -> {}` while going through typed
conversion. We were using the presence of a key to initialize an empty
value. This is incorrect and it should have remained a nil value.

This conversion was happening in `convert.FromTyped` where any struct
always returned a map value. Instead, it should only return a map value
in any one of these cases: 1) the struct has elements, 2) the struct was
originally a map in the dynamic configuration, or 3) the struct was
initialized to a non-empty pointer value.

Stacked on top of #1516 and #1518.

## Tests

* Unit tests pass.
* Integration tests pass.
* Manually ran through bundle CRUD with a bundle without resources.
2024-06-21 13:43:21 +00:00
shreyas-goenka 44e3928d6a
Avoid multiple file tree traversals on bundle deploy (#1493)
## Changes
To run bundle deploy from DBR we use an abstraction over the workspace
import / export APIs to create a `filer.Filer` and abstract the file
system. Walking the file tree in such a filer is expensive and requires
multiple API calls. This PR remove the two duplicate file tree walks
that happen by caching the result.
2024-06-17 09:48:52 +00:00
Pieter Noordhuis c9b4f11947
Update error checks that use the `os` package to use `errors.Is` (#1461)
## Changes

From the [documentation](https://pkg.go.dev/os#IsNotExist) on the
functions in the `os` package:
> This function predates errors.Is. It only supports errors returned by
the os package.
> New code should use errors.Is(err, fs.ErrNotExist).

This issue surfaced while working on using a different `vfs.Path`
implementation that uses errors from the `fs` package. Calls to
`os.IsNotExist` didn't return true for errors that wrap
`fs.ErrNotExist`.

## Tests

n/a
2024-06-03 12:39:36 +00:00
Aravind Segu a33d0c8bf9
Add support for Lakehouse monitoring in bundles (#1307)
## Changes

This change adds support for Lakehouse monitoring in bundles.

The associated resource type name is "quality monitor".

## Testing

Unit tests.

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>
2024-05-31 09:42:25 +00:00
Pieter Noordhuis 424499ec1d
Abstract over filesystem interaction with libs/vfs (#1452)
## Changes

Introduce `libs/vfs` for an implementation of `fs.FS` and friends that
_includes_ the absolute path it is anchored to.

This is needed for:
1. Intercepting file operations to inject custom logic (e.g., logging,
access control).
2. Traversing directories to find specific leaf directories (e.g.,
`.git`).
3. Converting virtual paths to OS-native paths.

Options 2 and 3 are not possible with the standard `fs.FS` interface.
They are needed such that we can provide an instance to the sync package
and still detect the containing `.git` directory and convert paths to
native paths.

This change focuses on making the following packages use `vfs.Path`:
* libs/fileset
* libs/git
* libs/sync

All entries returned by `fileset.All` are now slash-separated. This has
2 consequences:
* The sync snapshot now always uses slash-separated paths
* We don't need to call `filepath.FromSlash` as much as we did

## Tests

* All unit tests pass
* All integration tests pass
* Manually confirmed that a deployment made on Windows by a previous
version of the CLI can be deployed by a new version of the CLI while
retaining the validity of the local sync snapshot as well as the remote
deployment state.
2024-05-30 07:41:50 +00:00
Ilia Babanov 2035516fde
Don't merge-in remote resources during depolyments (#1432)
## Changes
`check_running_resources` now pulls the remote state without modifying
the bundle state, similar to how it was doing before. This avoids a
problem when we fail to compute deployment metadata for a deleted job
(which we shouldn't do in the first place)

`deploy_then_remove_resources_test` now also deploys and deletes a job
(in addition to a pipeline), which catches the error that this PR fixes.

## Tests
Unit and integ tests
2024-05-15 12:41:44 +00:00
shreyas-goenka 507053ee50
Annotate DLT pipelines when deployed using DABs (#1410)
## Changes
This PR annotates any pipelines that were deployed using DABs to have
`deployment.kind` set to "BUNDLE", mirroring the annotation for Jobs
(similar PR for jobs FYI: https://github.com/databricks/cli/pull/880).

Breakglass UI is not yet available for pipelines, so this annotation
will just be used for revenue attribution ATM.

Note: The API field has been deployed in all regions including GovCloud.

## Tests
Unit tests and manually.

Manually verified that the kind and metadata_file_path are being set by
DABs, and are returned by a GET API to a pipeline deployed using a DAB.
Example:
```
    "deployment": {
      "kind":"BUNDLE",
      "metadata_file_path":"/Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default/state/metadata.json"
    },
```
2024-05-01 08:37:03 +00:00
Ilia Babanov 153141d3ea
Don't fail while parsing outdated terraform state (#1404)
`terraform show -json` (`terraform.Show()`) fails if the state file
contains resources with fields that non longer conform to the provider
schemas.

This can happen when you deploy a bundle with one version of the CLI,
then updated the CLI to a version that uses different databricks
terraform provider, and try to run `bundle run` or `bundle summary`.
Those commands don't recreate local terraform state (only `terraform
apply` or `plan` do) and terraform itself fails while parsing it.
[Terraform
docs](https://developer.hashicorp.com/terraform/language/state#format)
point out that it's best to use `terraform show` after successful
`apply` or `plan`.

Here we parse the state ourselves. The state file format is internal to
terraform, but it's more stable than our resource schemas. We only parse
a subset of fields from the state, and only update ID and ModifiedStatus
of bundle resources in the `terraform.Load` mutator.
2024-05-01 08:22:35 +00:00
Andrew Nester 1872aa12b3
Added support for job environments (#1379)
## Changes
The main changes are:
1. Don't link artifacts to libraries anymore and instead just iterate
over all jobs and tasks when uploading artifacts and update local path
to remote
2. Iterating over `jobs.environments` to check if there are any local
libraries and checking that they exist locally
3. Added tests to check environments are handled correctly

End-to-end test will follow up

## Tests
Added regression test, existing tests (including integration one) pass
2024-04-22 11:44:34 +00:00
Pieter Noordhuis cd675ded9a
Update `testutil` helpers to return path (#1383)
## Changes

I spotted a few call sites where the path of a test file was synthesized
multiple times. It is easier to capture the path as a variable and reuse
it.
2024-04-19 15:05:36 +00:00
shreyas-goenka e008c2bd8c
Cleanup remote file path on bundle destroy (#1374)
## Changes
The sync struct initialization would recreate the deleted `file_path`.
This PR moves to not initializing the sync object to delete the
snapshot, thus fixing the lingering `file_path` after `bundle destroy`.

## Tests
Manually, and a integration test to prevent regression.
2024-04-19 11:48:04 +00:00
shreyas-goenka 3c14204e98
Followup improvements to the Docker setup script (#1369)
## Changes
This PR:
1. Uses bash to run the setup.sh script instead of the native busybox sh
shipped with alpine.
2. Verifies the checksums of the installed terraform CLI binaries.
 
## Tests
Manually. The docker image successfully builds.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-04-18 20:52:11 +00:00
Andrew Nester 27f51c760f
Added validate mutator to surface additional bundle warnings (#1352)
## Changes
All these validators will return warnings as part of `bundle validate`
run

Added 2 mutators: 
1. To check that if tasks use job_cluster_key it is actually defined
2. To check if there are any files to sync as part of deployment

Also added `bundle.Parallel` to run them in parallel

To make sure mutators under bundle.Parallel do not mutate config,
introduced new `ReadOnlyMutator`, `ReadOnlyBundle` and `ReadOnlyConfig`.

Example 

```
databricks bundle validate -p deco-staging
Warning: unknown field: new_cluster
  at resources.jobs.my_job
  in bundle.yml:24:7

Warning: job_cluster_key high_cpu_workload_job_cluster is not defined
  at resources.jobs.my_job.tasks[0].job_cluster_key
  in bundle.yml:35:28

Warning: There are no files to sync, please check your your .gitignore and sync.exclude configuration
  at sync.exclude
  in bundle.yml:18:5

Name: test
Target: default
Workspace:
  Host: https://acme.databricks.com
  User: andrew.nester@databricks.com
  Path: /Users/andrew.nester@databricks.com/.bundle/test/default

Found 3 warnings
```

## Tests
Added unit tests
2024-04-18 15:13:16 +00:00
shreyas-goenka 5140a9a902
Add docker images for the CLI (#1353)
## Changes
This PR makes changes to support creating a docker image for the CLI
with the `terraform` dependencies built in. This is useful for customers
that operate in a network-restricted environment. Normally DABs makes
API calls to registry.terraform.io to setup the terraform dependencies,
with this setup the CLI/DABs will rely on the provider binaries bundled
in the docker image.

### Specifically this PR makes the following changes:
----------------
Modifies the CLI release workflow to publish the docker images in the
Github Container Registry. URL:
https://github.com/databricks/cli/pkgs/container/cli.

We use docker support in `goreleaser` to build and publish the images.
Using goreleaser ensures the CLI packaged in the docker image is the
same release artifact as the normal releases. For more information see:
1. https://goreleaser.com/cookbooks/multi-platform-docker-images
2. https://goreleaser.com/customization/docker/

Other choices made include:
1. Using `alpine` as the base image. The reason is `alpine` is a small
and lightweight linux distribution (~5MB) and an industry standard.
2. Not using [docker
manifest](https://docs.docker.com/reference/cli/docker/manifest) to
create a multi-arch build. This is because the functionality is still
experimental.

------------------

Make the `DATABRICKS_TF_VERSION` and `DATABRICKS_TF_PROVIDER_VERSION`
environment variables optional for using the terraform file mirror.
While it's not strictly necessary to make the docker image work, it's
the "right" behaviour and reduces complexity. The rationale is:
- These environment variables here are needed so the Databricks CLI does
not accidentally use the file mirror bundled with VSCode if it's
incompatible. This does not require the env vars to be mandatory.
context: https://github.com/databricks/cli/pull/1294
- This makes the `Dockerfile` and `setup.sh` simpler. We don't need an
[entrypoint.sh script to set the version environment
variables](https://medium.com/@leonardo5621_66451/learn-how-to-use-entrypoint-scripts-in-docker-images-fede010f172d).
This also makes using an interactive terminal with `docker run -it ...`
work out of the box.

 
## Tests


Tested manually. 

--------------------

To test the release pipeline I triggered a couple of dummy releases and
verified that the images are built successfully and uploaded to Github.
1. https://github.com/databricks/cli/pkgs/container/cli
3. workflow for release:
https://github.com/databricks/cli/actions/runs/8646106333

--------------------

I tested the docker container itself by setting up
[Charles](https://www.charlesproxy.com/) as an HTTP proxy and verifying
that no HTTP requests are made to `registry.terraform.io`

Before:
FYI, The Charles web proxy is hosted at localhost:8888.
```
shreyas.goenka@THW32HFW6T bundle-playground % rm -r .databricks 
shreyas.goenka@THW32HFW6T bundle-playground %  HTTP_PROXY="http://localhost:8888" HTTPS_PROXY="http://localhost:8888" cli bundle deploy
Uploading bundle files to /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default/files...
Deploying resources...
Updating deployment state...
Deployment complete!
```
<img width="1275" alt="Screenshot 2024-04-11 at 3 21 45 PM"
src="https://github.com/databricks/cli/assets/88374338/15f37324-afbd-47c0-a40e-330ab232656b">

After:
This time bundle deploy is run from inside the docker container. We use
`host.docker.internal` to map to localhost on the host machine, and -v
to mount the host file system as a volume.
```
shreyas.goenka@THW32HFW6T bundle-playground % docker run -v ~/projects/bundle-playground:/bundle -v ~/.databrickscfg:/root/.databrickscfg  -it --entrypoint /bin/sh -e HTTP_PROXY="http://host.docker.internal:8888" -e HTTPS_PROXY="http://host.docker.internal:8888" --network host ghcr.io/databricks/cli:latest-arm64           
/ # cd /bundle/
/bundle # rm -r .databricks/
/bundle # databricks bundle deploy
Uploading bundle files to /Users/shreyas.goenka@databricks.com/.bundle/bundle-playground/default/files...
Deploying resources...
Updating deployment state...
Deployment complete!
```

<img width="1275" alt="Screenshot 2024-04-11 at 3 22 54 PM"
src="https://github.com/databricks/cli/assets/88374338/2a8f097e-734b-4b3e-8075-c02e98a1b275">
2024-04-12 15:22:30 +00:00
Andrew Nester 77ff994d1b
Correctly transform libraries in for_each_task block (#1340)
## Changes
Now DABs correctly transforms and deploys libraries in for_each_task
block

```
tasks:
  - task_key: my_loop
     for_each_task:
       inputs: "[1,2,3]"
         task:
           task_key: my_loop_iteration 
           libraries:
             - pypi:
                  package: my_package
```

## Tests
Added regression test
2024-04-05 15:52:39 +00:00
dependabot[bot] f28a9d7107
Bump github.com/databricks/databricks-sdk-go from 0.36.0 to 0.37.0 (#1326)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/databricks/databricks-sdk-go&package-manager=go_modules&previous-version=0.36.0&new-version=0.37.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
2024-04-03 10:39:53 +00:00
Ilia Babanov 079c416f8d
Add `bundle debug terraform` command (#1294)
- Add `bundle debug terraform` command. It prints versions of the
Terraform and the Databricks Terraform provider. In the text mode it
also explains how to setup the CLI in environments with restricted
internet access.
- Use `DATABRICKS_TF_EXEC_PATH` env var to point Databricks CLI to the
Terraform binary. The CLI only uses it if `DATABRICKS_TF_VERSION`
matches the currently used terraform version.
- Use `DATABRICKS_TF_CLI_CONFIG_FILE` env var to point Terraform CLI
config that points to the filesystem mirror for the Databricks provider.
The CLI only uses it if `DATABRICKS_TF_PROVIDER_VERSION` matches the
currently used provider version.


Relevant PR on the VSCode extension side:
https://github.com/databricks/databricks-vscode/pull/1147

Example output of the `databricks bundle debug terraform`:
```
Terraform version: 1.5.5
Terraform URL: https://releases.hashicorp.com/terraform/1.5.5

Databricks Terraform Provider version: 1.38.0
Databricks Terraform Provider URL: https://github.com/databricks/terraform-provider-databricks/releases/tag/v1.38.0

Databricks CLI downloads its Terraform dependencies automatically.

If you run the CLI in an air-gapped environment, you can download the dependencies manually and set these environment variables:

  DATABRICKS_TF_VERSION=1.5.5
  DATABRICKS_TF_EXEC_PATH=/path/to/terraform/binary
  DATABRICKS_TF_PROVIDER_VERSION=1.38.0
  DATABRICKS_TF_CLI_CONFIG_FILE=/path/to/terraform/cli/config.tfrc

Here is an example *.tfrc configuration file:

  disable_checkpoint = true
  provider_installation {
    filesystem_mirror {
      path = "/path/to/a/folder/with/databricks/terraform/provider"
    }
  }

The filesystem mirror path should point to the folder with the Databricks Terraform Provider. The folder should have this structure: /registry.terraform.io/databricks/databricks/terraform-provider-databricks_1.38.0_ARCH.zip

For more information about filesystem mirrors, see the Terraform documentation: https://developer.hashicorp.com/terraform/cli/config/config-file#filesystem_mirror
```

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-04-02 12:56:27 +00:00
Pieter Noordhuis 00d76d5afa
Move path field to bundle type (#1316)
## Changes

The bundle path was previously stored on the `config.Root` type under
the assumption that the first configuration file being loaded would set
it. This is slightly counterintuitive and we know what the path is upon
construction of the bundle. The new location for this property reflects
this.

## Tests

Unit tests pass.
2024-03-27 09:03:24 +00:00
Pieter Noordhuis ed194668db
Return `diag.Diagnostics` from mutators (#1305)
## Changes

This diagnostics type allows us to capture multiple warnings as well as
errors in the return value. This is a preparation for returning
additional warnings from mutators in case we detect non-fatal problems.

* All return statements that previously returned an error now return
`diag.FromErr`
* All return statements that previously returned `fmt.Errorf` now return
`diag.Errorf`
* All `err != nil` checks now use `diags.HasError()` or `diags.Error()`

## Tests

* Existing tests pass.
* I confirmed no call site under `./bundle` or `./cmd/bundle` uses
`errors.Is` on the return value from mutators. This is relevant because
we cannot wrap errors with `%w` when calling `diag.Errorf` (like
`fmt.Errorf`; context in https://github.com/golang/go/issues/47641).
2024-03-25 14:18:47 +00:00
Andrew Nester 1b0ac61093
Added deployment state for bundles (#1267)
## Changes
This PR introduces new structure (and a file) being used locally and
synced remotely to Databricks workspace to track bundle deployment
related metadata.

The state is pulled from remote, updated and pushed back remotely as
part of `bundle deploy` command.

This state can be used for deployment sequencing as it's `Version` field
is monotonically increasing on each deployment.

Currently, it only tracks files being synced as part of the deployment.

This helps fix the issue with files not being removed during deployments
on CI/CD as sync snapshot was never present there.

Fixes #943 

## Tests
Added E2E (regression) test for files removal on CI/CD

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-03-18 14:41:58 +00:00
Pieter Noordhuis c05c0cd941
Include `dyn.Path` as argument to the visit callback function (#1260)
## Changes

This change means the callback supplied to `dyn.Foreach` can introspect
the path of the value it is being called for. It also prepares for
allowing visiting path patterns where the exact path is not known
upfront.

## Tests

Unit tests.
2024-03-07 13:56:50 +00:00
Ilia Babanov d12f88e24d
Fix summary command when internal terraform config doesn't exist (#1242)
Check if `bundle.tf.json` doesn't exist and create it before executing
`terraform init` (inside `terraform.Load`)

Fixes a problem when during `terraform.Load` it fails with: 
```
Error: Failed to load plugin schemas

Error while loading schemas for plugin components: Failed to obtain provider
schema: Could not load the schema for provider
registry.terraform.io/databricks/databricks: failed to instantiate provider
"registry.terraform.io/databricks/databricks" to obtain schema: unavailable
provider "registry.terraform.io/databricks/databricks"..
```
2024-03-01 08:25:12 +00:00
Pieter Noordhuis f70ec359dc
Use `dyn.Value` as input to generating Terraform JSON (#1218)
## Changes

This builds on #1098 and uses the `dyn.Value` representation of the
bundle configuration to generate the Terraform JSON definition of
resources in the bundle.

The existing code (in `BundleToTerraform`) was not great and in an
effort to slightly improve this, I added a package `tfdyn` that includes
dedicated files for each resource type. Every resource type has its own
conversion type that takes the `dyn.Value` of the bundle-side resource
and converts it into Terraform resources (e.g. a job and optionally its
permissions).

Because we now use a `dyn.Value` as input, we can represent and emit
zero-values that have so far been omitted. For example, setting
`num_workers: 0` in your bundle configuration now propagates all the way
to the Terraform JSON definition.

## Tests

* Unit tests for every converter. I reused the test inputs from
`convert_test.go`.
* Equivalence tests in every existing test case checks that the
resulting JSON is identical.
* I manually compared the TF JSON file generated by the CLI from the
main branch and from this PR on all of our bundles and bundle examples
(internal and external) and found the output doesn't change (with the
exception of the odd zero-value being included by the version in this
PR).
2024-02-16 20:54:38 +00:00
Pieter Noordhuis 87dd46a3f8
Use dynamic configuration model in bundles (#1098)
## Changes

This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).

Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).

The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.

Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).

Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.

To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).

## Tests

* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
2024-02-16 19:41:58 +00:00
Pieter Noordhuis 788ec81785
Use `any` as type for data sources and resources in `tf/schema` (#1216)
## Changes

We plan to use the any-equivalent of a `dyn.Value` such that we can use
variable references for non-string fields (e.g.
`${databricks_job.some_job.id}` where an integer is expected), as well
as properly emit zero values for primitive types (e.g. 0 for integers or
false for booleans).

This change is in preparation for the above.

## Tests

Unit tests.
2024-02-16 12:46:24 +00:00
Andrew Nester 80670eceed
Added `bundle deployment bind` and `unbind` command (#1131)
## Changes
Added `bundle deployment bind` and `unbind` command.

This command allows to bind bundle-defined resources to existing
resources in Databricks workspace so they become DABs-managed.

## Tests
Manually + added E2E test
2024-02-14 18:04:45 +00:00
Pieter Noordhuis 4073e45d4b
Use mockery to generate mocks compatible with testify/mock (#1190)
## Changes

This is the same approach we use in the Go SDK.

## Tests

Tests pass.
2024-02-08 15:18:53 +00:00
Pieter Noordhuis f7d1a5862d
Use allowlist for Git-related fields to include in metadata (#1187)
## Changes

When new fields are added they should not automatically propagate to the
bundle metadata.

## Tests

Test passes.
2024-02-08 12:23:14 +00:00
Andrew Nester 6edab93233
Added warning when trying to deploy bundle with `--fail-if-running` and running resources (#1163)
## Changes
Deploying bundle when there are bundle resources running at the same
time can be disruptive for jobs and pipelines in progress.

With this change during deployment phase (before uploading any
resources) if there is `--fail-if-running` specified DABs will check if
there are any resources running and if so, will fail the deployment

## Tests
Manual + add tests
2024-02-07 11:17:17 +00:00
Ilia Babanov 9c3e4fda7c
Add "bundle summary" command (#1123)
The plan is to use the new command in the Databricks VSCode extension to
render "modified" UI state in the bundle resource tree elements, plus
use resource IDs to generate links for the resources

### New revision
- Renamed `remote-state` to `summary`
- Added "modified statuses" to all resources. Currently we don't set
"updated" status - it's either nothing, or created/deleted
- Added tests for the `TerraformToBundle` command
2024-01-25 11:32:47 +00:00
Lennart Kats (databricks) 9a1f078bd9
Improve error when bundle root is not writable (#1093)
## Changes

This improves the error when deploying to a bundle root that the current
user doesn't have write access to. This can come up slightly more often
since the change of https://github.com/databricks/cli/pull/1091.

Before this change:

```
$ databricks bundle deploy --target prod
Building my_project...
Error: no such directory: /Users/lennart.kats@databricks.com/.bundle/my_project/prod/state
```

After this change:

```
$ databricks bundle deploy --target prod
Building my_project...
Error: cannot write to deployment root (this can indicate a previous deploy was done with a different identity): /Users/lennart.kats@databricks.com/.bundle/my_project/prod
```

Note that this change uses the "no such directory" error returned from
the filer.
2023-12-28 13:15:21 +00:00
Lennart Kats (databricks) 875c9d2db1
Tune output of bundle deploy command (#1047)
## Changes

Update the output of the `deploy` command to be more concise and
consistent:
```
$ databricks bundle deploy
Building my_project...
Uploading my_project-0.0.1+20231207.205106-py3-none-any.whl...
Uploading bundle files to /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files...
Deploying resources...
Updating deployment state...
Deployment complete!
```

This does away with the intermediate success messages, makes consistent
use of `...`, and only prints the success message at the very end after
everything is completed.

Below is the original output for comparison:

```
$ databricks bundle deploy
Detecting Python wheel project...
Found Python wheel project at /tmp/output/my_project
Building my_project...
Build succeeded
Uploading my_project-0.0.1+20231207.205134-py3-none-any.whl...
Upload succeeded
Starting upload of bundle files
Uploaded bundle files at /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files!

Starting resource deployment
Resource deployment completed!
```
2023-12-21 08:00:37 +00:00
shreyas-goenka 2d93f62f21
Set metadata fields required to enable break-glass UI for jobs (#880)
## Changes

This PR sets the following fields for all jobs that are deployed from a
DAB
1. `deployment`: This provides the platform with the path to a file to
read the metadata from.
2. `edit_mode`: This tells the platform to display the break-glass UI
for jobs deployed from a DAB. Setting this is required to re-lock the UI
after a user clicks "disconnect from source".
3. `format = MULTI_TASK`. This makes the Terraform provider always use
jobs API 2.1 for creating/updating the job. Required because
`deployment` and `edit_mode` are only available in API 2.1.

## Tests

Unit test and manually. Manually verified that deployments trigger the
break glass UI. Manually verified there is no Terraform drift when all
three fields are set.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-12-19 07:38:52 +00:00
Pieter Noordhuis 37671d9f54
Fix passthrough of pipeline notifications (#1058)
## Changes

Notifications weren't passed along because of a plural vs singular
mismatch.

## Tests

* Added unit test coverage.
* Manually confirmed it now works in an example bundle.
2023-12-12 11:36:06 +00:00
Pieter Noordhuis 6187803007
Correctly overwrite local state if remote state is newer (#1008)
## Changes

A bug in the code that pulls the remote state could cause the local
state to be empty instead of a copy of the remote state. This happened
only if the local state was present and stale when compared to the
remote version.

We correctly checked for the state serial to see if the local state had
to be replaced but didn't seek back on the remote state before writing
it out. Because the staleness check would read the remote state in full,
copying from the same reader would immediately yield an EOF.

## Tests

* Unit tests for state pull and push mutators that rely on a mocked
filer.
* An integration test that deploys the same bundle from multiple paths,
triggering the staleness logic.

Both failed prior to the fix and now pass.
2023-11-24 11:15:46 +00:00
Andrew Nester 48e293c72c
Pass `USERPROFILE` environment variable to Terraform (#1001)
## Changes
It appears that `USERPROFILE` env variable indicates where Azure CLI
stores configuration data (aka `.azure` folder).

https://learn.microsoft.com/en-us/cli/azure/azure-cli-configuration#cli-configuration-file

Passing it to terraform executable allows it to correctly authenticate
using Azure CLI.

Fixes #983 

## Tests
Ran deployment on Window VM before and after the fix.
2023-11-22 09:16:28 +00:00
Pieter Noordhuis 489d6fa1b8
Replace direct calls with `bundle.Apply` (#990)
## Changes

Some test call sites called directly into the mutator's `Apply` function
instead of `bundle.Apply`. Calling into `bundle.Apply` is preferred
because that's where we can run pre/post logic common across all
mutators.

## Tests

Pass.
2023-11-15 14:19:18 +00:00
Pieter Noordhuis d80c35f66a
Rename variable `bundle -> b` (#989)
## Changes

All calls to apply a mutator must go through `bundle.Apply`. This
conflicts with the existing use of the variable `bundle`. This change
un-aliases the variable from the package name by renaming all variables
to `b`.

## Tests

Pass.
2023-11-15 14:03:36 +00:00
shreyas-goenka 0c837e5772
Make `file_path` and `artifact_path` fields consistent with json tag (#987)
## Changes
This PR:
1. Renames `FilesPath` -> `FilePath` and `ArtifactsPath` ->
`ArtifactPath` in the bundle and metadata configuration to make them
consistant with the json tags.
2. Fixes development / production mode error messages to point to
`file_path` and `artifact_path`

## Tests
Existing unit tests. This is a strightforward renaming of the fields.
2023-11-15 13:37:26 +00:00
shreyas-goenka b6aa4631f1
Fix metadata computation for empty bundle (#939)
## Changes
This PR fixes metadata computation for empty bundle. Before we would
error because the `terraform.Load()` mutator errors on a empty / no
state file.

## Tests
Failing integration tests now pass.
2023-11-02 11:00:30 +00:00
shreyas-goenka 5a8cd0c5bc
Persist deployment metadata in WSFS (#845)
## Changes

This PR introduces a metadata struct that stores a subset of bundle
configuration that we wish to expose to other Databricks services that
wish to integrate with bundles.

This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.

Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note, the deployment might not exactly match
this commit version if there are changes that have not been committed to
git at deploy time,
* `file_path`: Path in workspace where we sync bundle files to. 
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml
config file from the bundle root where this job was defined.

Example metadata object when bundle root and git root are the same:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

Example metadata when the git root is one level above the bundle repo:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```


This unblocks integration to the jobs break glass UI for bundles.

## Tests
Unit tests and integration tests.
2023-10-27 12:55:43 +00:00
Arpit Jasapara 24cc67563e
Support Unity Catalog Registered Models in bundles (#846)
## Changes
<!-- Summary of your changes that are easy to understand -->
Add UC Registered Models support to Databricks Asset Bundles as new
resource `registered_model`. Also added UC Permission support via new
resource `grant`.

## Tests
<!-- How is this tested? -->
Tested via unit tests and manual testing with [example
PR](https://github.com/databricks/bundle-examples-internal/pull/80) and
[custom Terraform
provider](https://github.com/databricks/terraform-provider-databricks/pull/2771).
<img width="698" alt="Screenshot 2023-10-08 at 4 57 23 PM"
src="https://github.com/databricks/cli/assets/87999496/bcf605a9-7894-443b-865a-f7e240037815">
<img width="1109" alt="Screenshot 2023-10-08 at 4 56 47 PM"
src="https://github.com/databricks/cli/assets/87999496/e4d6e424-cd70-4809-8843-6939ed2e172f">
<img width="1091" alt="Screenshot 2023-10-08 at 4 56 57 PM"
src="https://github.com/databricks/cli/assets/87999496/88ebaabb-67db-4a11-88a5-df087e2e41c0">

---------

Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>
Co-authored-by: Andrew Nester <andrew.nester.dev@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-10-16 15:32:49 +00:00
shreyas-goenka fe32c46dc8
Make bundle deploy work if no resources are defined (#767)
## Changes

This PR sets "resource" to nil in the terraform representation if no
resources are defined in the bundle configuration. This solves two
problems:

1. Makes bundle deploy work without any resources specified. 
2. Previously if a `resources` block was removed after a deployment,
that would fail with an error. Now the resources would get destroyed as
expected.

Also removes `TerraformHasNoResources` which is no longer needed.

## Tests
New e2e tests.
2023-09-13 22:50:37 +00:00
Pieter Noordhuis 4ccc70aeac
Consolidate environment variable interaction (#747)
## Changes

There are a couple places throughout the code base where interaction
with environment variables takes place. Moreover, more than one of these
would try to read a value from more than one environment variable as
fallback (for backwards compatibility). This change consolidates those
accesses.

The majority of diffs in this change are mechanical (i.e. add an
argument or replace a call).

This change:
* Moves common environment variable lookups for bundles to
`bundles/env`.
* Adds a `libs/env` package that wraps `os.LookupEnv` and `os.Getenv`
and allows for overrides to take place in a `context.Context`. By
scoping overrides to a `context.Context` we can avoid `t.Setenv` in
testing and unlock parallel test execution for integration tests.
* Updates call sites to pass through a `context.Context` where needed.
* For bundles, introduces `DATABRICKS_BUNDLE_ROOT` as new primary
variable instead of `BUNDLE_ROOT`. This was the last environment
variable that did not use the `DATABRICKS_` prefix.

## Tests

Unit tests pass.
2023-09-11 08:18:43 +00:00
Andrew Nester f7566b8264
Close local Terraform state file when pushing to remote (#752)
## Changes
Close local Terraform state file when pushing to remote

Should help fix E2E test cleanup
```
testing.go:1225: TempDir RemoveAll cleanup: remove 
C:\Users\RUNNER~1\AppData\Local\Temp\TestAccPythonWheelTaskDeployAndRun1395546390\001\.databricks\bundle\default\terraform\terraform.tfstate: 
The process cannot access the file because it is being used by another process.
```
2023-09-08 10:47:17 +00:00
Arpit Jasapara 50eaf16307
Support Model Serving Endpoints in bundles (#682)
## Changes
<!-- Summary of your changes that are easy to understand -->
Add Model Serving Endpoints to Databricks Bundles

## Tests
<!-- How is this tested? -->
Unit tests and manual testing via
https://github.com/databricks/bundle-examples-internal/pull/76
<img width="1570" alt="Screenshot 2023-08-28 at 7 46 23 PM"
src="https://github.com/databricks/cli/assets/87999496/7030ebd8-b0e2-4ad1-a9e3-5ff8454f1175">
<img width="747" alt="Screenshot 2023-08-28 at 7 47 01 PM"
src="https://github.com/databricks/cli/assets/87999496/fb9b54d7-54e2-43ce-9148-68fb620c809a">

Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>
2023-09-07 21:54:31 +00:00
Andrew Nester 10e0836749
Added end-to-end test for deploying and running Python wheel task (#741)
## Changes
Added end-to-end test for deploying and running Python wheel task

## Tests
Test successfully passed on all environments, takes about 9-10 minutes
to pass.

```
Deleted snapshot file at /var/folders/nt/xjv68qzs45319w4k36dhpylc0000gp/T/TestAccPythonWheelTaskDeployAndRun1845899209/002/.databricks/bundle/default/sync-snapshots/1f7cc766ffe038d6.json
Successfully deleted files!
2023/09/06 17:50:50 INFO Releasing deployment lock mutator=destroy mutator=seq mutator=seq mutator=deferred mutator=lock:release
--- PASS: TestAccPythonWheelTaskDeployAndRun (508.16s)
PASS
coverage: 77.9% of statements in ./...
ok      github.com/databricks/cli/internal/bundle       508.810s        coverage: 77.9% of statements in ./...
```

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-07 14:08:16 +00:00
Pieter Noordhuis c0ebfb8101
Fix conversion of job parameters (#744)
## Changes

Another example of singular/plural conversion.

Longer term solution is we do a full sweep of the type using reflection
to make sure we cover all fields.

## Tests

Unit test passes.
2023-09-07 12:48:59 +00:00