Commit Graph

26 Commits

Author SHA1 Message Date
Andrew Nester 56ed9bebf3
Added support for creating all-purpose clusters (#1698)
## Changes
Added support for creating all-purpose clusters

Example of configuration

```
bundle:
  name: clusters

resources:
  clusters:
    test_cluster:
      cluster_name: "Test Cluster"
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"

  jobs:
    test_job:
      name: "Test Job"
      tasks:
        - task_key: test_task
          existing_cluster_id: ${resources.clusters.test_cluster.id}
          notebook_task:
            notebook_path: "./src/test.py"

targets:
    development:
      mode: development
      compute_id: ${resources.clusters.test_cluster.id}

```

## Tests
Added unit, config and E2E tests
2024-09-23 10:42:34 +00:00
shreyas-goenka 7ae80de351
Stop tracking file path locations in bundle resources (#1673)
## Changes
Since locations are already tracked in the dynamic value tree, we no
longer need to track it at the resource/artifact level. This PR:
1. Removes use of `paths.Paths`. Uses dyn.Location instead.
2. Refactors the validation of resources not being empty valued to be
generic across all resource types.
  
## Tests
Existing unit tests.
2024-08-13 12:50:15 +00:00
shreyas-goenka 89c0af5bdc
Add resource for UC schemas to DABs (#1413)
## Changes
This PR adds support for UC Schemas to DABs. This allows users to define
schemas for tables and other assets their pipelines/workflows create as
part of the DAB, thus managing the life-cycle in the DAB.

The first version has a couple of intentional limitations:
1. The owner of the schema will be the deployment user. Changing the
owner of the schema is not allowed (yet). `run_as` will not be
restricted for DABs containing UC schemas. Let's limit the scope of
run_as to the compute identity used instead of ownership of data assets
like UC schemas.
2. API fields that are present in the update API but not the create API.
For example: enabling predictive optimization is not supported in the
create schema API and thus is not available in DABs at the moment.

## Tests
Manually and integration test. Manually verified the following work:
1. Development mode adds a "dev_" prefix.
2. Modified status is correctly computed in the `bundle summary`
command.
3. Grants work as expected, for assigning privileges.
4. Variable interpolation works for the schema ID.
2024-07-31 12:16:28 +00:00
Aravind Segu a33d0c8bf9
Add support for Lakehouse monitoring in bundles (#1307)
## Changes

This change adds support for Lakehouse monitoring in bundles.

The associated resource type name is "quality monitor".

## Testing

Unit tests.

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>
2024-05-31 09:42:25 +00:00
Andrew Nester a014d50a6a
Fixed panic when loading incorrectly defined jobs (#1402)
## Changes
If only key was defined for a job in YAML config, validate previously
failed with segfault.

This PR validates that jobs are correctly defined and returns an error
if not.

## Tests
Added regression test
2024-05-17 10:10:17 +00:00
Pieter Noordhuis 87dd46a3f8
Use dynamic configuration model in bundles (#1098)
## Changes

This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).

Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).

The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.

Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).

Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.

To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).

## Tests

* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
2024-02-16 19:41:58 +00:00
Andrew Nester 80670eceed
Added `bundle deployment bind` and `unbind` command (#1131)
## Changes
Added `bundle deployment bind` and `unbind` command.

This command allows to bind bundle-defined resources to existing
resources in Databricks workspace so they become DABs-managed.

## Tests
Manually + added E2E test
2024-02-14 18:04:45 +00:00
Ilia Babanov 9c3e4fda7c
Add "bundle summary" command (#1123)
The plan is to use the new command in the Databricks VSCode extension to
render "modified" UI state in the bundle resource tree elements, plus
use resource IDs to generate links for the resources

### New revision
- Renamed `remote-state` to `summary`
- Added "modified statuses" to all resources. Currently we don't set
"updated" status - it's either nothing, or created/deleted
- Added tests for the `TerraformToBundle` command
2024-01-25 11:32:47 +00:00
Arpit Jasapara 24cc67563e
Support Unity Catalog Registered Models in bundles (#846)
## Changes
<!-- Summary of your changes that are easy to understand -->
Add UC Registered Models support to Databricks Asset Bundles as new
resource `registered_model`. Also added UC Permission support via new
resource `grant`.

## Tests
<!-- How is this tested? -->
Tested via unit tests and manual testing with [example
PR](https://github.com/databricks/bundle-examples-internal/pull/80) and
[custom Terraform
provider](https://github.com/databricks/terraform-provider-databricks/pull/2771).
<img width="698" alt="Screenshot 2023-10-08 at 4 57 23 PM"
src="https://github.com/databricks/cli/assets/87999496/bcf605a9-7894-443b-865a-f7e240037815">
<img width="1109" alt="Screenshot 2023-10-08 at 4 56 47 PM"
src="https://github.com/databricks/cli/assets/87999496/e4d6e424-cd70-4809-8843-6939ed2e172f">
<img width="1091" alt="Screenshot 2023-10-08 at 4 56 57 PM"
src="https://github.com/databricks/cli/assets/87999496/88ebaabb-67db-4a11-88a5-df087e2e41c0">

---------

Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>
Co-authored-by: Andrew Nester <andrew.nester.dev@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-10-16 15:32:49 +00:00
Andrew Nester 30c4d2e8a7
Fixed merging task libraries from targets (#868)
## Changes
Previous we (erroneously) kept the reference and merged into the
original tasks and not the copies which we later used to replace
existing tasks. Thus the merging of slices and references was incorrect.

Fixes #864 

## Tests
Added a regression test
2023-10-16 08:48:32 +00:00
hectorcast-db 36f30c8b47
Update Go SDK to 0.23.0 and use custom marshaller (#772)
## Changes
Update Go SDK to 0.23.0 and use custom marshaller.
## Tests
* Run unit tests

* Run nightly

* Manual test:
```
./cli jobs create --json @myjob.json
```
with 
```
{
    "name": "my-job-marshal-test-go",
    "tasks": [{
        "task_key": "testgomarshaltask",
        "new_cluster": {
            "num_workers": 0,
            "spark_version": "10.4.x-scala2.12",
            "node_type_id": "Standard_DS3_v2"
        },
        "libraries": [
            {
                "jar": "dbfs:/max/jars/exampleJarTask.jar"
            }
        ],
        "spark_jar_task": {
            "main_class_name":  "com.databricks.quickstart.exampleTask"
        }
    }]
}
```
Main branch:
```
Error: Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size
```
This branch:
```
{
  "job_id":<jobid>
}
```

---------

Co-authored-by: Miles Yucht <miles@databricks.com>
2023-10-16 06:56:06 +00:00
Pieter Noordhuis ee30277119
Enable target overrides for pipeline clusters (#792)
## Changes

This is a follow-up to #658 and #779 for jobs.

This change applies label normalization the same way the backend does.

## Tests

Unit and config loading tests.
2023-09-21 19:21:20 +00:00
Andrew Nester 43e2eefc27
Enable environment overrides for job tasks (#779)
## Changes
Follow up for https://github.com/databricks/cli/pull/658

When a job definition has multiple job tasks using the same key, it's
considered invalid. Instead we should combine those definitions with the
same key into one. This is consistent with environment overrides. This
way, the override ends up in the original job tasks, and we've got a
clear way to put them all together.

## Tests
Added unit tests
2023-09-18 14:13:50 +00:00
Arpit Jasapara 50eaf16307
Support Model Serving Endpoints in bundles (#682)
## Changes
<!-- Summary of your changes that are easy to understand -->
Add Model Serving Endpoints to Databricks Bundles

## Tests
<!-- How is this tested? -->
Unit tests and manual testing via
https://github.com/databricks/bundle-examples-internal/pull/76
<img width="1570" alt="Screenshot 2023-08-28 at 7 46 23 PM"
src="https://github.com/databricks/cli/assets/87999496/7030ebd8-b0e2-4ad1-a9e3-5ff8454f1175">
<img width="747" alt="Screenshot 2023-08-28 at 7 47 01 PM"
src="https://github.com/databricks/cli/assets/87999496/fb9b54d7-54e2-43ce-9148-68fb620c809a">

Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>
2023-09-07 21:54:31 +00:00
Andrew Nester 83443bae8d
Make resource and artifact paths in bundle config relative to config folder (#708)
# Warning: breaking change

## Changes
Instead of having paths in bundle config files be relative to bundle
root even if the config file is nested, this PR makes such paths
relative to the folder where the config is located.

When bundle is initialised, these paths will be transformed to relative
paths based on bundle root. For example,
we have file structure like this
```
- mybundle
| - bundle.yml
| - subfolder
| -- resource.yml
| -- my.whl
```

Previously, we had to reference `my.whl` in resource.yml like this,
which was confusing because resource.yml is in the same subfolder
```
sync:
  include:
    - ./subfolder/*.whl
...
tasks:
  - task_key: name
    libraries:
      - whl: ./subfolder/my.whl
...
```

After the change we can reference it like this (which is in line with
the current behaviour for notebooks)

```
sync:
  include:
    - ./*.whl
...
tasks:
  - task_key: name
    libraries:
      - whl: ./my.whl
...
```

## Tests
Existing `translate_path_tests` successfully passed after refactoring.

Added a couple of uses cases for `Libraries` paths.

Added a bundle config tests with include config and sync section

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-04 09:55:01 +00:00
Andrew Nester 56dcd3f0a7
Renamed `environments` to `targets` in bundle configuration (#670)
## Changes
Renamed Environments to Targets in bundle.yml.

The change is backward-compatible and customers can continue to use
`environments` in the time being.

## Tests
Added tests which checks that both `environments` and `targets` sections
in bundle.yml works correctly
2023-08-17 15:22:32 +00:00
Pieter Noordhuis 97699b849f
Enable environment overrides for job clusters (#658)
## Changes

While they are a slice, we can identify a job cluster by its job cluster
key. A job definition with multiple job clusters with the same key is
always invalid. We can therefore merge definitions with the same key
into one. This is compatible with how environment overrides are applied;
merging a slice means appending to it. The override will end up in the
job cluster slice of the original, which gives us a deterministic way to
merge them.

Since the alternative is an invalid configuration, this doesn't change
behavior.

## Tests

New test coverage.
2023-08-14 06:43:45 +00:00
Serge Smertin 9581187c9e
Update to Go SDK v0.8.0 (#351)
## Changes

- Update to Go SDK v0.8.0
- Fix all breaking changes

## Tests

- make test
2023-04-21 10:30:20 +02:00
Pieter Noordhuis 31ccebd62a
Store relative path to configuration file for every resource (#322)
## Changes

If a configuration file is located in a subdirectory of the bundle root,
files referenced from that configuration file should be relative to its
configuration file's directory instead of the bundle root.

## Tests

* New tests in `bundle/config/mutator/translate_paths_test.go`.
* Existing tests under `bundle/tests` pass and are augmented to assert
on paths.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-04-12 16:17:13 +02:00
shreyas-goenka 8de7d32ed1
Add readonly bundle tag for internal fields (#302)
This PR adds a bundle: "readonly" struct tag to the json schema
generator. This allows us to skip generating json schema for internal
readonly fields

Tested using unit test
2023-04-04 12:16:07 +02:00
Pieter Noordhuis 66ca9ec266
Add permissions block to each resource (#264)
Example:

```yaml
resources:
  jobs:
    my_job:
      name: "[${bundle.environment}] My job"
      permissions:
        - level: CAN_VIEW
          group_name: users
```
2023-03-21 10:58:16 +01:00
Pieter Noordhuis 58563b1ea9
Add resources for mlflow models and experiments (#263)
Manually confirmed that both can be deployed.
2023-03-20 21:28:43 +01:00
Pieter Noordhuis d5474c9673
Revert "Rename jobs -> workflows" (#118)
This reverts PR #111.

This reverts commit 230811031f.
2022-12-01 22:39:15 +01:00
Pieter Noordhuis 230811031f
Rename jobs -> workflows (#111) 2022-12-01 09:35:21 +01:00
Pieter Noordhuis 195eb7f0f9
Add job and pipeline structs (#94) 2022-11-18 11:12:24 +01:00
Pieter Noordhuis e47fa61951
Skeleton for configuration loading and mutation (#92)
Load a tree of configuration files anchored at `bundle.yml` into the
`config.Root` struct.

All mutations (from setting defaults to merging files) are observable
through the `mutator.Mutator` interface.
2022-11-18 10:57:31 +01:00