Commit Graph

10 Commits

Author SHA1 Message Date
shreyas-goenka 4e8e027380
Sort tasks by `task_key` before generating the Terraform configuration (#1776)
## Changes
Sort the tasks in the resultant `bundle.tf.json`. This is important
because configuration from one task can leak into another if the tasks
are not sorted.

For more details see:
1.
https://github.com/databricks/terraform-provider-databricks/issues/3951
2.
https://github.com/databricks/terraform-provider-databricks/issues/4011

## Tests
Unit test and manually.

For manual testing I used the following configuration:
```
resources:
  jobs:
    foo:
      tasks: 
        - task_key: task-Z
          notebook_task: 
            notebook_path: nb.py
            source: GIT
          existing_cluster_id: 0715-133738-ju0ma84z

        - task_key: task-1
          notebook_task: 
            notebook_path: ${workspace.file_path}/local.py
            source: WORKSPACE
          existing_cluster_id: 0715-133738-ju0ma84z
          depends_on: 
            - task_key: task-Z


      git_source: 
        git_provider: gitHub
        git_url: https://github.com/shreyas-goenka/job-source-tmp.git
        git_branch: main
```

Steps (1):
1. Deploy this bundle.
2. Comment out "source: GIT"
3. Deploy again

Before:
Deploying this bundle twice would fail. This is because the "source:
GIT" would carry over to the next deployment.

After:
There was no error on the subsequent deployment.

Steps (2):
1. Deploy once
2. Deploy again

Before:
Works correctly but leads to a update API call every time.

After:
No diff is detected by terraform.
2024-09-26 13:22:22 +00:00
Andrew Nester 56ed9bebf3
Added support for creating all-purpose clusters (#1698)
## Changes
Added support for creating all-purpose clusters

Example of configuration

```
bundle:
  name: clusters

resources:
  clusters:
    test_cluster:
      cluster_name: "Test Cluster"
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"

  jobs:
    test_job:
      name: "Test Job"
      tasks:
        - task_key: test_task
          existing_cluster_id: ${resources.clusters.test_cluster.id}
          notebook_task:
            notebook_path: "./src/test.py"

targets:
    development:
      mode: development
      compute_id: ${resources.clusters.test_cluster.id}

```

## Tests
Added unit, config and E2E tests
2024-09-23 10:42:34 +00:00
shreyas-goenka 89c0af5bdc
Add resource for UC schemas to DABs (#1413)
## Changes
This PR adds support for UC Schemas to DABs. This allows users to define
schemas for tables and other assets their pipelines/workflows create as
part of the DAB, thus managing the life-cycle in the DAB.

The first version has a couple of intentional limitations:
1. The owner of the schema will be the deployment user. Changing the
owner of the schema is not allowed (yet). `run_as` will not be
restricted for DABs containing UC schemas. Let's limit the scope of
run_as to the compute identity used instead of ownership of data assets
like UC schemas.
2. API fields that are present in the update API but not the create API.
For example: enabling predictive optimization is not supported in the
create schema API and thus is not available in DABs at the moment.

## Tests
Manually and integration test. Manually verified the following work:
1. Development mode adds a "dev_" prefix.
2. Modified status is correctly computed in the `bundle summary`
command.
3. Grants work as expected, for assigning privileges.
4. Variable interpolation works for the schema ID.
2024-07-31 12:16:28 +00:00
shreyas-goenka 068c7cfc2d
Return `dyn.InvalidValue` instead of `dyn.NilValue` when errors happen (#1514)
## Changes
With https://github.com/databricks/cli/pull/1507 and
https://github.com/databricks/cli/pull/1511 we are clarifying the
semantics associated with `dyn.InvalidValue` and `dyn.NilValue`. An
invalid value is the default zero value and is used to signals the
complete absence of the value.

A nil value, on the other hand, is a valid value for a piece of
configuration and signals explicitly setting a key to nil in the
configuration tree. In keeping with that theme, this PR returns
`dyn.InvalidValue` instead of `dyn.NilValue` at error sites. This change
is not expected to have a material change in behaviour and is being done
to set the right convention since we have well-defined semantics
associated with both `NilValue` and `InvalidValue`.

## Tests
Unit tests and integration tests pass. Also manually scanned the changes
and the associated call sites to verify the `NilValue` value itself was
not being relied upon.
2024-06-21 14:22:42 +00:00
Aravind Segu a33d0c8bf9
Add support for Lakehouse monitoring in bundles (#1307)
## Changes

This change adds support for Lakehouse monitoring in bundles.

The associated resource type name is "quality monitor".

## Testing

Unit tests.

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: Arpit Jasapara <87999496+arpitjasa-db@users.noreply.github.com>
2024-05-31 09:42:25 +00:00
Andrew Nester 1872aa12b3
Added support for job environments (#1379)
## Changes
The main changes are:
1. Don't link artifacts to libraries anymore and instead just iterate
over all jobs and tasks when uploading artifacts and update local path
to remote
2. Iterating over `jobs.environments` to check if there are any local
libraries and checking that they exist locally
3. Added tests to check environments are handled correctly

End-to-end test will follow up

## Tests
Added regression test, existing tests (including integration one) pass
2024-04-22 11:44:34 +00:00
Andrew Nester 77ff994d1b
Correctly transform libraries in for_each_task block (#1340)
## Changes
Now DABs correctly transforms and deploys libraries in for_each_task
block

```
tasks:
  - task_key: my_loop
     for_each_task:
       inputs: "[1,2,3]"
         task:
           task_key: my_loop_iteration 
           libraries:
             - pypi:
                  package: my_package
```

## Tests
Added regression test
2024-04-05 15:52:39 +00:00
dependabot[bot] f28a9d7107
Bump github.com/databricks/databricks-sdk-go from 0.36.0 to 0.37.0 (#1326)
[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=github.com/databricks/databricks-sdk-go&package-manager=go_modules&previous-version=0.36.0&new-version=0.37.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
2024-04-03 10:39:53 +00:00
Pieter Noordhuis c05c0cd941
Include `dyn.Path` as argument to the visit callback function (#1260)
## Changes

This change means the callback supplied to `dyn.Foreach` can introspect
the path of the value it is being called for. It also prepares for
allowing visiting path patterns where the exact path is not known
upfront.

## Tests

Unit tests.
2024-03-07 13:56:50 +00:00
Pieter Noordhuis f70ec359dc
Use `dyn.Value` as input to generating Terraform JSON (#1218)
## Changes

This builds on #1098 and uses the `dyn.Value` representation of the
bundle configuration to generate the Terraform JSON definition of
resources in the bundle.

The existing code (in `BundleToTerraform`) was not great and in an
effort to slightly improve this, I added a package `tfdyn` that includes
dedicated files for each resource type. Every resource type has its own
conversion type that takes the `dyn.Value` of the bundle-side resource
and converts it into Terraform resources (e.g. a job and optionally its
permissions).

Because we now use a `dyn.Value` as input, we can represent and emit
zero-values that have so far been omitted. For example, setting
`num_workers: 0` in your bundle configuration now propagates all the way
to the Terraform JSON definition.

## Tests

* Unit tests for every converter. I reused the test inputs from
`convert_test.go`.
* Equivalence tests in every existing test case checks that the
resulting JSON is identical.
* I manually compared the TF JSON file generated by the CLI from the
main branch and from this PR on all of our bundles and bundle examples
(internal and external) and found the output doesn't change (with the
exception of the odd zero-value being included by the version in this
PR).
2024-02-16 20:54:38 +00:00