Compare commits

...

23 Commits

Author SHA1 Message Date
Pieter Noordhuis a24f268f8a
Merge branch 'main' into dependabot/go_modules/golang.org/x/mod-0.21.0 2024-09-26 08:01:07 +02:00
Pieter Noordhuis 12d27318b4
Update go.mod 2024-09-26 08:00:57 +02:00
shreyas-goenka 495040e4cd
Modify SetLocation test utility to take full locations as argument (#1788)
I plan to use this in https://github.com/databricks/cli/pull/1780, to
set the line and column numbers as well for the locations.

gopatch file used:
```
@@
var x expression
var y expression
var z expression
@@
-bundletest.SetLocation(x, y, z)
+bundletest.SetLocation(x, y, []dyn.Location{{File: z}})
```
2024-09-25 16:13:48 +00:00
Pieter Noordhuis 7f1121d8d8
Pin Go toolchain to 1.22.7 (#1790)
## Changes

Relates to https://github.com/databricks/cli/pull/1758.

More information about toolchains:
* https://go.dev/blog/toolchain
* https://go.dev/doc/toolchain

We need to specify the toolchain as we need to bump Go to 1.22.0 for the
`mod` upgrade and want to use the latest toolchain on the 1.22 series.

## Tests

The previous release was made with Go 1.22.7 so we should continue to
use it.
2024-09-25 15:45:28 +00:00
shreyas-goenka a4ba0bbe9f
Add sub-extension to resource files in built-in templates (#1777)
## Changes
We want to encourage a pattern of only specifying a single resource in a
YAML file when an `.<resource-type>.yml` suffix (like `.job.yml`) is used. This
convention could allow us to bijectively map a resource YAML file to
its corresponding resource in the Databricks workspace.

This PR simply makes the built-in templates compliant to this format.
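
For illustration, a minimal sketch of a single-resource file following this convention (the file and job names here are hypothetical):

```yaml
# resources/my_project_job.job.yml: one job per .job.yml file
resources:
  jobs:
    my_project_job:
      name: my_project_job
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ../src/notebook.ipynb
```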

## Tests
Existing tests.
2024-09-25 12:58:14 +00:00
Andrew Nester b3a3071086
Fixed full variable override detection (#1787)
## Changes
Fixes #1786 

## Tests
All valid override combinations are added as test cases
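
As an illustration (a sketch; the variable name is hypothetical), a full-syntax override in a target that includes the `type` field is now detected as a full override instead of being treated as a literal value:

```yaml
variables:
  cluster:
    type: complex
    description: "Cluster spec for the job"
    default:
      num_workers: 2

targets:
  dev:
    variables:
      cluster:
        type: complex
        default:
          num_workers: 1
```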
2024-09-25 12:35:16 +00:00
Gleb Kanterov 3d9decdda9
Add JobTaskClusterSpec validate mutator (#1784)
## Changes
Add JobTaskClusterSpec validate mutator. It catches the case where tasks
don't specify which cluster to use.

For example, we can get this error with minor modifications to
`default-python` template:

```yaml
      tasks:
        - task_key: python_file_task
          spark_python_task:
            python_file: ../src/my_project_10/main.py
```

```
 % databricks bundle validate
Error: Missing required cluster or environment settings
  at resources.jobs.my_project_10_job.tasks[0]
  in resources/my_project_10_job.yml:17:11

Task "print_github_stars" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.
```
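
Specifying any one of the listed fields resolves the diagnostic. For example (a sketch; `default_cluster` is an assumed job cluster key defined in the job's `job_clusters` section):

```yaml
      tasks:
        - task_key: python_file_task
          job_cluster_key: default_cluster
          spark_python_task:
            python_file: ../src/my_project_10/main.py
```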

We implicitly rely on "one of" validation, which does not exist. Many
bundle fields can't co-exist, for instance, specifying:
`JobTask.{existing_cluster_id,job_cluster_key}`, `Library.{whl,pypi}`,
`JobTask.{notebook_task,python_wheel_task}`, etc.

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
2024-09-25 11:30:14 +00:00
Gleb Kanterov 490259a14a
Refactor jobs path translation (#1782)
## Changes
Extract a package that other modules can use to transform different kinds
of paths in job resources.

## Tests
Unit tests
2024-09-24 13:51:54 +00:00
shreyas-goenka 0cc35ca056
Assert tokens are redacted in origin URL when username is not specified (#1785)
TSIA
2024-09-23 12:42:30 +00:00
Andrew Nester 56ed9bebf3
Added support for creating all-purpose clusters (#1698)
## Changes
Added support for creating all-purpose clusters

Example of configuration

```
bundle:
  name: clusters

resources:
  clusters:
    test_cluster:
      cluster_name: "Test Cluster"
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"

  jobs:
    test_job:
      name: "Test Job"
      tasks:
        - task_key: test_task
          existing_cluster_id: ${resources.clusters.test_cluster.id}
          notebook_task:
            notebook_path: "./src/test.py"

targets:
    development:
      mode: development
      compute_id: ${resources.clusters.test_cluster.id}

```
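
The diffs in this compare also deprecate the bundle- and target-level `compute_id` field in favor of `cluster_id`, so the target override above could equivalently be written as follows (a sketch):

```yaml
targets:
  development:
    mode: development
    cluster_id: ${resources.clusters.test_cluster.id}
```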

## Tests
Added unit, config and E2E tests
2024-09-23 10:42:34 +00:00
Ilia Babanov ac80d3dfcb
Add verbose flag to the "bundle deploy" command (#1774)
## Changes
- Extract sync output logic from `cmd/sync` into `lib/sync`
- Add a hidden `verbose` flag to the `bundle deploy` command; it's false
by default and hidden from the `--help` output
- Pass output handler to the `deploy/files/upload` mutator if the
verbose option is true

There was an idea to use in-place output, overwriting each past file sync
event in the output, but that won't work for the extension, since it
doesn't display deploy logs in the terminal.

Example output:
```
~/tmp/defpy: ~/cli/cli bundle deploy --sync-progress
Building defpy...
Uploading defpy-0.0.1+20240917.112755-py3-none-any.whl...
Uploading bundle files to /Users/ilia.babanov@databricks.com/.bundle/defpy/dev/files...
Action: PUT: requirements-dev.txt, resources/defpy_pipeline.yml, pytest.ini, src/defpy/main.py, src/defpy/__init__.py, src/dlt_pipeline.ipynb, tests/main_test.py, src/notebook.ipynb, setup.py, resources/defpy_job.yml, .vscode/extensions.json, .vscode/settings.json, fixtures/.gitkeep, .vscode/__builtins__.pyi, README.md, .gitignore, databricks.yml
Uploaded tests
Uploaded resources
Uploaded fixtures
Uploaded .vscode
Uploaded src/defpy
Uploaded requirements-dev.txt
Uploaded .gitignore
Uploaded fixtures/.gitkeep
Uploaded src/defpy/__init__.py
Uploaded databricks.yml
Uploaded README.md
Uploaded setup.py
Uploaded .vscode/__builtins__.pyi
Uploaded .vscode/extensions.json
Uploaded src/dlt_pipeline.ipynb
Uploaded .vscode/settings.json
Uploaded resources/defpy_job.yml
Uploaded pytest.ini
Uploaded src/defpy/main.py
Uploaded tests/main_test.py
Uploaded resources/defpy_pipeline.yml
Uploaded src/notebook.ipynb
Initial Sync Complete
Deploying resources...
Updating deployment state...
Deployment complete!
```

Output example in the extension:
<img width="1843" alt="Screenshot 2024-09-19 at 11 07 48"
src="https://github.com/user-attachments/assets/0fafd095-cdc6-44b8-b482-27a38ada0330">


## Tests
Manually for the `sync` and `bundle deploy` commands + vscode extension
sync and deploy flows
2024-09-23 10:09:11 +00:00
Lennart Kats (databricks) 7665c639bd
Use Unity Catalog for pipelines in the default-python template (#1766)
## Summary

Enables Unity Catalog for pipelines in the default template. Pipelines
will default to non-Unity Catalog pipelines if a catalog is not
specified.

*Small caveat*: there are cases where admins lock down the default
catalog of a workspace and don't allow the creation of a new schema
there. If that happens, the pipeline would fail at runtime with a clear
error indicating what happened ("PERMISSION_DENIED: User does not have
CREATE SCHEMA on Catalog 'main'."). I've seen this with an internal
Databricks workspace, where creating new non-UC schemas wasn't locked
down, but creating schemas in the `main` catalog was.

## Testing

- Validated on a non-UC + UC workspace. The catalog selection logic here
is the same as applied for the SQL templates.
2024-09-23 09:52:04 +00:00
Lennart Kats (databricks) 6c57683dc6
Reduce time until the prompt is shown for bundle run (#1727)
## Summary

Makes the `databricks bundle run` command use local state before showing
the menu prompt, so the prompt appears more quickly. For large/busy
workspaces this means the prompt can show 2-3 seconds earlier.
2024-09-21 06:36:47 +00:00
Andrew Nester cf989a7e10
Upgrade to TF provider 1.52 (#1781)
## Changes
Upgrade to TF provider 1.52

We also temporarily skip generating plugin framework structs to unblock the
upgrade, as generation does not work yet and needs to be fixed separately.
2024-09-19 11:21:32 +00:00
Andrew Nester e2c1d51d84
[Release] Release v0.228.1 (#1778)
Bundles:
* Added listing cluster filtering for cluster lookups
([#1754](https://github.com/databricks/cli/pull/1754)).
* Expand library globs relative to the sync root
([#1756](https://github.com/databricks/cli/pull/1756)).
* Fixed generated YAML missing 'default' for empty values
([#1765](https://github.com/databricks/cli/pull/1765)).
* Use periodic triggers in all templates
([#1739](https://github.com/databricks/cli/pull/1739)).
* Use the friendly name of service principals when shortening their name
([#1770](https://github.com/databricks/cli/pull/1770)).
* Fixed detecting full syntax variable override which includes type
field ([#1775](https://github.com/databricks/cli/pull/1775)).

Internal:
* Pass copy of `dyn.Path` to callback function
([#1747](https://github.com/databricks/cli/pull/1747)).
* Make bundle JSON schema modular with `$defs`
([#1700](https://github.com/databricks/cli/pull/1700)).
* Alias variables block in the `Target` struct
([#1748](https://github.com/databricks/cli/pull/1748)).
* Add end to end integration tests for bundle JSON schema
([#1726](https://github.com/databricks/cli/pull/1726)).
* Fix artifact upload integration tests
([#1767](https://github.com/databricks/cli/pull/1767)).

API Changes:
 * Added `databricks quality-monitors regenerate-dashboard` command.

OpenAPI commit d05898328669a3f8ab0c2ecee37db2673d3ea3f7 (2024-09-04)
Dependency updates:
* Bump golang.org/x/term from 0.23.0 to 0.24.0
([#1757](https://github.com/databricks/cli/pull/1757)).
* Bump golang.org/x/oauth2 from 0.22.0 to 0.23.0
([#1761](https://github.com/databricks/cli/pull/1761)).
* Bump golang.org/x/text from 0.17.0 to 0.18.0
([#1759](https://github.com/databricks/cli/pull/1759)).
* Bump github.com/databricks/databricks-sdk-go from 0.45.0 to 0.46.0
([#1760](https://github.com/databricks/cli/pull/1760)).
2024-09-18 11:26:16 +00:00
Andrew Nester bcab6ca37b
Fixed detecting full syntax variable override which includes type field (#1775)
## Changes
Fixes #1773 

## Tests
Confirmed manually
2024-09-18 10:23:07 +00:00
Lennart Kats (databricks) e220f9ddd6
Use the friendly name of service principals when shortening their name (#1770)
## Summary

Use the friendly name of service principals when shortening their name.

This change is helpful for the prefix in development mode. Instead of
adding a prefix like `[dev 1706906c-c0a2-4c25-9f57-3a7aa3cb8123]`, we'll
use a prefix like `[dev my_principal]`.
2024-09-16 18:35:07 +00:00
Lennart Kats (databricks) f2dee890b8
Use periodic triggers in all templates (#1739)
## Summary

Simplifies the templates by using the periodic trigger syntax instead of the
cron schedule syntax. Periodic triggers are simpler to configure,
simpler to read, and make sure that workloads are spread out through the
day. We only recommend cron syntax for advanced cases or when more
control is needed.
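
For comparison, a hedged sketch of the two styles (the specific cron and interval values are illustrative):

```yaml
# Cron schedule syntax (previously used by the templates)
schedule:
  quartz_cron_expression: "44 37 8 * * ?"
  timezone_id: Europe/Amsterdam

# Periodic trigger syntax (used by the templates after this change)
trigger:
  periodic:
    interval: 1
    unit: DAYS
```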

## Testing

* Templates validation via unit tests
* Manual validation that the new triggers work as expected in dev/prod
2024-09-12 08:33:00 +00:00
Pieter Noordhuis fb077a85d2
Fix artifact upload integration tests (#1767)
## Changes

I didn't run integration tests on #1756.

## Tests

Manually confirmed integration tests pass.
2024-09-11 12:16:18 +00:00
Andrew Nester 66307134c1
Fixed generated YAML missing 'default' for empty values (#1765)
## Changes
Fixed generated YAML missing 'default' for empty values

## Tests
Added unit test
2024-09-11 09:49:58 +00:00
shreyas-goenka c61358407f
Add end to end integration tests for bundle JSON schema (#1726) 2024-09-11 09:15:56 +00:00
shreyas-goenka 5d2c0e3885
Alias variables block in the `Target` struct (#1748)
## Changes
This PR aliases and overrides the schema associated with the variables
block in `target` to allow for directly specifying a variable value in
the JSON schema (without any levels of nesting). This is needed because
this direct value is resolved by dynamically parsing the configuration
tree.

ca6332a5a4/bundle/config/root.go (L424)

## Tests
Existing unit tests.
2024-09-10 14:49:34 +00:00
shreyas-goenka 28b39cd3f7
Make bundle JSON schema modular with `$defs` (#1700)
## Changes
This PR makes sweeping changes to the way we generate and test the
bundle JSON schema. The main benefits are:

1. More modular JSON schema. Every definition in the schema now is one
level deep and points to references instead of inlining the entire
schema for a field. This unblocks PyDABs from taking a dependency on the
JSON schema.

2. Generate the JSON schema during CLI code generation. Directly stream
it instead of computing it at runtime whenever a user calls `databricks
bundle schema`. This is nice because we no longer need to embed a
partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()`
method to every struct in the Databricks Go SDK and remove the
dependency on the OpenAPI spec altogether. It'll become more important
once we decouple Go SDK structs and methods from the underlying APIs.

3. Add enum values for Go SDK fields in the JSON schema. Better
autocompletion and validation for these fields. As a follow-up, we can
add enum values for non-Go SDK enums as well (created internal ticket to
track).

4. Use "packageName.structName" as a key to read JSON schemas from the
OpenAPI spec for Go SDK structs. Before, we would use an unrolled
presentation of the JSON schema (stored in `bundle_descriptions.json`),
which was complex to parse and include in the final JSON schema output.
This also means loading values from the OpenAPI spec for `target` schema
works automatically and no longer needs custom code.
5. Support recursive types (e.g. `for_each_task`). With $refs now used
everywhere, this is trivial to support.
6. Using complex variables would be invalid according to the schema
generated before this PR; that bug is now fixed. In the future, adding
more custom rules will be easier as well due to the single-level nature
of the JSON schema.


Since this is a complete change of approach in how we generate the JSON
schema, there are a few (very minor) regressions worth calling out.
1. We'll lose a few custom descriptions for non-Go-SDK structs that were
part of `bundle_descriptions.json`. Support for those can be added in
the future as a follow-up.
2. Since the final JSON schema is now a static artefact, we lose some
lead time for the signal that JSON schema integration tests are failing.
That's okay, since we have a lot of coverage via the existing unit
tests.

## Tests
Unit tests. End to end tests are being added in this PR:
https://github.com/databricks/cli/pull/1726

Previous unit tests were all deleted because they were bloated. Effort
was made to make the new unit tests provide (almost) equivalent
coverage.
2024-09-10 13:55:18 +00:00
135 changed files with 9058 additions and 10193 deletions

View File

@ -11,10 +11,10 @@
"toolchain": {
"required": ["go"],
"post_generate": [
"go run ./bundle/internal/bundle/schema/main.go ./bundle/schema/docs/bundle_descriptions.json",
"go run ./bundle/internal/schema/*.go ./bundle/schema/jsonschema.json",
"echo 'bundle/internal/tf/schema/\\*.go linguist-generated=true' >> ./.gitattributes",
"echo 'go.sum linguist-generated=true' >> ./.gitattributes",
"echo 'bundle/schema/docs/bundle_descriptions.json linguist-generated=true' >> ./.gitattributes"
"echo 'bundle/schema/jsonschema.json linguist-generated=true' >> ./.gitattributes"
]
}
}

.gitattributes
View File

@ -120,4 +120,4 @@ cmd/workspace/workspace-conf/workspace-conf.go linguist-generated=true
cmd/workspace/workspace/workspace.go linguist-generated=true
bundle/internal/tf/schema/\*.go linguist-generated=true
go.sum linguist-generated=true
bundle/schema/docs/bundle_descriptions.json linguist-generated=true
bundle/schema/jsonschema.json linguist-generated=true

View File

@ -33,7 +33,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.22.7
- name: Setup Python
uses: actions/setup-python@v5
@ -68,7 +68,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.22.7
# No need to download cached dependencies when running gofmt.
cache: false
@ -100,18 +100,25 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.22.7
# Github repo: https://github.com/ajv-validator/ajv-cli
- name: Install ajv-cli
run: npm install -g ajv-cli@5.0.0
# Assert that the generated bundle schema is a valid JSON schema by using
# ajv-cli to validate it against a sample configuration file.
# ajv-cli to validate it against bundle configuration files.
# By default the ajv-cli runs in strict mode which will fail if the schema
# itself is not valid. Strict mode is more strict than the JSON schema
# specification. See for details: https://ajv.js.org/options.html#strict-mode-options
- name: Validate bundle schema
run: |
go run main.go bundle schema > schema.json
ajv -s schema.json -d ./bundle/tests/basic/databricks.yml
for file in ./bundle/internal/schema/testdata/pass/*.yml; do
ajv test -s schema.json -d $file --valid
done
for file in ./bundle/internal/schema/testdata/fail/*.yml; do
ajv test -s schema.json -d $file --invalid
done

View File

@ -21,7 +21,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.22.7
# The default cache key for this action considers only the `go.sum` file.
# We include .goreleaser.yaml here to differentiate from the cache used by the push action

View File

@ -22,7 +22,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: 1.22.x
go-version: 1.22.7
# The default cache key for this action considers only the `go.sum` file.
# We include .goreleaser.yaml here to differentiate from the cache used by the push action

View File

@ -1,5 +1,32 @@
# Version changelog
## [Release] Release v0.228.1
Bundles:
* Added listing cluster filtering for cluster lookups ([#1754](https://github.com/databricks/cli/pull/1754)).
* Expand library globs relative to the sync root ([#1756](https://github.com/databricks/cli/pull/1756)).
* Fixed generated YAML missing 'default' for empty values ([#1765](https://github.com/databricks/cli/pull/1765)).
* Use periodic triggers in all templates ([#1739](https://github.com/databricks/cli/pull/1739)).
* Use the friendly name of service principals when shortening their name ([#1770](https://github.com/databricks/cli/pull/1770)).
* Fixed detecting full syntax variable override which includes type field ([#1775](https://github.com/databricks/cli/pull/1775)).
Internal:
* Pass copy of `dyn.Path` to callback function ([#1747](https://github.com/databricks/cli/pull/1747)).
* Make bundle JSON schema modular with `$defs` ([#1700](https://github.com/databricks/cli/pull/1700)).
* Alias variables block in the `Target` struct ([#1748](https://github.com/databricks/cli/pull/1748)).
* Add end to end integration tests for bundle JSON schema ([#1726](https://github.com/databricks/cli/pull/1726)).
* Fix artifact upload integration tests ([#1767](https://github.com/databricks/cli/pull/1767)).
API Changes:
* Added `databricks quality-monitors regenerate-dashboard` command.
OpenAPI commit d05898328669a3f8ab0c2ecee37db2673d3ea3f7 (2024-09-04)
Dependency updates:
* Bump golang.org/x/term from 0.23.0 to 0.24.0 ([#1757](https://github.com/databricks/cli/pull/1757)).
* Bump golang.org/x/oauth2 from 0.22.0 to 0.23.0 ([#1761](https://github.com/databricks/cli/pull/1761)).
* Bump golang.org/x/text from 0.17.0 to 0.18.0 ([#1759](https://github.com/databricks/cli/pull/1759)).
* Bump github.com/databricks/databricks-sdk-go from 0.45.0 to 0.46.0 ([#1760](https://github.com/databricks/cli/pull/1760)).
## [Release] Release v0.228.0
CLI:

View File

@ -10,6 +10,7 @@ import (
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/internal/bundletest"
"github.com/databricks/cli/internal/testutil"
"github.com/databricks/cli/libs/dyn"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
@ -36,7 +37,7 @@ func TestExpandGlobs_Nominal(t *testing.T) {
},
}
bundletest.SetLocation(b, "artifacts", filepath.Join(tmpDir, "databricks.yml"))
bundletest.SetLocation(b, "artifacts", []dyn.Location{{File: filepath.Join(tmpDir, "databricks.yml")}})
ctx := context.Background()
diags := bundle.Apply(ctx, b, bundle.Seq(
@ -77,7 +78,7 @@ func TestExpandGlobs_InvalidPattern(t *testing.T) {
},
}
bundletest.SetLocation(b, "artifacts", filepath.Join(tmpDir, "databricks.yml"))
bundletest.SetLocation(b, "artifacts", []dyn.Location{{File: filepath.Join(tmpDir, "databricks.yml")}})
ctx := context.Background()
diags := bundle.Apply(ctx, b, bundle.Seq(
@ -125,7 +126,7 @@ func TestExpandGlobs_NoMatches(t *testing.T) {
},
}
bundletest.SetLocation(b, "artifacts", filepath.Join(tmpDir, "databricks.yml"))
bundletest.SetLocation(b, "artifacts", []dyn.Location{{File: filepath.Join(tmpDir, "databricks.yml")}})
ctx := context.Background()
diags := bundle.Apply(ctx, b, bundle.Seq(

View File

@ -38,8 +38,11 @@ type Bundle struct {
// Annotated readonly as this should be set at the target level.
Mode Mode `json:"mode,omitempty" bundle:"readonly"`
// Overrides the compute used for jobs and other supported assets.
ComputeID string `json:"compute_id,omitempty"`
// DEPRECATED: Overrides the compute used for jobs and other supported assets.
ComputeId string `json:"compute_id,omitempty"`
// Overrides the cluster used for jobs and other supported assets.
ClusterId string `json:"cluster_id,omitempty"`
// Deployment section specifies deployment related configuration for bundle
Deployment Deployment `json:"deployment,omitempty"`

View File

@ -25,6 +25,20 @@ func ConvertJobToValue(job *jobs.Job) (dyn.Value, error) {
value["tasks"] = dyn.NewValue(tasks, []dyn.Location{{Line: jobOrder.Get("tasks")}})
}
// We're processing job.Settings.Parameters separately to retain empty default values.
if len(job.Settings.Parameters) > 0 {
params := make([]dyn.Value, 0)
for _, parameter := range job.Settings.Parameters {
p := map[string]dyn.Value{
"name": dyn.NewValue(parameter.Name, []dyn.Location{{Line: 0}}), // We use Line: 0 to ensure that the name goes first.
"default": dyn.NewValue(parameter.Default, []dyn.Location{{Line: 1}}),
}
params = append(params, dyn.NewValue(p, []dyn.Location{}))
}
value["parameters"] = dyn.NewValue(params, []dyn.Location{{Line: jobOrder.Get("parameters")}})
}
return yamlsaver.ConvertToMapValue(job.Settings, jobOrder, []string{"format", "new_cluster", "existing_cluster_id"}, value)
}

View File

@ -160,6 +160,21 @@ func (m *applyPresets) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnos
// the Databricks UI and via the SQL API.
}
// Clusters: Prefix, Tags
for _, c := range r.Clusters {
c.ClusterName = prefix + c.ClusterName
if c.CustomTags == nil {
c.CustomTags = make(map[string]string)
}
for _, tag := range tags {
normalisedKey := b.Tagging.NormalizeKey(tag.Key)
normalisedValue := b.Tagging.NormalizeValue(tag.Value)
if _, ok := c.CustomTags[normalisedKey]; !ok {
c.CustomTags[normalisedKey] = normalisedValue
}
}
}
return nil
}

View File

@ -0,0 +1,87 @@
package mutator
import (
"context"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/dyn"
)
type computeIdToClusterId struct{}
func ComputeIdToClusterId() bundle.Mutator {
return &computeIdToClusterId{}
}
func (m *computeIdToClusterId) Name() string {
return "ComputeIdToClusterId"
}
func (m *computeIdToClusterId) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
var diags diag.Diagnostics
// The "compute_id" key is set; rewrite it to "cluster_id".
err := b.Config.Mutate(func(v dyn.Value) (dyn.Value, error) {
v, d := rewriteComputeIdToClusterId(v, dyn.NewPath(dyn.Key("bundle")))
diags = diags.Extend(d)
// Check if the "compute_id" key is set in any target overrides.
return dyn.MapByPattern(v, dyn.NewPattern(dyn.Key("targets"), dyn.AnyKey()), func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
v, d := rewriteComputeIdToClusterId(v, dyn.Path{})
diags = diags.Extend(d)
return v, nil
})
})
diags = diags.Extend(diag.FromErr(err))
return diags
}
func rewriteComputeIdToClusterId(v dyn.Value, p dyn.Path) (dyn.Value, diag.Diagnostics) {
var diags diag.Diagnostics
computeIdPath := p.Append(dyn.Key("compute_id"))
computeId, err := dyn.GetByPath(v, computeIdPath)
// If the "compute_id" key is not set, we don't need to do anything.
if err != nil {
return v, nil
}
if computeId.Kind() == dyn.KindInvalid {
return v, nil
}
diags = diags.Append(diag.Diagnostic{
Severity: diag.Warning,
Summary: "compute_id is deprecated, please use cluster_id instead",
Locations: computeId.Locations(),
Paths: []dyn.Path{computeIdPath},
})
clusterIdPath := p.Append(dyn.Key("cluster_id"))
nv, err := dyn.SetByPath(v, clusterIdPath, computeId)
if err != nil {
return dyn.InvalidValue, diag.FromErr(err)
}
// Drop the "compute_id" key.
vout, err := dyn.Walk(nv, func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
switch len(p) {
case 0:
return v, nil
case 1:
if p[0] == dyn.Key("compute_id") {
return v, dyn.ErrDrop
}
return v, nil
case 2:
if p[1] == dyn.Key("compute_id") {
return v, dyn.ErrDrop
}
}
return v, dyn.ErrSkip
})
diags = diags.Extend(diag.FromErr(err))
return vout, diags
}

View File

@ -0,0 +1,57 @@
package mutator_test
import (
"context"
"testing"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/libs/diag"
"github.com/stretchr/testify/assert"
)
func TestComputeIdToClusterId(t *testing.T) {
b := &bundle.Bundle{
Config: config.Root{
Bundle: config.Bundle{
ComputeId: "compute-id",
},
},
}
diags := bundle.Apply(context.Background(), b, mutator.ComputeIdToClusterId())
assert.NoError(t, diags.Error())
assert.Equal(t, "compute-id", b.Config.Bundle.ClusterId)
assert.Empty(t, b.Config.Bundle.ComputeId)
assert.Len(t, diags, 1)
assert.Equal(t, "compute_id is deprecated, please use cluster_id instead", diags[0].Summary)
assert.Equal(t, diag.Warning, diags[0].Severity)
}
func TestComputeIdToClusterIdInTargetOverride(t *testing.T) {
b := &bundle.Bundle{
Config: config.Root{
Targets: map[string]*config.Target{
"dev": {
ComputeId: "compute-id-dev",
},
},
},
}
diags := bundle.Apply(context.Background(), b, mutator.ComputeIdToClusterId())
assert.NoError(t, diags.Error())
assert.Empty(t, b.Config.Targets["dev"].ComputeId)
diags = diags.Extend(bundle.Apply(context.Background(), b, mutator.SelectTarget("dev")))
assert.NoError(t, diags.Error())
assert.Equal(t, "compute-id-dev", b.Config.Bundle.ClusterId)
assert.Empty(t, b.Config.Bundle.ComputeId)
assert.Len(t, diags, 1)
assert.Equal(t, "compute_id is deprecated, please use cluster_id instead", diags[0].Summary)
assert.Equal(t, diag.Warning, diags[0].Severity)
}

View File

@ -10,6 +10,7 @@ import (
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/bundle/internal/bundletest"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/databricks-sdk-go/service/compute"
"github.com/databricks/databricks-sdk-go/service/pipelines"
"github.com/stretchr/testify/require"
@ -105,8 +106,8 @@ func TestExpandGlobPathsInPipelines(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, "resources.pipelines.pipeline.libraries[3]", filepath.Join(dir, "relative", "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
bundletest.SetLocation(b, "resources.pipelines.pipeline.libraries[3]", []dyn.Location{{File: filepath.Join(dir, "relative", "resource.yml")}})
m := ExpandPipelineGlobPaths()
diags := bundle.Apply(context.Background(), b, m)

View File

@ -23,6 +23,7 @@ func DefaultMutators() []bundle.Mutator {
VerifyCliVersion(),
EnvironmentsToTargets(),
ComputeIdToClusterId(),
InitializeVariables(),
DefineDefaultTarget(),
LoadGitDetails(),

View File

@ -39,22 +39,22 @@ func overrideJobCompute(j *resources.Job, compute string) {
func (m *overrideCompute) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
if b.Config.Bundle.Mode != config.Development {
if b.Config.Bundle.ComputeID != "" {
if b.Config.Bundle.ClusterId != "" {
return diag.Errorf("cannot override compute for an target that does not use 'mode: development'")
}
return nil
}
if v := env.Get(ctx, "DATABRICKS_CLUSTER_ID"); v != "" {
b.Config.Bundle.ComputeID = v
b.Config.Bundle.ClusterId = v
}
if b.Config.Bundle.ComputeID == "" {
if b.Config.Bundle.ClusterId == "" {
return nil
}
r := b.Config.Resources
for i := range r.Jobs {
overrideJobCompute(r.Jobs[i], b.Config.Bundle.ComputeID)
overrideJobCompute(r.Jobs[i], b.Config.Bundle.ClusterId)
}
return nil

View File

@ -20,7 +20,7 @@ func TestOverrideDevelopment(t *testing.T) {
Config: config.Root{
Bundle: config.Bundle{
Mode: config.Development,
ComputeID: "newClusterID",
ClusterId: "newClusterID",
},
Resources: config.Resources{
Jobs: map[string]*resources.Job{
@ -144,7 +144,7 @@ func TestOverrideProduction(t *testing.T) {
b := &bundle.Bundle{
Config: config.Root{
Bundle: config.Bundle{
ComputeID: "newClusterID",
ClusterId: "newClusterID",
},
Resources: config.Resources{
Jobs: map[string]*resources.Job{

View File

@ -0,0 +1,115 @@
package paths
import (
"github.com/databricks/cli/bundle/libraries"
"github.com/databricks/cli/libs/dyn"
)
type jobRewritePattern struct {
pattern dyn.Pattern
kind PathKind
skipRewrite func(string) bool
}
func noSkipRewrite(string) bool {
return false
}
func jobTaskRewritePatterns(base dyn.Pattern) []jobRewritePattern {
return []jobRewritePattern{
{
base.Append(dyn.Key("notebook_task"), dyn.Key("notebook_path")),
PathKindNotebook,
noSkipRewrite,
},
{
base.Append(dyn.Key("spark_python_task"), dyn.Key("python_file")),
PathKindWorkspaceFile,
noSkipRewrite,
},
{
base.Append(dyn.Key("dbt_task"), dyn.Key("project_directory")),
PathKindDirectory,
noSkipRewrite,
},
{
base.Append(dyn.Key("sql_task"), dyn.Key("file"), dyn.Key("path")),
PathKindWorkspaceFile,
noSkipRewrite,
},
{
base.Append(dyn.Key("libraries"), dyn.AnyIndex(), dyn.Key("whl")),
PathKindLibrary,
noSkipRewrite,
},
{
base.Append(dyn.Key("libraries"), dyn.AnyIndex(), dyn.Key("jar")),
PathKindLibrary,
noSkipRewrite,
},
{
base.Append(dyn.Key("libraries"), dyn.AnyIndex(), dyn.Key("requirements")),
PathKindWorkspaceFile,
noSkipRewrite,
},
}
}
func jobRewritePatterns() []jobRewritePattern {
// Base pattern to match all tasks in all jobs.
base := dyn.NewPattern(
dyn.Key("resources"),
dyn.Key("jobs"),
dyn.AnyKey(),
dyn.Key("tasks"),
dyn.AnyIndex(),
)
// Compile list of patterns and their respective rewrite functions.
jobEnvironmentsPatterns := []jobRewritePattern{
{
dyn.NewPattern(
dyn.Key("resources"),
dyn.Key("jobs"),
dyn.AnyKey(),
dyn.Key("environments"),
dyn.AnyIndex(),
dyn.Key("spec"),
dyn.Key("dependencies"),
dyn.AnyIndex(),
),
PathKindWithPrefix,
func(s string) bool {
return !libraries.IsLibraryLocal(s)
},
},
}
taskPatterns := jobTaskRewritePatterns(base)
forEachPatterns := jobTaskRewritePatterns(base.Append(dyn.Key("for_each_task"), dyn.Key("task")))
allPatterns := append(taskPatterns, jobEnvironmentsPatterns...)
allPatterns = append(allPatterns, forEachPatterns...)
return allPatterns
}
// VisitJobPaths visits all paths in job resources and applies a function to each path.
func VisitJobPaths(value dyn.Value, fn VisitFunc) (dyn.Value, error) {
var err error
var newValue = value
for _, rewritePattern := range jobRewritePatterns() {
newValue, err = dyn.MapByPattern(newValue, rewritePattern.pattern, func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
if rewritePattern.skipRewrite(v.MustString()) {
return v, nil
}
return fn(p, rewritePattern.kind, v)
})
if err != nil {
return dyn.InvalidValue, err
}
}
return newValue, nil
}

View File

@ -0,0 +1,168 @@
package paths
import (
"testing"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/libs/dyn"
assert "github.com/databricks/cli/libs/dyn/dynassert"
"github.com/databricks/databricks-sdk-go/service/compute"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/stretchr/testify/require"
)
func TestVisitJobPaths(t *testing.T) {
task0 := jobs.Task{
NotebookTask: &jobs.NotebookTask{
NotebookPath: "abc",
},
}
task1 := jobs.Task{
SparkPythonTask: &jobs.SparkPythonTask{
PythonFile: "abc",
},
}
task2 := jobs.Task{
DbtTask: &jobs.DbtTask{
ProjectDirectory: "abc",
},
}
task3 := jobs.Task{
SqlTask: &jobs.SqlTask{
File: &jobs.SqlTaskFile{
Path: "abc",
},
},
}
task4 := jobs.Task{
Libraries: []compute.Library{
{Whl: "dist/foo.whl"},
},
}
task5 := jobs.Task{
Libraries: []compute.Library{
{Jar: "dist/foo.jar"},
},
}
task6 := jobs.Task{
Libraries: []compute.Library{
{Requirements: "requirements.txt"},
},
}
job0 := &resources.Job{
JobSettings: &jobs.JobSettings{
Tasks: []jobs.Task{
task0,
task1,
task2,
task3,
task4,
task5,
task6,
},
},
}
root := config.Root{
Resources: config.Resources{
Jobs: map[string]*resources.Job{
"job0": job0,
},
},
}
actual := visitJobPaths(t, root)
expected := []dyn.Path{
dyn.MustPathFromString("resources.jobs.job0.tasks[0].notebook_task.notebook_path"),
dyn.MustPathFromString("resources.jobs.job0.tasks[1].spark_python_task.python_file"),
dyn.MustPathFromString("resources.jobs.job0.tasks[2].dbt_task.project_directory"),
dyn.MustPathFromString("resources.jobs.job0.tasks[3].sql_task.file.path"),
dyn.MustPathFromString("resources.jobs.job0.tasks[4].libraries[0].whl"),
dyn.MustPathFromString("resources.jobs.job0.tasks[5].libraries[0].jar"),
dyn.MustPathFromString("resources.jobs.job0.tasks[6].libraries[0].requirements"),
}
assert.ElementsMatch(t, expected, actual)
}
func TestVisitJobPaths_environments(t *testing.T) {
environment0 := jobs.JobEnvironment{
Spec: &compute.Environment{
Dependencies: []string{
"dist_0/*.whl",
"dist_1/*.whl",
},
},
}
job0 := &resources.Job{
JobSettings: &jobs.JobSettings{
Environments: []jobs.JobEnvironment{
environment0,
},
},
}
root := config.Root{
Resources: config.Resources{
Jobs: map[string]*resources.Job{
"job0": job0,
},
},
}
actual := visitJobPaths(t, root)
expected := []dyn.Path{
dyn.MustPathFromString("resources.jobs.job0.environments[0].spec.dependencies[0]"),
dyn.MustPathFromString("resources.jobs.job0.environments[0].spec.dependencies[1]"),
}
assert.ElementsMatch(t, expected, actual)
}
func TestVisitJobPaths_foreach(t *testing.T) {
task0 := jobs.Task{
ForEachTask: &jobs.ForEachTask{
Task: jobs.Task{
NotebookTask: &jobs.NotebookTask{
NotebookPath: "abc",
},
},
},
}
job0 := &resources.Job{
JobSettings: &jobs.JobSettings{
Tasks: []jobs.Task{
task0,
},
},
}
root := config.Root{
Resources: config.Resources{
Jobs: map[string]*resources.Job{
"job0": job0,
},
},
}
actual := visitJobPaths(t, root)
expected := []dyn.Path{
dyn.MustPathFromString("resources.jobs.job0.tasks[0].for_each_task.task.notebook_task.notebook_path"),
}
assert.ElementsMatch(t, expected, actual)
}
func visitJobPaths(t *testing.T, root config.Root) []dyn.Path {
var actual []dyn.Path
err := root.Mutate(func(value dyn.Value) (dyn.Value, error) {
return VisitJobPaths(value, func(p dyn.Path, kind PathKind, v dyn.Value) (dyn.Value, error) {
actual = append(actual, p)
return v, nil
})
})
require.NoError(t, err)
return actual
}

View File

@ -0,0 +1,26 @@
package paths
import "github.com/databricks/cli/libs/dyn"
type PathKind int
const (
// PathKindLibrary is a path to a library file
PathKindLibrary = iota
// PathKindNotebook is a path to a notebook file
PathKindNotebook
// PathKindWorkspaceFile is a path to a regular workspace file,
// notebooks are not allowed because they are uploaded as a special
// kind of workspace object.
PathKindWorkspaceFile
// PathKindWithPrefix is a path that starts with './'
PathKindWithPrefix
// PathKindDirectory is a path to directory
PathKindDirectory
)
type VisitFunc func(path dyn.Path, kind PathKind, value dyn.Value) (dyn.Value, error)

View File

@ -33,7 +33,7 @@ func (m *populateCurrentUser) Apply(ctx context.Context, b *bundle.Bundle) diag.
}
b.Config.Workspace.CurrentUser = &config.User{
ShortName: auth.GetShortUserName(me.UserName),
ShortName: auth.GetShortUserName(me),
User: me,
}

View File

@ -13,6 +13,7 @@ import (
"github.com/databricks/cli/libs/tags"
sdkconfig "github.com/databricks/databricks-sdk-go/config"
"github.com/databricks/databricks-sdk-go/service/catalog"
"github.com/databricks/databricks-sdk-go/service/compute"
"github.com/databricks/databricks-sdk-go/service/iam"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/databricks/databricks-sdk-go/service/ml"
@ -119,6 +120,9 @@ func mockBundle(mode config.Mode) *bundle.Bundle {
Schemas: map[string]*resources.Schema{
"schema1": {CreateSchema: &catalog.CreateSchema{Name: "schema1"}},
},
Clusters: map[string]*resources.Cluster{
"cluster1": {ClusterSpec: &compute.ClusterSpec{ClusterName: "cluster1", SparkVersion: "13.2.x", NumWorkers: 1}},
},
},
},
// Use AWS implementation for testing.
@ -177,6 +181,9 @@ func TestProcessTargetModeDevelopment(t *testing.T) {
// Schema 1
assert.Equal(t, "dev_lennart_schema1", b.Config.Resources.Schemas["schema1"].Name)
// Clusters
assert.Equal(t, "[dev lennart] cluster1", b.Config.Resources.Clusters["cluster1"].ClusterName)
}
func TestProcessTargetModeDevelopmentTagNormalizationForAws(t *testing.T) {
@ -281,6 +288,7 @@ func TestProcessTargetModeDefault(t *testing.T) {
assert.Equal(t, "servingendpoint1", b.Config.Resources.ModelServingEndpoints["servingendpoint1"].Name)
assert.Equal(t, "registeredmodel1", b.Config.Resources.RegisteredModels["registeredmodel1"].Name)
assert.Equal(t, "qualityMonitor1", b.Config.Resources.QualityMonitors["qualityMonitor1"].TableName)
assert.Equal(t, "cluster1", b.Config.Resources.Clusters["cluster1"].ClusterName)
}
func TestProcessTargetModeProduction(t *testing.T) {
@ -312,6 +320,7 @@ func TestProcessTargetModeProduction(t *testing.T) {
b.Config.Resources.Experiments["experiment2"].Permissions = permissions
b.Config.Resources.Models["model1"].Permissions = permissions
b.Config.Resources.ModelServingEndpoints["servingendpoint1"].Permissions = permissions
b.Config.Resources.Clusters["cluster1"].Permissions = permissions
diags = validateProductionMode(context.Background(), b, false)
require.NoError(t, diags.Error())
@ -322,6 +331,7 @@ func TestProcessTargetModeProduction(t *testing.T) {
assert.Equal(t, "servingendpoint1", b.Config.Resources.ModelServingEndpoints["servingendpoint1"].Name)
assert.Equal(t, "registeredmodel1", b.Config.Resources.RegisteredModels["registeredmodel1"].Name)
assert.Equal(t, "qualityMonitor1", b.Config.Resources.QualityMonitors["qualityMonitor1"].TableName)
assert.Equal(t, "cluster1", b.Config.Resources.Clusters["cluster1"].ClusterName)
}
func TestProcessTargetModeProductionOkForPrincipal(t *testing.T) {

View File

@ -9,6 +9,7 @@ import (
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/internal/bundletest"
"github.com/databricks/cli/libs/dyn"
"github.com/stretchr/testify/assert"
)
@ -33,12 +34,12 @@ func TestRewriteSyncPathsRelative(t *testing.T) {
},
}
bundletest.SetLocation(b, "sync.paths[0]", "./databricks.yml")
bundletest.SetLocation(b, "sync.paths[1]", "./databricks.yml")
bundletest.SetLocation(b, "sync.include[0]", "./file.yml")
bundletest.SetLocation(b, "sync.include[1]", "./a/file.yml")
bundletest.SetLocation(b, "sync.exclude[0]", "./a/b/file.yml")
bundletest.SetLocation(b, "sync.exclude[1]", "./a/b/c/file.yml")
bundletest.SetLocation(b, "sync.paths[0]", []dyn.Location{{File: "./databricks.yml"}})
bundletest.SetLocation(b, "sync.paths[1]", []dyn.Location{{File: "./databricks.yml"}})
bundletest.SetLocation(b, "sync.include[0]", []dyn.Location{{File: "./file.yml"}})
bundletest.SetLocation(b, "sync.include[1]", []dyn.Location{{File: "./a/file.yml"}})
bundletest.SetLocation(b, "sync.exclude[0]", []dyn.Location{{File: "./a/b/file.yml"}})
bundletest.SetLocation(b, "sync.exclude[1]", []dyn.Location{{File: "./a/b/c/file.yml"}})
diags := bundle.Apply(context.Background(), b, mutator.RewriteSyncPaths())
assert.NoError(t, diags.Error())
@ -72,12 +73,12 @@ func TestRewriteSyncPathsAbsolute(t *testing.T) {
},
}
bundletest.SetLocation(b, "sync.paths[0]", "/tmp/dir/databricks.yml")
bundletest.SetLocation(b, "sync.paths[1]", "/tmp/dir/databricks.yml")
bundletest.SetLocation(b, "sync.include[0]", "/tmp/dir/file.yml")
bundletest.SetLocation(b, "sync.include[1]", "/tmp/dir/a/file.yml")
bundletest.SetLocation(b, "sync.exclude[0]", "/tmp/dir/a/b/file.yml")
bundletest.SetLocation(b, "sync.exclude[1]", "/tmp/dir/a/b/c/file.yml")
bundletest.SetLocation(b, "sync.paths[0]", []dyn.Location{{File: "/tmp/dir/databricks.yml"}})
bundletest.SetLocation(b, "sync.paths[1]", []dyn.Location{{File: "/tmp/dir/databricks.yml"}})
bundletest.SetLocation(b, "sync.include[0]", []dyn.Location{{File: "/tmp/dir/file.yml"}})
bundletest.SetLocation(b, "sync.include[1]", []dyn.Location{{File: "/tmp/dir/a/file.yml"}})
bundletest.SetLocation(b, "sync.exclude[0]", []dyn.Location{{File: "/tmp/dir/a/b/file.yml"}})
bundletest.SetLocation(b, "sync.exclude[1]", []dyn.Location{{File: "/tmp/dir/a/b/c/file.yml"}})
diags := bundle.Apply(context.Background(), b, mutator.RewriteSyncPaths())
assert.NoError(t, diags.Error())

View File

@ -32,6 +32,7 @@ func allResourceTypes(t *testing.T) []string {
// the dyn library gives us the correct list of all resources supported. Please
// also update this check when adding a new resource
require.Equal(t, []string{
"clusters",
"experiments",
"jobs",
"model_serving_endpoints",
@ -133,6 +134,7 @@ func TestRunAsErrorForUnsupportedResources(t *testing.T) {
// some point in the future. These resources are (implicitly) on the deny list, since
// they are not on the allow list below.
allowList := []string{
"clusters",
"jobs",
"models",
"registered_models",

View File

@ -9,6 +9,7 @@ import (
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/internal/bundletest"
"github.com/databricks/cli/libs/dyn"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
@ -184,7 +185,7 @@ func TestSyncInferRoot_Error(t *testing.T) {
},
}
bundletest.SetLocation(b, "sync.paths", "databricks.yml")
bundletest.SetLocation(b, "sync.paths", []dyn.Location{{File: "databricks.yml"}})
ctx := context.Background()
diags := bundle.Apply(ctx, b, mutator.SyncInferRoot())

View File

@ -4,97 +4,11 @@ import (
"fmt"
"slices"
"github.com/databricks/cli/bundle/libraries"
"github.com/databricks/cli/bundle/config/mutator/paths"
"github.com/databricks/cli/libs/dyn"
)
type jobRewritePattern struct {
pattern dyn.Pattern
fn rewriteFunc
skipRewrite func(string) bool
}
func noSkipRewrite(string) bool {
return false
}
func rewritePatterns(t *translateContext, base dyn.Pattern) []jobRewritePattern {
return []jobRewritePattern{
{
base.Append(dyn.Key("notebook_task"), dyn.Key("notebook_path")),
t.translateNotebookPath,
noSkipRewrite,
},
{
base.Append(dyn.Key("spark_python_task"), dyn.Key("python_file")),
t.translateFilePath,
noSkipRewrite,
},
{
base.Append(dyn.Key("dbt_task"), dyn.Key("project_directory")),
t.translateDirectoryPath,
noSkipRewrite,
},
{
base.Append(dyn.Key("sql_task"), dyn.Key("file"), dyn.Key("path")),
t.translateFilePath,
noSkipRewrite,
},
{
base.Append(dyn.Key("libraries"), dyn.AnyIndex(), dyn.Key("whl")),
t.translateNoOp,
noSkipRewrite,
},
{
base.Append(dyn.Key("libraries"), dyn.AnyIndex(), dyn.Key("jar")),
t.translateNoOp,
noSkipRewrite,
},
{
base.Append(dyn.Key("libraries"), dyn.AnyIndex(), dyn.Key("requirements")),
t.translateFilePath,
noSkipRewrite,
},
}
}
func (t *translateContext) jobRewritePatterns() []jobRewritePattern {
// Base pattern to match all tasks in all jobs.
base := dyn.NewPattern(
dyn.Key("resources"),
dyn.Key("jobs"),
dyn.AnyKey(),
dyn.Key("tasks"),
dyn.AnyIndex(),
)
// Compile list of patterns and their respective rewrite functions.
jobEnvironmentsPatterns := []jobRewritePattern{
{
dyn.NewPattern(
dyn.Key("resources"),
dyn.Key("jobs"),
dyn.AnyKey(),
dyn.Key("environments"),
dyn.AnyIndex(),
dyn.Key("spec"),
dyn.Key("dependencies"),
dyn.AnyIndex(),
),
t.translateNoOpWithPrefix,
func(s string) bool {
return !libraries.IsLibraryLocal(s)
},
},
}
taskPatterns := rewritePatterns(t, base)
forEachPatterns := rewritePatterns(t, base.Append(dyn.Key("for_each_task"), dyn.Key("task")))
allPatterns := append(taskPatterns, jobEnvironmentsPatterns...)
allPatterns = append(allPatterns, forEachPatterns...)
return allPatterns
}
func (t *translateContext) applyJobTranslations(v dyn.Value) (dyn.Value, error) {
var err error
@ -111,8 +25,7 @@ func (t *translateContext) applyJobTranslations(v dyn.Value) (dyn.Value, error)
}
}
for _, rewritePattern := range t.jobRewritePatterns() {
v, err = dyn.MapByPattern(v, rewritePattern.pattern, func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
return paths.VisitJobPaths(v, func(p dyn.Path, kind paths.PathKind, v dyn.Value) (dyn.Value, error) {
key := p[2].Key()
// Skip path translation if the job is using git source.
@ -125,16 +38,28 @@ func (t *translateContext) applyJobTranslations(v dyn.Value) (dyn.Value, error)
return dyn.InvalidValue, fmt.Errorf("unable to determine directory for job %s: %w", key, err)
}
sv := v.MustString()
if rewritePattern.skipRewrite(sv) {
return v, nil
}
return t.rewriteRelativeTo(p, v, rewritePattern.fn, dir, fallback[key])
})
rewritePatternFn, err := t.getRewritePatternFn(kind)
if err != nil {
return dyn.InvalidValue, err
}
return t.rewriteRelativeTo(p, v, rewritePatternFn, dir, fallback[key])
})
}
return v, nil
func (t *translateContext) getRewritePatternFn(kind paths.PathKind) (rewriteFunc, error) {
switch kind {
case paths.PathKindLibrary:
return t.translateNoOp, nil
case paths.PathKindNotebook:
return t.translateNotebookPath, nil
case paths.PathKindWorkspaceFile:
return t.translateFilePath, nil
case paths.PathKindDirectory:
return t.translateDirectoryPath, nil
case paths.PathKindWithPrefix:
return t.translateNoOpWithPrefix, nil
}
return nil, fmt.Errorf("unsupported path kind: %d", kind)
}

View File

@ -82,7 +82,7 @@ func TestTranslatePathsSkippedWithGitSource(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
require.NoError(t, diags.Error())
@ -210,7 +210,7 @@ func TestTranslatePaths(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
require.NoError(t, diags.Error())
@ -346,8 +346,8 @@ func TestTranslatePathsInSubdirectories(t *testing.T) {
},
}
bundletest.SetLocation(b, "resources.jobs", filepath.Join(dir, "job/resource.yml"))
bundletest.SetLocation(b, "resources.pipelines", filepath.Join(dir, "pipeline/resource.yml"))
bundletest.SetLocation(b, "resources.jobs", []dyn.Location{{File: filepath.Join(dir, "job/resource.yml")}})
bundletest.SetLocation(b, "resources.pipelines", []dyn.Location{{File: filepath.Join(dir, "pipeline/resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
require.NoError(t, diags.Error())
@ -408,7 +408,7 @@ func TestTranslatePathsOutsideSyncRoot(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "../resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "../resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.ErrorContains(t, diags.Error(), "is not contained in sync root path")
@ -439,7 +439,7 @@ func TestJobNotebookDoesNotExistError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "fake.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "fake.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.EqualError(t, diags.Error(), "notebook ./doesnt_exist.py not found")
@ -470,7 +470,7 @@ func TestJobFileDoesNotExistError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "fake.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "fake.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.EqualError(t, diags.Error(), "file ./doesnt_exist.py not found")
@ -501,7 +501,7 @@ func TestPipelineNotebookDoesNotExistError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "fake.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "fake.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.EqualError(t, diags.Error(), "notebook ./doesnt_exist.py not found")
@ -532,7 +532,7 @@ func TestPipelineFileDoesNotExistError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "fake.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "fake.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.EqualError(t, diags.Error(), "file ./doesnt_exist.py not found")
@ -567,7 +567,7 @@ func TestJobSparkPythonTaskWithNotebookSourceError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.ErrorContains(t, diags.Error(), `expected a file for "resources.jobs.job.tasks[0].spark_python_task.python_file" but got a notebook`)
@ -602,7 +602,7 @@ func TestJobNotebookTaskWithFileSourceError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.ErrorContains(t, diags.Error(), `expected a notebook for "resources.jobs.job.tasks[0].notebook_task.notebook_path" but got a file`)
@ -637,7 +637,7 @@ func TestPipelineNotebookLibraryWithFileSourceError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.ErrorContains(t, diags.Error(), `expected a notebook for "resources.pipelines.pipeline.libraries[0].notebook.path" but got a file`)
@ -672,7 +672,7 @@ func TestPipelineFileLibraryWithNotebookSourceError(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
assert.ErrorContains(t, diags.Error(), `expected a file for "resources.pipelines.pipeline.libraries[0].file.path" but got a notebook`)
@ -710,7 +710,7 @@ func TestTranslatePathJobEnvironments(t *testing.T) {
},
}
bundletest.SetLocation(b, "resources.jobs", filepath.Join(dir, "job/resource.yml"))
bundletest.SetLocation(b, "resources.jobs", []dyn.Location{{File: filepath.Join(dir, "job/resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
require.NoError(t, diags.Error())
@ -753,8 +753,8 @@ func TestTranslatePathWithComplexVariables(t *testing.T) {
},
}
bundletest.SetLocation(b, "variables", filepath.Join(dir, "variables/variables.yml"))
bundletest.SetLocation(b, "resources.jobs", filepath.Join(dir, "job/resource.yml"))
bundletest.SetLocation(b, "variables", []dyn.Location{{File: filepath.Join(dir, "variables/variables.yml")}})
bundletest.SetLocation(b, "resources.jobs", []dyn.Location{{File: filepath.Join(dir, "job/resource.yml")}})
ctx := context.Background()
// Assign the variables to the dynamic configuration.

View File

@ -19,6 +19,7 @@ type Resources struct {
RegisteredModels map[string]*resources.RegisteredModel `json:"registered_models,omitempty"`
QualityMonitors map[string]*resources.QualityMonitor `json:"quality_monitors,omitempty"`
Schemas map[string]*resources.Schema `json:"schemas,omitempty"`
Clusters map[string]*resources.Cluster `json:"clusters,omitempty"`
}
type ConfigResource interface {

View File

@ -0,0 +1,39 @@
package resources
import (
"context"
"github.com/databricks/cli/libs/log"
"github.com/databricks/databricks-sdk-go"
"github.com/databricks/databricks-sdk-go/marshal"
"github.com/databricks/databricks-sdk-go/service/compute"
)
type Cluster struct {
ID string `json:"id,omitempty" bundle:"readonly"`
Permissions []Permission `json:"permissions,omitempty"`
ModifiedStatus ModifiedStatus `json:"modified_status,omitempty" bundle:"internal"`
*compute.ClusterSpec
}
func (s *Cluster) UnmarshalJSON(b []byte) error {
return marshal.Unmarshal(b, s)
}
func (s Cluster) MarshalJSON() ([]byte, error) {
return marshal.Marshal(s)
}
func (s *Cluster) Exists(ctx context.Context, w *databricks.WorkspaceClient, id string) (bool, error) {
_, err := w.Clusters.GetByClusterId(ctx, id)
if err != nil {
log.Debugf(ctx, "cluster %s does not exist", id)
return false, err
}
return true, nil
}
func (s *Cluster) TerraformResourceName() string {
return "databricks_cluster"
}

View File

@ -366,9 +366,9 @@ func (r *Root) MergeTargetOverrides(name string) error {
}
}
// Merge `compute_id`. This field must be overwritten if set, not merged.
if v := target.Get("compute_id"); v.Kind() != dyn.KindInvalid {
root, err = dyn.SetByPath(root, dyn.NewPath(dyn.Key("bundle"), dyn.Key("compute_id")), v)
// Merge `cluster_id`. This field must be overwritten if set, not merged.
if v := target.Get("cluster_id"); v.Kind() != dyn.KindInvalid {
root, err = dyn.SetByPath(root, dyn.NewPath(dyn.Key("bundle"), dyn.Key("cluster_id")), v)
if err != nil {
return err
}
@ -409,15 +409,51 @@ func (r *Root) MergeTargetOverrides(name string) error {
var variableKeywords = []string{"default", "lookup"}
// isFullVariableOverrideDef checks if the given value is a full syntax variable override.
// A full syntax variable override is a map with only one of the following
// keys: "default", "lookup".
// A full syntax variable override is a map with either 1 of 2 keys.
// If it's 2 keys, the keys should be "default" and "type".
// If it's 1 key, the key should be one of the following keys: "default", "lookup".
func isFullVariableOverrideDef(v dyn.Value) bool {
mv, ok := v.AsMap()
if !ok {
return false
}
if mv.Len() != 1 {
// If the map has more than 3 keys, it is not a full variable override.
if mv.Len() > 3 {
return false
}
// If the map has 3 keys, they should be "description", "type" and "default" or "lookup"
if mv.Len() == 3 {
if _, ok := mv.GetByString("type"); ok {
if _, ok := mv.GetByString("description"); ok {
if _, ok := mv.GetByString("default"); ok {
return true
}
}
}
return false
}
// If the map has 2 keys, one of them should be "default" or "lookup" and the other is "type" or "description"
if mv.Len() == 2 {
if _, ok := mv.GetByString("type"); ok {
if _, ok := mv.GetByString("default"); ok {
return true
}
}
if _, ok := mv.GetByString("description"); ok {
if _, ok := mv.GetByString("default"); ok {
return true
}
if _, ok := mv.GetByString("lookup"); ok {
return true
}
}
return false
}

View File

@ -6,6 +6,7 @@ import (
"testing"
"github.com/databricks/cli/bundle/config/variable"
"github.com/databricks/cli/libs/dyn"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
@ -139,7 +140,7 @@ func TestRootMergeTargetOverridesWithVariables(t *testing.T) {
},
Targets: map[string]*Target{
"development": {
Variables: map[string]*variable.Variable{
Variables: map[string]*variable.TargetVariable{
"foo": {
Default: "bar",
Description: "wrong",
@ -169,3 +170,87 @@ func TestRootMergeTargetOverridesWithVariables(t *testing.T) {
assert.Equal(t, "complex var", root.Variables["complex"].Description)
}
func TestIsFullVariableOverrideDef(t *testing.T) {
testCases := []struct {
value dyn.Value
expected bool
}{
{
value: dyn.V(map[string]dyn.Value{
"type": dyn.V("string"),
"default": dyn.V("foo"),
"description": dyn.V("foo var"),
}),
expected: true,
},
{
value: dyn.V(map[string]dyn.Value{
"type": dyn.V("string"),
"lookup": dyn.V("foo"),
"description": dyn.V("foo var"),
}),
expected: false,
},
{
value: dyn.V(map[string]dyn.Value{
"type": dyn.V("string"),
"default": dyn.V("foo"),
}),
expected: true,
},
{
value: dyn.V(map[string]dyn.Value{
"type": dyn.V("string"),
"lookup": dyn.V("foo"),
}),
expected: false,
},
{
value: dyn.V(map[string]dyn.Value{
"description": dyn.V("string"),
"default": dyn.V("foo"),
}),
expected: true,
},
{
value: dyn.V(map[string]dyn.Value{
"description": dyn.V("string"),
"lookup": dyn.V("foo"),
}),
expected: true,
},
{
value: dyn.V(map[string]dyn.Value{
"default": dyn.V("foo"),
}),
expected: true,
},
{
value: dyn.V(map[string]dyn.Value{
"lookup": dyn.V("foo"),
}),
expected: true,
},
{
value: dyn.V(map[string]dyn.Value{
"type": dyn.V("string"),
}),
expected: false,
},
{
value: dyn.V(map[string]dyn.Value{
"type": dyn.V("string"),
"default": dyn.V("foo"),
"description": dyn.V("foo var"),
"lookup": dyn.V("foo"),
}),
expected: false,
},
}
for i, tc := range testCases {
assert.Equal(t, tc.expected, isFullVariableOverrideDef(tc.value), "test case %d", i)
}
}

View File

@ -24,8 +24,11 @@ type Target struct {
// name prefix of deployed resources.
Presets Presets `json:"presets,omitempty"`
// Overrides the compute used for jobs and other supported assets.
ComputeID string `json:"compute_id,omitempty"`
// DEPRECATED: Overrides the compute used for jobs and other supported assets.
ComputeId string `json:"compute_id,omitempty"`
// Overrides the cluster used for jobs and other supported assets.
ClusterId string `json:"cluster_id,omitempty"`
Bundle *Bundle `json:"bundle,omitempty"`
@ -38,7 +41,26 @@ type Target struct {
// Override default values or lookup name for defined variables
// Does not permit defining new variables or redefining existing ones
// in the scope of an target
Variables map[string]*variable.Variable `json:"variables,omitempty"`
//
// There are two valid ways to define a variable override in a target:
// 1. Direct value override. We normalize this to the variable.Variable
// struct format when loading the configuration YAML:
//
// variables:
// foo: "value"
//
// 2. Override matching the variable.Variable struct.
//
// variables:
// foo:
// default: "value"
//
// OR
//
// variables:
// foo:
// lookup: "resource_name"
Variables map[string]*variable.TargetVariable `json:"variables,omitempty"`
Git Git `json:"git,omitempty"`

View File

@ -0,0 +1,161 @@
package validate
import (
"context"
"fmt"
"strings"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/databricks-sdk-go/service/jobs"
)
// JobTaskClusterSpec validates that job tasks have a cluster spec defined
// if the task requires a cluster.
func JobTaskClusterSpec() bundle.ReadOnlyMutator {
return &jobTaskClusterSpec{}
}
type jobTaskClusterSpec struct {
}
func (v *jobTaskClusterSpec) Name() string {
return "validate:job_task_cluster_spec"
}
func (v *jobTaskClusterSpec) Apply(ctx context.Context, rb bundle.ReadOnlyBundle) diag.Diagnostics {
diags := diag.Diagnostics{}
jobsPath := dyn.NewPath(dyn.Key("resources"), dyn.Key("jobs"))
for resourceName, job := range rb.Config().Resources.Jobs {
resourcePath := jobsPath.Append(dyn.Key(resourceName))
for taskIndex, task := range job.Tasks {
taskPath := resourcePath.Append(dyn.Key("tasks"), dyn.Index(taskIndex))
diags = diags.Extend(validateJobTask(rb, task, taskPath))
}
}
return diags
}
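
A minimal sketch (the job and task names are made up) of how the dyn paths built above render as strings when attached to a diagnostic; this is the same shape as the error paths asserted in the test further down:

```go
package main

import (
	"fmt"

	"github.com/databricks/cli/libs/dyn"
)

func main() {
	// Build the same shape of path the validator constructs per task.
	taskPath := dyn.NewPath(
		dyn.Key("resources"), dyn.Key("jobs"), dyn.Key("job1"),
		dyn.Key("tasks"), dyn.Index(0),
	)

	// Prints "resources.jobs.job1.tasks[0]".
	fmt.Println(taskPath.String())
}
```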
func validateJobTask(rb bundle.ReadOnlyBundle, task jobs.Task, taskPath dyn.Path) diag.Diagnostics {
diags := diag.Diagnostics{}
var specified []string
var unspecified []string
if task.JobClusterKey != "" {
specified = append(specified, "job_cluster_key")
} else {
unspecified = append(unspecified, "job_cluster_key")
}
if task.EnvironmentKey != "" {
specified = append(specified, "environment_key")
} else {
unspecified = append(unspecified, "environment_key")
}
if task.ExistingClusterId != "" {
specified = append(specified, "existing_cluster_id")
} else {
unspecified = append(unspecified, "existing_cluster_id")
}
if task.NewCluster != nil {
specified = append(specified, "new_cluster")
} else {
unspecified = append(unspecified, "new_cluster")
}
if task.ForEachTask != nil {
forEachTaskPath := taskPath.Append(dyn.Key("for_each_task"), dyn.Key("task"))
diags = diags.Extend(validateJobTask(rb, task.ForEachTask.Task, forEachTaskPath))
}
if isComputeTask(task) && len(specified) == 0 {
if task.NotebookTask != nil {
// notebook tasks without cluster spec will use notebook environment
} else {
// the path alone might not be very helpful; adding the user-specified task key clarifies the context
detail := fmt.Sprintf(
"Task %q requires a cluster or an environment to run.\nSpecify one of the following fields: %s.",
task.TaskKey,
strings.Join(unspecified, ", "),
)
diags = diags.Append(diag.Diagnostic{
Severity: diag.Error,
Summary: "Missing required cluster or environment settings",
Detail: detail,
Locations: rb.Config().GetLocations(taskPath.String()),
Paths: []dyn.Path{taskPath},
})
}
}
return diags
}
// isComputeTask returns true if the task runs on a cluster or serverless GC
func isComputeTask(task jobs.Task) bool {
if task.NotebookTask != nil {
// if warehouse_id is set, it's a SQL notebook that doesn't need a cluster or serverless GC
if task.NotebookTask.WarehouseId != "" {
return false
} else {
// task settings don't require specifying a cluster/serverless GC, but the task itself can run on one;
// we handle that case separately in validateJobTask
return true
}
}
if task.PythonWheelTask != nil {
return true
}
if task.DbtTask != nil {
return true
}
if task.SparkJarTask != nil {
return true
}
if task.SparkSubmitTask != nil {
return true
}
if task.SparkPythonTask != nil {
return true
}
if task.SqlTask != nil {
return false
}
if task.PipelineTask != nil {
// while pipelines use clusters, pipeline tasks don't; they only trigger pipelines
return false
}
if task.RunJobTask != nil {
return false
}
if task.ConditionTask != nil {
return false
}
// a for_each_task doesn't use clusters itself, though its underlying task(s) can
if task.ForEachTask != nil {
return false
}
return false
}

View File

@ -0,0 +1,203 @@
package validate
import (
"context"
"testing"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/databricks-sdk-go/service/compute"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/stretchr/testify/assert"
)
func TestJobTaskClusterSpec(t *testing.T) {
expectedSummary := "Missing required cluster or environment settings"
type testCase struct {
name string
task jobs.Task
errorPath string
errorDetail string
errorSummary string
}
testCases := []testCase{
{
name: "valid notebook task",
task: jobs.Task{
// while a cluster is needed, it will use notebook environment to create one
NotebookTask: &jobs.NotebookTask{},
},
},
{
name: "valid notebook task (job_cluster_key)",
task: jobs.Task{
JobClusterKey: "cluster1",
NotebookTask: &jobs.NotebookTask{},
},
},
{
name: "valid notebook task (new_cluster)",
task: jobs.Task{
NewCluster: &compute.ClusterSpec{},
NotebookTask: &jobs.NotebookTask{},
},
},
{
name: "valid notebook task (existing_cluster_id)",
task: jobs.Task{
ExistingClusterId: "cluster1",
NotebookTask: &jobs.NotebookTask{},
},
},
{
name: "valid SQL notebook task",
task: jobs.Task{
NotebookTask: &jobs.NotebookTask{
WarehouseId: "warehouse1",
},
},
},
{
name: "valid python wheel task",
task: jobs.Task{
JobClusterKey: "cluster1",
PythonWheelTask: &jobs.PythonWheelTask{},
},
},
{
name: "valid python wheel task (environment_key)",
task: jobs.Task{
EnvironmentKey: "environment1",
PythonWheelTask: &jobs.PythonWheelTask{},
},
},
{
name: "valid dbt task",
task: jobs.Task{
JobClusterKey: "cluster1",
DbtTask: &jobs.DbtTask{},
},
},
{
name: "valid spark jar task",
task: jobs.Task{
JobClusterKey: "cluster1",
SparkJarTask: &jobs.SparkJarTask{},
},
},
{
name: "valid spark submit",
task: jobs.Task{
NewCluster: &compute.ClusterSpec{},
SparkSubmitTask: &jobs.SparkSubmitTask{},
},
},
{
name: "valid spark python task",
task: jobs.Task{
JobClusterKey: "cluster1",
SparkPythonTask: &jobs.SparkPythonTask{},
},
},
{
name: "valid SQL task",
task: jobs.Task{
SqlTask: &jobs.SqlTask{},
},
},
{
name: "valid pipeline task",
task: jobs.Task{
PipelineTask: &jobs.PipelineTask{},
},
},
{
name: "valid run job task",
task: jobs.Task{
RunJobTask: &jobs.RunJobTask{},
},
},
{
name: "valid condition task",
task: jobs.Task{
ConditionTask: &jobs.ConditionTask{},
},
},
{
name: "valid for each task",
task: jobs.Task{
ForEachTask: &jobs.ForEachTask{
Task: jobs.Task{
JobClusterKey: "cluster1",
NotebookTask: &jobs.NotebookTask{},
},
},
},
},
{
name: "invalid python wheel task",
task: jobs.Task{
PythonWheelTask: &jobs.PythonWheelTask{},
TaskKey: "my_task",
},
errorPath: "resources.jobs.job1.tasks[0]",
errorDetail: `Task "my_task" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.`,
errorSummary: expectedSummary,
},
{
name: "invalid for each task",
task: jobs.Task{
ForEachTask: &jobs.ForEachTask{
Task: jobs.Task{
PythonWheelTask: &jobs.PythonWheelTask{},
TaskKey: "my_task",
},
},
},
errorPath: "resources.jobs.job1.tasks[0].for_each_task.task",
errorDetail: `Task "my_task" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.`,
errorSummary: expectedSummary,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
job := &resources.Job{
JobSettings: &jobs.JobSettings{
Tasks: []jobs.Task{tc.task},
},
}
b := createBundle(map[string]*resources.Job{"job1": job})
diags := bundle.ApplyReadOnly(context.Background(), bundle.ReadOnly(b), JobTaskClusterSpec())
if tc.errorPath != "" || tc.errorDetail != "" || tc.errorSummary != "" {
assert.Len(t, diags, 1)
assert.Len(t, diags[0].Paths, 1)
diag := diags[0]
assert.Equal(t, tc.errorPath, diag.Paths[0].String())
assert.Equal(t, tc.errorSummary, diag.Summary)
assert.Equal(t, tc.errorDetail, diag.Detail)
} else {
assert.ElementsMatch(t, []string{}, diags)
}
})
}
}
func createBundle(jobs map[string]*resources.Job) *bundle.Bundle {
return &bundle.Bundle{
Config: config.Root{
Resources: config.Resources{
Jobs: jobs,
},
},
}
}

View File

@ -34,6 +34,7 @@ func (v *validate) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics
JobClusterKeyDefined(),
FilesToSync(),
ValidateSyncPatterns(),
JobTaskClusterSpec(),
))
}

View File

@ -16,6 +16,11 @@ const (
VariableTypeComplex VariableType = "complex"
)
// We alias it here to override the JSON schema associated with a variable value
// in a target override. This is because we allow for directly specifying the value
// in addition to the variable.Variable struct format in a target override.
type TargetVariable Variable
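
A minimal, self-contained sketch (with stand-in struct fields) of why a defined type works for this: under reflection it has its own identity, so a schema generator can match it separately from the original struct.

```go
package main

import (
	"fmt"
	"reflect"
)

// Stand-in for variable.Variable with a reduced set of fields.
type Variable struct {
	Default     any    `json:"default,omitempty"`
	Description string `json:"description,omitempty"`
}

// Same underlying type, but a distinct named type.
type TargetVariable Variable

func main() {
	// The two types compare as different under reflection, which is what
	// allows attaching a different JSON schema to target overrides.
	fmt.Println(reflect.TypeOf(Variable{}) == reflect.TypeOf(TargetVariable{})) // false
}
```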
// An input variable for the bundle config
type Variable struct {
// A type of the variable. This is used to validate the value of the variable

View File

@ -8,9 +8,12 @@ import (
"github.com/databricks/cli/libs/cmdio"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/log"
"github.com/databricks/cli/libs/sync"
)
type upload struct{}
type upload struct {
outputHandler sync.OutputHandler
}
func (m *upload) Name() string {
return "files.Upload"
@ -18,11 +21,18 @@ func (m *upload) Name() string {
func (m *upload) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
cmdio.LogString(ctx, fmt.Sprintf("Uploading bundle files to %s...", b.Config.Workspace.FilePath))
sync, err := GetSync(ctx, bundle.ReadOnly(b))
opts, err := GetSyncOptions(ctx, bundle.ReadOnly(b))
if err != nil {
return diag.FromErr(err)
}
opts.OutputHandler = m.outputHandler
sync, err := sync.New(ctx, *opts)
if err != nil {
return diag.FromErr(err)
}
defer sync.Close()
b.Files, err = sync.RunOnce(ctx)
if err != nil {
return diag.FromErr(err)
@ -32,6 +42,6 @@ func (m *upload) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
return nil
}
func Upload() bundle.Mutator {
return &upload{}
func Upload(outputHandler sync.OutputHandler) bundle.Mutator {
return &upload{outputHandler}
}

View File

@ -9,6 +9,7 @@ import (
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/bundle/internal/bundletest"
"github.com/databricks/cli/bundle/metadata"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
@ -55,9 +56,9 @@ func TestComputeMetadataMutator(t *testing.T) {
},
}
bundletest.SetLocation(b, "resources.jobs.my-job-1", "a/b/c")
bundletest.SetLocation(b, "resources.jobs.my-job-2", "d/e/f")
bundletest.SetLocation(b, "resources.pipelines.my-pipeline", "abc")
bundletest.SetLocation(b, "resources.jobs.my-job-1", []dyn.Location{{File: "a/b/c"}})
bundletest.SetLocation(b, "resources.jobs.my-job-2", []dyn.Location{{File: "d/e/f"}})
bundletest.SetLocation(b, "resources.pipelines.my-pipeline", []dyn.Location{{File: "abc"}})
expectedMetadata := metadata.Metadata{
Version: metadata.Version,

View File

@ -231,6 +231,13 @@ func BundleToTerraform(config *config.Root) *schema.Root {
tfroot.Resource.QualityMonitor[k] = &dst
}
for k, src := range config.Resources.Clusters {
noResources = false
var dst schema.ResourceCluster
conv(src, &dst)
tfroot.Resource.Cluster[k] = &dst
}
// We explicitly set "resource" to nil to omit it from a JSON encoding.
// This is required because the terraform CLI requires >= 1 resources defined
// if the "resource" property is used in a .tf.json file.
@ -394,6 +401,16 @@ func TerraformToBundle(state *resourcesState, config *config.Root) error {
}
cur.ID = instance.Attributes.ID
config.Resources.Schemas[resource.Name] = cur
case "databricks_cluster":
if config.Resources.Clusters == nil {
config.Resources.Clusters = make(map[string]*resources.Cluster)
}
cur := config.Resources.Clusters[resource.Name]
if cur == nil {
cur = &resources.Cluster{ModifiedStatus: resources.ModifiedStatusDeleted}
}
cur.ID = instance.Attributes.ID
config.Resources.Clusters[resource.Name] = cur
case "databricks_permissions":
case "databricks_grants":
// Ignore; no need to pull these back into the configuration.
@ -443,6 +460,11 @@ func TerraformToBundle(state *resourcesState, config *config.Root) error {
src.ModifiedStatus = resources.ModifiedStatusCreated
}
}
for _, src := range config.Resources.Clusters {
if src.ModifiedStatus == "" && src.ID == "" {
src.ModifiedStatus = resources.ModifiedStatusCreated
}
}
return nil
}

View File

@ -663,6 +663,14 @@ func TestTerraformToBundleEmptyLocalResources(t *testing.T) {
{Attributes: stateInstanceAttributes{ID: "1"}},
},
},
{
Type: "databricks_cluster",
Mode: "managed",
Name: "test_cluster",
Instances: []stateResourceInstance{
{Attributes: stateInstanceAttributes{ID: "1"}},
},
},
},
}
err := TerraformToBundle(&tfState, &config)
@ -692,6 +700,9 @@ func TestTerraformToBundleEmptyLocalResources(t *testing.T) {
assert.Equal(t, "1", config.Resources.Schemas["test_schema"].ID)
assert.Equal(t, resources.ModifiedStatusDeleted, config.Resources.Schemas["test_schema"].ModifiedStatus)
assert.Equal(t, "1", config.Resources.Clusters["test_cluster"].ID)
assert.Equal(t, resources.ModifiedStatusDeleted, config.Resources.Clusters["test_cluster"].ModifiedStatus)
AssertFullResourceCoverage(t, &config)
}
@ -754,6 +765,13 @@ func TestTerraformToBundleEmptyRemoteResources(t *testing.T) {
},
},
},
Clusters: map[string]*resources.Cluster{
"test_cluster": {
ClusterSpec: &compute.ClusterSpec{
ClusterName: "test_cluster",
},
},
},
},
}
var tfState = resourcesState{
@ -786,6 +804,9 @@ func TestTerraformToBundleEmptyRemoteResources(t *testing.T) {
assert.Equal(t, "", config.Resources.Schemas["test_schema"].ID)
assert.Equal(t, resources.ModifiedStatusCreated, config.Resources.Schemas["test_schema"].ModifiedStatus)
assert.Equal(t, "", config.Resources.Clusters["test_cluster"].ID)
assert.Equal(t, resources.ModifiedStatusCreated, config.Resources.Clusters["test_cluster"].ModifiedStatus)
AssertFullResourceCoverage(t, &config)
}
@ -888,6 +909,18 @@ func TestTerraformToBundleModifiedResources(t *testing.T) {
},
},
},
Clusters: map[string]*resources.Cluster{
"test_cluster": {
ClusterSpec: &compute.ClusterSpec{
ClusterName: "test_cluster",
},
},
"test_cluster_new": {
ClusterSpec: &compute.ClusterSpec{
ClusterName: "test_cluster_new",
},
},
},
},
}
var tfState = resourcesState{
@ -1020,6 +1053,22 @@ func TestTerraformToBundleModifiedResources(t *testing.T) {
{Attributes: stateInstanceAttributes{ID: "2"}},
},
},
{
Type: "databricks_cluster",
Mode: "managed",
Name: "test_cluster",
Instances: []stateResourceInstance{
{Attributes: stateInstanceAttributes{ID: "1"}},
},
},
{
Type: "databricks_cluster",
Mode: "managed",
Name: "test_cluster_old",
Instances: []stateResourceInstance{
{Attributes: stateInstanceAttributes{ID: "2"}},
},
},
},
}
err := TerraformToBundle(&tfState, &config)
@ -1081,6 +1130,13 @@ func TestTerraformToBundleModifiedResources(t *testing.T) {
assert.Equal(t, "", config.Resources.Schemas["test_schema_new"].ID)
assert.Equal(t, resources.ModifiedStatusCreated, config.Resources.Schemas["test_schema_new"].ModifiedStatus)
assert.Equal(t, "1", config.Resources.Clusters["test_cluster"].ID)
assert.Equal(t, "", config.Resources.Clusters["test_cluster"].ModifiedStatus)
assert.Equal(t, "2", config.Resources.Clusters["test_cluster_old"].ID)
assert.Equal(t, resources.ModifiedStatusDeleted, config.Resources.Clusters["test_cluster_old"].ModifiedStatus)
assert.Equal(t, "", config.Resources.Clusters["test_cluster_new"].ID)
assert.Equal(t, resources.ModifiedStatusCreated, config.Resources.Clusters["test_cluster_new"].ModifiedStatus)
AssertFullResourceCoverage(t, &config)
}

View File

@ -58,6 +58,8 @@ func (m *interpolateMutator) Apply(ctx context.Context, b *bundle.Bundle) diag.D
path = dyn.NewPath(dyn.Key("databricks_quality_monitor")).Append(path[2:]...)
case dyn.Key("schemas"):
path = dyn.NewPath(dyn.Key("databricks_schema")).Append(path[2:]...)
case dyn.Key("clusters"):
path = dyn.NewPath(dyn.Key("databricks_cluster")).Append(path[2:]...)
default:
// Trigger "key not found" for unknown resource types.
return dyn.GetByPath(root, path)

View File

@ -31,6 +31,7 @@ func TestInterpolate(t *testing.T) {
"other_model_serving": "${resources.model_serving_endpoints.other_model_serving.id}",
"other_registered_model": "${resources.registered_models.other_registered_model.id}",
"other_schema": "${resources.schemas.other_schema.id}",
"other_cluster": "${resources.clusters.other_cluster.id}",
},
Tasks: []jobs.Task{
{
@ -67,6 +68,7 @@ func TestInterpolate(t *testing.T) {
assert.Equal(t, "${databricks_model_serving.other_model_serving.id}", j.Tags["other_model_serving"])
assert.Equal(t, "${databricks_registered_model.other_registered_model.id}", j.Tags["other_registered_model"])
assert.Equal(t, "${databricks_schema.other_schema.id}", j.Tags["other_schema"])
assert.Equal(t, "${databricks_cluster.other_cluster.id}", j.Tags["other_cluster"])
m := b.Config.Resources.Models["my_model"]
assert.Equal(t, "my_model", m.Model.Name)

View File

@ -0,0 +1,52 @@
package tfdyn
import (
"context"
"fmt"
"github.com/databricks/cli/bundle/internal/tf/schema"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/cli/libs/dyn/convert"
"github.com/databricks/cli/libs/log"
"github.com/databricks/databricks-sdk-go/service/compute"
)
func convertClusterResource(ctx context.Context, vin dyn.Value) (dyn.Value, error) {
// Normalize the output value to the target schema.
vout, diags := convert.Normalize(compute.ClusterSpec{}, vin)
for _, diag := range diags {
log.Debugf(ctx, "cluster normalization diagnostic: %s", diag.Summary)
}
return vout, nil
}
type clusterConverter struct{}
func (clusterConverter) Convert(ctx context.Context, key string, vin dyn.Value, out *schema.Resources) error {
vout, err := convertClusterResource(ctx, vin)
if err != nil {
return err
}
// We always set no_wait so that DABs don't wait for the cluster to start.
vout, err = dyn.Set(vout, "no_wait", dyn.V(true))
if err != nil {
return err
}
// Add the converted resource to the output.
out.Cluster[key] = vout.AsAny()
// Configure permissions for this resource.
if permissions := convertPermissionsResource(ctx, vin); permissions != nil {
permissions.JobId = fmt.Sprintf("${databricks_cluster.%s.id}", key)
out.Permissions["cluster_"+key] = permissions
}
return nil
}
func init() {
registerConverter("clusters", clusterConverter{})
}

View File

@ -0,0 +1,97 @@
package tfdyn
import (
"context"
"testing"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/bundle/internal/tf/schema"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/cli/libs/dyn/convert"
"github.com/databricks/databricks-sdk-go/service/compute"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestConvertCluster(t *testing.T) {
var src = resources.Cluster{
ClusterSpec: &compute.ClusterSpec{
NumWorkers: 3,
SparkVersion: "13.3.x-scala2.12",
ClusterName: "cluster",
SparkConf: map[string]string{
"spark.executor.memory": "2g",
},
AwsAttributes: &compute.AwsAttributes{
Availability: "ON_DEMAND",
},
AzureAttributes: &compute.AzureAttributes{
Availability: "SPOT",
},
DataSecurityMode: "USER_ISOLATION",
NodeTypeId: "m5.xlarge",
Autoscale: &compute.AutoScale{
MinWorkers: 1,
MaxWorkers: 10,
},
},
Permissions: []resources.Permission{
{
Level: "CAN_RUN",
UserName: "jack@gmail.com",
},
{
Level: "CAN_MANAGE",
ServicePrincipalName: "sp",
},
},
}
vin, err := convert.FromTyped(src, dyn.NilValue)
require.NoError(t, err)
ctx := context.Background()
out := schema.NewResources()
err = clusterConverter{}.Convert(ctx, "my_cluster", vin, out)
require.NoError(t, err)
cluster := out.Cluster["my_cluster"]
assert.Equal(t, map[string]any{
"num_workers": int64(3),
"spark_version": "13.3.x-scala2.12",
"cluster_name": "cluster",
"spark_conf": map[string]any{
"spark.executor.memory": "2g",
},
"aws_attributes": map[string]any{
"availability": "ON_DEMAND",
},
"azure_attributes": map[string]any{
"availability": "SPOT",
},
"data_security_mode": "USER_ISOLATION",
"no_wait": true,
"node_type_id": "m5.xlarge",
"autoscale": map[string]any{
"min_workers": int64(1),
"max_workers": int64(10),
},
}, cluster)
// Assert equality on the permissions
assert.Equal(t, &schema.ResourcePermissions{
JobId: "${databricks_cluster.my_cluster.id}",
AccessControl: []schema.ResourcePermissionsAccessControl{
{
PermissionLevel: "CAN_RUN",
UserName: "jack@gmail.com",
},
{
PermissionLevel: "CAN_MANAGE",
ServicePrincipalName: "sp",
},
},
}, out.Permissions["cluster_my_cluster"])
}

View File

@ -1,42 +0,0 @@
package main
import (
"encoding/json"
"fmt"
"log"
"os"
"github.com/databricks/cli/bundle/schema"
)
func main() {
if len(os.Args) != 2 {
fmt.Println("Usage: go run main.go <output-file>")
os.Exit(1)
}
// Output file, to write the generated schema descriptions to.
outputFile := os.Args[1]
// Input file, the databricks openapi spec.
inputFile := os.Getenv("DATABRICKS_OPENAPI_SPEC")
if inputFile == "" {
log.Fatal("DATABRICKS_OPENAPI_SPEC environment variable not set")
}
// Generate the schema descriptions.
docs, err := schema.UpdateBundleDescriptions(inputFile)
if err != nil {
log.Fatal(err)
}
result, err := json.MarshalIndent(docs, "", " ")
if err != nil {
log.Fatal(err)
}
// Write the schema descriptions to the output file.
err = os.WriteFile(outputFile, result, 0644)
if err != nil {
log.Fatal(err)
}
}

View File

@ -8,15 +8,13 @@ import (
// SetLocation sets the given locations on all bundle configuration values nested under the given path prefix.
// This is useful for testing where we need to associate configuration
// with the location it is loaded from.
func SetLocation(b *bundle.Bundle, prefix string, filePath string) {
func SetLocation(b *bundle.Bundle, prefix string, locations []dyn.Location) {
start := dyn.MustPathFromString(prefix)
b.Config.Mutate(func(root dyn.Value) (dyn.Value, error) {
return dyn.Walk(root, func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
// If the path has the given prefix, set the location.
if p.HasPrefix(start) {
return v.WithLocations([]dyn.Location{{
File: filePath,
}}), nil
return v.WithLocations(locations), nil
}
// The path is not nested under the given prefix.

View File

@ -0,0 +1,109 @@
package main
import (
"encoding/json"
"fmt"
"log"
"os"
"reflect"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/variable"
"github.com/databricks/cli/libs/jsonschema"
)
func interpolationPattern(s string) string {
return fmt.Sprintf(`\$\{(%s(\.[a-zA-Z]+([-_]?[a-zA-Z0-9]+)*(\[[0-9]+\])*)+)\}`, s)
}
func addInterpolationPatterns(typ reflect.Type, s jsonschema.Schema) jsonschema.Schema {
if typ == reflect.TypeOf(config.Root{}) || typ == reflect.TypeOf(variable.Variable{}) {
return s
}
// The variables block in a target override allows for directly specifying
// the value of the variable.
if typ == reflect.TypeOf(variable.TargetVariable{}) {
return jsonschema.Schema{
AnyOf: []jsonschema.Schema{
// We keep the original schema so that autocomplete suggestions
// continue to work.
s,
// All values are valid for a variable value, be it primitive types
// like string/bool or complex ones like objects/arrays. Thus we override
// the schema to allow all valid JSON values.
{},
},
}
}
switch s.Type {
case jsonschema.ArrayType, jsonschema.ObjectType:
// arrays and objects can have complex variable values specified.
return jsonschema.Schema{
AnyOf: []jsonschema.Schema{
s,
{
Type: jsonschema.StringType,
Pattern: interpolationPattern("var"),
}},
}
case jsonschema.IntegerType, jsonschema.NumberType, jsonschema.BooleanType:
// primitives can have variable values, or references like ${bundle.xyz}
// or ${workspace.xyz}
return jsonschema.Schema{
AnyOf: []jsonschema.Schema{
s,
{Type: jsonschema.StringType, Pattern: interpolationPattern("resources")},
{Type: jsonschema.StringType, Pattern: interpolationPattern("bundle")},
{Type: jsonschema.StringType, Pattern: interpolationPattern("workspace")},
{Type: jsonschema.StringType, Pattern: interpolationPattern("artifacts")},
{Type: jsonschema.StringType, Pattern: interpolationPattern("var")},
},
}
default:
return s
}
}
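
As a quick illustration of what these patterns accept, here is a small, self-contained sketch that re-declares the helper above and tests a few references against the "var" pattern; the sample strings are taken from the test fixtures in this change.

```go
package main

import (
	"fmt"
	"regexp"
)

// Re-declaration of the helper above, for a standalone example.
func interpolationPattern(s string) string {
	return fmt.Sprintf(`\$\{(%s(\.[a-zA-Z]+([-_]?[a-zA-Z0-9]+)*(\[[0-9]+\])*)+)\}`, s)
}

func main() {
	varRef := regexp.MustCompile(interpolationPattern("var"))

	fmt.Println(varRef.MatchString("${var.simplevar}"))              // true
	fmt.Println(varRef.MatchString("${var.complexvar.key3[0]}"))     // true
	fmt.Println(varRef.MatchString("${workspace.current_user.me}")) // false: different prefix
}
```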
func main() {
if len(os.Args) != 2 {
fmt.Println("Usage: go run main.go <output-file>")
os.Exit(1)
}
// Output file, where the generated JSON schema will be written to.
outputFile := os.Args[1]
// Input file, the databricks openapi spec.
inputFile := os.Getenv("DATABRICKS_OPENAPI_SPEC")
if inputFile == "" {
log.Fatal("DATABRICKS_OPENAPI_SPEC environment variable not set")
}
p, err := newParser(inputFile)
if err != nil {
log.Fatal(err)
}
// Generate the JSON schema from the bundle Go struct.
s, err := jsonschema.FromType(reflect.TypeOf(config.Root{}), []func(reflect.Type, jsonschema.Schema) jsonschema.Schema{
p.addDescriptions,
p.addEnums,
addInterpolationPatterns,
})
if err != nil {
log.Fatal(err)
}
b, err := json.MarshalIndent(s, "", " ")
if err != nil {
log.Fatal(err)
}
// Write the schema descriptions to the output file.
err = os.WriteFile(outputFile, b, 0644)
if err != nil {
log.Fatal(err)
}
}

View File

@ -0,0 +1,123 @@
package main
import (
"encoding/json"
"fmt"
"os"
"path"
"reflect"
"strings"
"github.com/databricks/cli/libs/jsonschema"
)
type Components struct {
Schemas map[string]jsonschema.Schema `json:"schemas,omitempty"`
}
type Specification struct {
Components Components `json:"components"`
}
type openapiParser struct {
ref map[string]jsonschema.Schema
}
func newParser(path string) (*openapiParser, error) {
b, err := os.ReadFile(path)
if err != nil {
return nil, err
}
spec := Specification{}
err = json.Unmarshal(b, &spec)
if err != nil {
return nil, err
}
p := &openapiParser{}
p.ref = spec.Components.Schemas
return p, nil
}
// This function checks if the input type:
// 1. Is a Databricks Go SDK type.
// 2. Has a Databricks Go SDK type embedded in it.
//
// If the above conditions are met, the function returns the JSON schema
// corresponding to the Databricks Go SDK type from the OpenAPI spec.
func (p *openapiParser) findRef(typ reflect.Type) (jsonschema.Schema, bool) {
typs := []reflect.Type{typ}
// Check for embedded Databricks Go SDK types.
if typ.Kind() == reflect.Struct {
for i := 0; i < typ.NumField(); i++ {
if !typ.Field(i).Anonymous {
continue
}
// Dereference the current type if it's a pointer.
ctyp := typ.Field(i).Type
for ctyp.Kind() == reflect.Ptr {
ctyp = ctyp.Elem()
}
typs = append(typs, ctyp)
}
}
for _, ctyp := range typs {
// Skip if it's not a Go SDK type.
if !strings.HasPrefix(ctyp.PkgPath(), "github.com/databricks/databricks-sdk-go") {
continue
}
pkgName := path.Base(ctyp.PkgPath())
k := fmt.Sprintf("%s.%s", pkgName, ctyp.Name())
// Skip if the type is not in the openapi spec.
_, ok := p.ref[k]
if !ok {
continue
}
// Return the first Go SDK type found in the openapi spec.
return p.ref[k], true
}
return jsonschema.Schema{}, false
}
// Use the OpenAPI spec to load descriptions for the given type.
func (p *openapiParser) addDescriptions(typ reflect.Type, s jsonschema.Schema) jsonschema.Schema {
ref, ok := p.findRef(typ)
if !ok {
return s
}
s.Description = ref.Description
for k, v := range s.Properties {
if refProp, ok := ref.Properties[k]; ok {
v.Description = refProp.Description
}
}
return s
}
// Use the OpenAPI spec to add enum values for the given type.
func (p *openapiParser) addEnums(typ reflect.Type, s jsonschema.Schema) jsonschema.Schema {
ref, ok := p.findRef(typ)
if !ok {
return s
}
s.Enum = append(s.Enum, ref.Enum...)
for k, v := range s.Properties {
if refProp, ok := ref.Properties[k]; ok {
v.Enum = append(v.Enum, refProp.Enum...)
}
}
return s
}
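
A small sketch of the key construction that findRef performs: the OpenAPI component key is "<package>.<TypeName>", derived from the SDK type via reflection. Using jobs.JobSettings (referenced elsewhere in this change) as the example:

```go
package main

import (
	"fmt"
	"path"
	"reflect"

	"github.com/databricks/databricks-sdk-go/service/jobs"
)

func main() {
	typ := reflect.TypeOf(jobs.JobSettings{})

	// "github.com/databricks/databricks-sdk-go/service/jobs" -> "jobs"
	pkgName := path.Base(typ.PkgPath())

	// The lookup key into the OpenAPI components, e.g. "jobs.JobSettings".
	fmt.Printf("%s.%s\n", pkgName, typ.Name())
}
```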

View File

@ -0,0 +1,3 @@
bundle:
# expected type is 'string'
name: 1234

View File

@ -0,0 +1,4 @@
resources:
jobs:
myjob:
format: INVALID_VALUE

View File

@ -0,0 +1,6 @@
resources:
models:
mymodel:
latest_versions:
- creation_timestamp: 123
status: INVALID_VALUE

View File

@ -0,0 +1,8 @@
resources:
jobs:
outer:
name: outer job
tasks:
- task_key: run job task 1
run_job_task:
job_id: ${invalid.reference}

View File

@ -0,0 +1,5 @@
resources:
models:
mymodel:
latest_versions:
- creation_timestamp: ${invalid.reference}

View File

@ -0,0 +1,9 @@
resources:
jobs:
foo:
name: my job
tasks:
# All tasks need to have a task_key.
- notebook_task:
notebook_path: /Users/abc/notebooks/inner
existing_cluster_id: abcd

View File

@ -0,0 +1,5 @@
resources:
jobs:
myjob:
# unknown fields should cause schema failure.
unknown_field: "value"

View File

@ -0,0 +1,6 @@
resources:
models:
mymodel:
creation_timestamp: 123
description: "my model"
unknown: "value"

View File

@ -0,0 +1 @@
unknown: value

View File

@ -0,0 +1,11 @@
artifacts:
abc:
path: /Workspace/a/b/c
type: wheel
files:
- source: ./x.whl
resources:
jobs:
foo:
name: ${artifacts.abc.type}

View File

@ -0,0 +1,2 @@
bundle:
name: basic

View File

@ -0,0 +1,4 @@
targets:
development:
variables:
myvar: value

View File

@ -0,0 +1,63 @@
bundle:
name: a job
workspace:
host: "https://myworkspace.com"
root_path: /abc
presets:
name_prefix: "[DEV]"
jobs_max_concurrent_runs: 10
variables:
simplevar:
default: true
description: "simplevar description"
complexvar:
default:
key1: value1
key2: value2
key3:
- value3
- value4
description: "complexvar description"
run_as:
service_principal_name: myserviceprincipal
resources:
jobs:
myjob:
name: myjob
continuous:
pause_status: PAUSED
edit_mode: EDITABLE
max_concurrent_runs: 10
description: "my job description"
email_notifications:
no_alert_for_skipped_runs: true
environments:
- environment_key: venv
spec:
dependencies:
- python=3.7
client: "myclient"
format: MULTI_TASK
tags:
foo: bar
bar: baz
tasks:
- task_key: mytask
notebook_task:
notebook_path: ${var.simplevar}
existing_cluster_id: abcd
- task_key: mytask2
for_each_task:
inputs: av
concurrency: 10
task:
task_key: inside_for_each
notebook_task:
notebook_path: ${var.complexvar.key3[0]}
- ${var.complexvar}

View File

@ -0,0 +1,72 @@
bundle:
name: ML
workspace:
host: "https://myworkspace.com"
root_path: /abc
presets:
name_prefix: "[DEV]"
jobs_max_concurrent_runs: 10
variables:
simplevar:
default: "true"
description: "simplevar description"
complexvar:
default:
key1: value1
key2: value2
key3:
- value3
- value4
description: "complexvar description"
resources:
models:
mymodel:
creation_timestamp: 123
description: "my model"
latest_versions:
- creation_timestamp: 123
tags: ${var.complexvar.key1}
status: READY
permissions:
- service_principal_name: myserviceprincipal
level: CAN_MANAGE
experiments:
myexperiment:
artifact_location: /dbfs/myexperiment
last_update_time: ${var.complexvar.key2}
lifecycle_stage: ${var.simplevar}
permissions:
- service_principal_name: myserviceprincipal
level: CAN_MANAGE
model_serving_endpoints:
myendpoint:
config:
served_models:
- model_name: ${resources.models.mymodel.name}
model_version: abc
scale_to_zero_enabled: true
workload_size: Large
name: myendpoint
schemas:
myschema:
catalog_name: mycatalog
name: myschema
registered_models:
myregisteredmodel:
catalog_name: mycatalog
name: myregisteredmodel
schema_name: ${resources.schemas.myschema.name}
grants:
- principal: abcd
privileges:
- SELECT
- INSERT

View File

@ -0,0 +1,54 @@
bundle:
name: a pipeline
workspace:
host: "https://myworkspace.com"
root_path: /abc
presets:
name_prefix: "[DEV]"
jobs_max_concurrent_runs: 10
variables:
simplevar:
default: true
description: "simplevar description"
complexvar:
default:
key1: value1
key2: value2
key3:
- value3
- value4
description: "complexvar description"
artifacts:
mywheel:
path: ./mywheel.whl
type: WHEEL
run_as:
service_principal_name: myserviceprincipal
resources:
jobs:
myjob:
name: myjob
tasks:
- task_key: ${bundle.name} pipeline trigger
pipeline_task:
pipeline_id: ${resources.mypipeline.id}
pipelines:
mypipeline:
name: mypipeline
libraries:
- whl: ./mywheel.whl
      catalog: ${var.complexvar.key2}
development: true
clusters:
- autoscale:
mode: ENHANCED
max_workers: 10
min_workers: 1

View File

@ -0,0 +1,16 @@
bundle:
name: quality_monitor
resources:
quality_monitors:
myqualitymonitor:
inference_log:
granularities:
- a
- b
model_id_col: a
prediction_col: b
timestamp_col: c
problem_type: PROBLEM_TYPE_CLASSIFICATION
assets_dir: /dbfs/mnt/abc
output_schema_name: default

View File

@ -0,0 +1,56 @@
bundle:
name: a run job task
databricks_cli_version: 0.200.0
compute_id: "mycompute"
variables:
simplevar:
default: 5678
description: "simplevar description"
complexvar:
default:
key1: 1234
key2: value2
key3:
- value3
- 9999
description: "complexvar description"
resources:
jobs:
inner:
permissions:
- user_name: user1
level: CAN_MANAGE
name: inner job
tasks:
- task_key: inner notebook task
notebook_task:
notebook_path: /Users/abc/notebooks/inner
existing_cluster_id: abcd
outer:
name: outer job
tasks:
- task_key: run job task 1
run_job_task:
job_id: 1234
- task_key: run job task 2
run_job_task:
job_id: ${var.complexvar.key1}
- task_key: run job task 3
run_job_task:
job_id: ${var.simplevar}
- task_key: run job task 4
run_job_task:
job_id: ${resources.inner.id}
- task_key: run job task 5
run_job_task:
job_id: ${var.complexvar.key3[1]}

View File

@ -0,0 +1,24 @@
bundle:
name: basic
variables:
complexvar:
default:
key1: 1234
key2: value2
key3:
- value3
- 9999
description: complexvar description
resources:
schemas:
myschema:
name: myschema
catalog_name: main
grants:
- ${var.complexvar}
- principal: ${workspace.current_user.me}
privileges:
- ${var.complexvar.key3[0]}
- ${var.complexvar.key2}

View File

@ -51,9 +51,15 @@ func (r *root) Generate(path string) error {
}
func Run(ctx context.Context, schema *tfjson.ProviderSchema, path string) error {
// Generate types for resources.
// Generate types for resources
var resources []*namedBlock
for _, k := range sortKeys(schema.ResourceSchemas) {
// Skipping all plugin framework struct generation.
// TODO: This is a temporary fix, generation should be fixed in the future.
if strings.HasSuffix(k, "_pluginframework") {
continue
}
v := schema.ResourceSchemas[k]
b := &namedBlock{
filePattern: "resource_%s.go",
@ -71,6 +77,12 @@ func Run(ctx context.Context, schema *tfjson.ProviderSchema, path string) error
// Generate types for data sources.
var dataSources []*namedBlock
for _, k := range sortKeys(schema.DataSourceSchemas) {
// Skipping all plugin framework struct generation.
// TODO: This is a temporary fix, generation should be fixed in the future.
if strings.HasSuffix(k, "_pluginframework") {
continue
}
v := schema.DataSourceSchemas[k]
b := &namedBlock{
filePattern: "data_source_%s.go",

View File

@ -1,3 +1,3 @@
package schema
const ProviderVersion = "1.50.0"
const ProviderVersion = "1.52.0"

View File

@ -2,8 +2,16 @@
package schema
type DataSourceClustersFilterBy struct {
ClusterSources []string `json:"cluster_sources,omitempty"`
ClusterStates []string `json:"cluster_states,omitempty"`
IsPinned bool `json:"is_pinned,omitempty"`
PolicyId string `json:"policy_id,omitempty"`
}
type DataSourceClusters struct {
ClusterNameContains string `json:"cluster_name_contains,omitempty"`
Id string `json:"id,omitempty"`
Ids []string `json:"ids,omitempty"`
FilterBy *DataSourceClustersFilterBy `json:"filter_by,omitempty"`
}

View File

@ -19,6 +19,7 @@ type DataSourceExternalLocationExternalLocationInfo struct {
CreatedBy string `json:"created_by,omitempty"`
CredentialId string `json:"credential_id,omitempty"`
CredentialName string `json:"credential_name,omitempty"`
Fallback bool `json:"fallback,omitempty"`
IsolationMode string `json:"isolation_mode,omitempty"`
MetastoreId string `json:"metastore_id,omitempty"`
Name string `json:"name,omitempty"`

View File

@ -18,12 +18,14 @@ type DataSourceShareObject struct {
AddedBy string `json:"added_by,omitempty"`
CdfEnabled bool `json:"cdf_enabled,omitempty"`
Comment string `json:"comment,omitempty"`
Content string `json:"content,omitempty"`
DataObjectType string `json:"data_object_type"`
HistoryDataSharingStatus string `json:"history_data_sharing_status,omitempty"`
Name string `json:"name"`
SharedAs string `json:"shared_as,omitempty"`
StartVersion int `json:"start_version,omitempty"`
Status string `json:"status,omitempty"`
StringSharedAs string `json:"string_shared_as,omitempty"`
Partition []DataSourceShareObjectPartition `json:"partition,omitempty"`
}

View File

@ -2,20 +2,14 @@
package schema
type ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspaceEnablementDetails struct {
ForcedForComplianceMode bool `json:"forced_for_compliance_mode,omitempty"`
UnavailableForDisabledEntitlement bool `json:"unavailable_for_disabled_entitlement,omitempty"`
UnavailableForNonEnterpriseTier bool `json:"unavailable_for_non_enterprise_tier,omitempty"`
}
type ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspaceMaintenanceWindowWeekDayBasedScheduleWindowStartTime struct {
Hours int `json:"hours,omitempty"`
Minutes int `json:"minutes,omitempty"`
Hours int `json:"hours"`
Minutes int `json:"minutes"`
}
type ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspaceMaintenanceWindowWeekDayBasedSchedule struct {
DayOfWeek string `json:"day_of_week,omitempty"`
Frequency string `json:"frequency,omitempty"`
DayOfWeek string `json:"day_of_week"`
Frequency string `json:"frequency"`
WindowStartTime *ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspaceMaintenanceWindowWeekDayBasedScheduleWindowStartTime `json:"window_start_time,omitempty"`
}
@ -25,9 +19,9 @@ type ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspa
type ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspace struct {
CanToggle bool `json:"can_toggle,omitempty"`
Enabled bool `json:"enabled,omitempty"`
Enabled bool `json:"enabled"`
EnablementDetails []any `json:"enablement_details,omitempty"`
RestartEvenIfNoUpdatesAvailable bool `json:"restart_even_if_no_updates_available,omitempty"`
EnablementDetails *ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspaceEnablementDetails `json:"enablement_details,omitempty"`
MaintenanceWindow *ResourceAutomaticClusterUpdateWorkspaceSettingAutomaticClusterUpdateWorkspaceMaintenanceWindow `json:"maintenance_window,omitempty"`
}

View File

@ -176,6 +176,7 @@ type ResourceCluster struct {
IdempotencyToken string `json:"idempotency_token,omitempty"`
InstancePoolId string `json:"instance_pool_id,omitempty"`
IsPinned bool `json:"is_pinned,omitempty"`
NoWait bool `json:"no_wait,omitempty"`
NodeTypeId string `json:"node_type_id,omitempty"`
NumWorkers int `json:"num_workers,omitempty"`
PolicyId string `json:"policy_id,omitempty"`

View File

@ -3,8 +3,8 @@
package schema
type ResourceComplianceSecurityProfileWorkspaceSettingComplianceSecurityProfileWorkspace struct {
ComplianceStandards []string `json:"compliance_standards,omitempty"`
IsEnabled bool `json:"is_enabled,omitempty"`
ComplianceStandards []string `json:"compliance_standards"`
IsEnabled bool `json:"is_enabled"`
}
type ResourceComplianceSecurityProfileWorkspaceSetting struct {

View File

@ -3,7 +3,7 @@
package schema
type ResourceEnhancedSecurityMonitoringWorkspaceSettingEnhancedSecurityMonitoringWorkspace struct {
IsEnabled bool `json:"is_enabled,omitempty"`
IsEnabled bool `json:"is_enabled"`
}
type ResourceEnhancedSecurityMonitoringWorkspaceSetting struct {

View File

@ -97,11 +97,13 @@ type ResourceModelServingConfigServedEntities struct {
type ResourceModelServingConfigServedModels struct {
EnvironmentVars map[string]string `json:"environment_vars,omitempty"`
InstanceProfileArn string `json:"instance_profile_arn,omitempty"`
MaxProvisionedThroughput int `json:"max_provisioned_throughput,omitempty"`
MinProvisionedThroughput int `json:"min_provisioned_throughput,omitempty"`
ModelName string `json:"model_name"`
ModelVersion string `json:"model_version"`
Name string `json:"name,omitempty"`
ScaleToZeroEnabled bool `json:"scale_to_zero_enabled,omitempty"`
WorkloadSize string `json:"workload_size"`
WorkloadSize string `json:"workload_size,omitempty"`
WorkloadType string `json:"workload_type,omitempty"`
}

View File

@ -18,20 +18,27 @@ type ResourceShareObject struct {
AddedBy string `json:"added_by,omitempty"`
CdfEnabled bool `json:"cdf_enabled,omitempty"`
Comment string `json:"comment,omitempty"`
Content string `json:"content,omitempty"`
DataObjectType string `json:"data_object_type"`
HistoryDataSharingStatus string `json:"history_data_sharing_status,omitempty"`
Name string `json:"name"`
SharedAs string `json:"shared_as,omitempty"`
StartVersion int `json:"start_version,omitempty"`
Status string `json:"status,omitempty"`
StringSharedAs string `json:"string_shared_as,omitempty"`
Partition []ResourceShareObjectPartition `json:"partition,omitempty"`
}
type ResourceShare struct {
Comment string `json:"comment,omitempty"`
CreatedAt int `json:"created_at,omitempty"`
CreatedBy string `json:"created_by,omitempty"`
Id string `json:"id,omitempty"`
Name string `json:"name"`
Owner string `json:"owner,omitempty"`
StorageLocation string `json:"storage_location,omitempty"`
StorageRoot string `json:"storage_root,omitempty"`
UpdatedAt int `json:"updated_at,omitempty"`
UpdatedBy string `json:"updated_by,omitempty"`
Object []ResourceShareObject `json:"object,omitempty"`
}

View File

@ -15,6 +15,7 @@ type ResourceSqlTable struct {
ClusterKeys []string `json:"cluster_keys,omitempty"`
Comment string `json:"comment,omitempty"`
DataSourceFormat string `json:"data_source_format,omitempty"`
EffectiveProperties map[string]string `json:"effective_properties,omitempty"`
Id string `json:"id,omitempty"`
Name string `json:"name"`
Options map[string]string `json:"options,omitempty"`

View File

@ -21,7 +21,7 @@ type Root struct {
const ProviderHost = "registry.terraform.io"
const ProviderSource = "databricks/databricks"
const ProviderVersion = "1.50.0"
const ProviderVersion = "1.52.0"
func NewRoot() *Root {
return &Root{

View File

@ -10,6 +10,7 @@ import (
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/bundle/internal/bundletest"
"github.com/databricks/cli/internal/testutil"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/databricks-sdk-go/service/compute"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/stretchr/testify/require"
@ -61,7 +62,7 @@ func TestGlobReferencesExpandedForTaskLibraries(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, ExpandGlobReferences())
require.Empty(t, diags)
@ -146,7 +147,7 @@ func TestGlobReferencesExpandedForForeachTaskLibraries(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, ExpandGlobReferences())
require.Empty(t, diags)
@ -221,7 +222,7 @@ func TestGlobReferencesExpandedForEnvironmentsDeps(t *testing.T) {
},
}
bundletest.SetLocation(b, ".", filepath.Join(dir, "resource.yml"))
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, ExpandGlobReferences())
require.Empty(t, diags)

View File

@ -18,6 +18,7 @@ import (
"github.com/databricks/cli/bundle/python"
"github.com/databricks/cli/bundle/scripts"
"github.com/databricks/cli/libs/cmdio"
"github.com/databricks/cli/libs/sync"
terraformlib "github.com/databricks/cli/libs/terraform"
tfjson "github.com/hashicorp/terraform-json"
)
@ -128,7 +129,7 @@ properties such as the 'catalog' or 'storage' are changed:`
}
// The deploy phase deploys artifacts and resources.
func Deploy() bundle.Mutator {
func Deploy(outputHandler sync.OutputHandler) bundle.Mutator {
// Core mutators that CRUD resources and modify deployment state. These
// mutators need informed consent if they are potentially destructive.
deployCore := bundle.Defer(
@ -157,7 +158,7 @@ func Deploy() bundle.Mutator {
libraries.ExpandGlobReferences(),
libraries.Upload(),
python.TransformWheelTask(),
files.Upload(),
files.Upload(outputHandler),
deploy.StateUpdate(),
deploy.StatePush(),
permissions.ApplyWorkspaceRootPermissions(),

View File

@ -1,18 +0,0 @@
### Overview
`docs/bundle_descriptions.json` contains both autogenerated as well as manually written
descriptions for the json schema. Specifically
1. `resources` : almost all descriptions are autogenerated from the OpenAPI spec
2. `targets` : almost all descriptions are copied over from root level entities (eg: `bundle`, `artifacts`)
3. `bundle` : manually edited
4. `include` : manually edited
5. `workspace` : manually edited
6. `artifacts` : manually edited
These descriptions are rendered in the inline documentation in an IDE
### SOP: Add schema descriptions for new fields in bundle config
Manually edit bundle_descriptions.json to add your descriptions. Note that the
descriptions in the `resources` block are generated from the OpenAPI spec, and thus
any changes there will be overwritten.

View File

@ -1,109 +0,0 @@
package schema
import (
_ "embed"
"encoding/json"
"fmt"
"os"
"reflect"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/libs/jsonschema"
)
// A subset of Schema struct
type Docs struct {
Description string `json:"description"`
Properties map[string]*Docs `json:"properties,omitempty"`
Items *Docs `json:"items,omitempty"`
AdditionalProperties *Docs `json:"additionalproperties,omitempty"`
}
//go:embed docs/bundle_descriptions.json
var bundleDocs []byte
func (docs *Docs) refreshTargetsDocs() error {
targetsDocs, ok := docs.Properties["targets"]
if !ok || targetsDocs.AdditionalProperties == nil ||
targetsDocs.AdditionalProperties.Properties == nil {
return fmt.Errorf("invalid targets descriptions")
}
targetProperties := targetsDocs.AdditionalProperties.Properties
propertiesToCopy := []string{"artifacts", "bundle", "resources", "workspace"}
for _, p := range propertiesToCopy {
targetProperties[p] = docs.Properties[p]
}
return nil
}
func LoadBundleDescriptions() (*Docs, error) {
embedded := Docs{}
err := json.Unmarshal(bundleDocs, &embedded)
return &embedded, err
}
func UpdateBundleDescriptions(openapiSpecPath string) (*Docs, error) {
embedded, err := LoadBundleDescriptions()
if err != nil {
return nil, err
}
// Generate schema from the embedded descriptions, and convert it back to docs.
// This creates empty descriptions for any properties that were missing in the
// embedded descriptions.
schema, err := New(reflect.TypeOf(config.Root{}), embedded)
if err != nil {
return nil, err
}
docs := schemaToDocs(schema)
// Load the Databricks OpenAPI spec
openapiSpec, err := os.ReadFile(openapiSpecPath)
if err != nil {
return nil, err
}
spec := &Specification{}
err = json.Unmarshal(openapiSpec, spec)
if err != nil {
return nil, err
}
openapiReader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
// Generate descriptions for the "resources" field
resourcesDocs, err := openapiReader.ResourcesDocs()
if err != nil {
return nil, err
}
resourceSchema, err := New(reflect.TypeOf(config.Resources{}), resourcesDocs)
if err != nil {
return nil, err
}
docs.Properties["resources"] = schemaToDocs(resourceSchema)
docs.refreshTargetsDocs()
return docs, nil
}
// *Docs are a subset of *Schema, this function selects that subset
func schemaToDocs(jsonSchema *jsonschema.Schema) *Docs {
// terminate recursion if schema is nil
if jsonSchema == nil {
return nil
}
docs := &Docs{
Description: jsonSchema.Description,
}
if len(jsonSchema.Properties) > 0 {
docs.Properties = make(map[string]*Docs)
}
for k, v := range jsonSchema.Properties {
docs.Properties[k] = schemaToDocs(v)
}
docs.Items = schemaToDocs(jsonSchema.Items)
if additionalProperties, ok := jsonSchema.AdditionalProperties.(*jsonschema.Schema); ok {
docs.AdditionalProperties = schemaToDocs(additionalProperties)
}
return docs
}

File diff suppressed because it is too large Load Diff

View File

@ -1,62 +0,0 @@
package schema
import (
"encoding/json"
"testing"
"github.com/databricks/cli/libs/jsonschema"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestSchemaToDocs(t *testing.T) {
jsonSchema := &jsonschema.Schema{
Type: "object",
Description: "root doc",
Properties: map[string]*jsonschema.Schema{
"foo": {Type: "number", Description: "foo doc"},
"bar": {Type: "string"},
"octave": {
Type: "object",
AdditionalProperties: &jsonschema.Schema{Type: "number"},
Description: "octave docs",
},
"scales": {
Type: "object",
Description: "scale docs",
Items: &jsonschema.Schema{Type: "string"},
},
},
}
docs := schemaToDocs(jsonSchema)
docsJson, err := json.MarshalIndent(docs, " ", " ")
require.NoError(t, err)
expected :=
`{
"description": "root doc",
"properties": {
"bar": {
"description": ""
},
"foo": {
"description": "foo doc"
},
"octave": {
"description": "octave docs",
"additionalproperties": {
"description": ""
}
},
"scales": {
"description": "scale docs",
"items": {
"description": ""
}
}
}
}`
t.Log("[DEBUG] actual: ", string(docsJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(docsJson))
}

6
bundle/schema/embed.go Normal file
View File

@ -0,0 +1,6 @@
package schema
import _ "embed"
//go:embed jsonschema.json
var Bytes []byte

View File

@ -0,0 +1,71 @@
package schema_test
import (
"encoding/json"
"testing"
"github.com/databricks/cli/bundle/schema"
"github.com/databricks/cli/libs/jsonschema"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func walk(defs map[string]any, p ...string) jsonschema.Schema {
v, ok := defs[p[0]]
if !ok {
panic("not found: " + p[0])
}
if len(p) == 1 {
b, err := json.Marshal(v)
if err != nil {
panic(err)
}
res := jsonschema.Schema{}
err = json.Unmarshal(b, &res)
if err != nil {
panic(err)
}
return res
}
return walk(v.(map[string]any), p[1:]...)
}
func TestJsonSchema(t *testing.T) {
s := jsonschema.Schema{}
err := json.Unmarshal(schema.Bytes, &s)
require.NoError(t, err)
// Assert job fields have their descriptions loaded.
resourceJob := walk(s.Definitions, "github.com", "databricks", "cli", "bundle", "config", "resources.Job")
fields := []string{"name", "continuous", "deployment", "tasks", "trigger"}
for _, field := range fields {
assert.NotEmpty(t, resourceJob.AnyOf[0].Properties[field].Description)
}
// Assert descriptions were also loaded for a job task definition.
jobTask := walk(s.Definitions, "github.com", "databricks", "databricks-sdk-go", "service", "jobs.Task")
fields = []string{"notebook_task", "spark_jar_task", "spark_python_task", "spark_submit_task", "description", "depends_on", "environment_key", "for_each_task", "existing_cluster_id"}
for _, field := range fields {
assert.NotEmpty(t, jobTask.AnyOf[0].Properties[field].Description)
}
// Assert descriptions are loaded for pipelines
pipeline := walk(s.Definitions, "github.com", "databricks", "cli", "bundle", "config", "resources.Pipeline")
fields = []string{"name", "catalog", "clusters", "channel", "continuous", "deployment", "development"}
for _, field := range fields {
assert.NotEmpty(t, pipeline.AnyOf[0].Properties[field].Description)
}
// Assert enum values are loaded
schedule := walk(s.Definitions, "github.com", "databricks", "databricks-sdk-go", "service", "catalog.MonitorCronSchedule")
assert.Contains(t, schedule.AnyOf[0].Properties["pause_status"].Enum, "PAUSED")
assert.Contains(t, schedule.AnyOf[0].Properties["pause_status"].Enum, "UNPAUSED")
providers := walk(s.Definitions, "github.com", "databricks", "databricks-sdk-go", "service", "jobs.GitProvider")
assert.Contains(t, providers.Enum, "gitHub")
assert.Contains(t, providers.Enum, "bitbucketCloud")
assert.Contains(t, providers.Enum, "gitHubEnterprise")
assert.Contains(t, providers.Enum, "bitbucketServer")
}

5561
bundle/schema/jsonschema.json generated Normal file

File diff suppressed because it is too large Load Diff

View File

@ -1,293 +0,0 @@
package schema
import (
"encoding/json"
"fmt"
"strings"
"github.com/databricks/cli/libs/jsonschema"
)
type OpenapiReader struct {
// OpenAPI spec to read schemas from.
OpenapiSpec *Specification
// In-memory cache of schemas read from the OpenAPI spec.
memo map[string]jsonschema.Schema
}
const SchemaPathPrefix = "#/components/schemas/"
// Read a schema directly from the OpenAPI spec.
func (reader *OpenapiReader) readOpenapiSchema(path string) (jsonschema.Schema, error) {
schemaKey := strings.TrimPrefix(path, SchemaPathPrefix)
// return early if we already have a computed schema
memoSchema, ok := reader.memo[schemaKey]
if ok {
return memoSchema, nil
}
// check path is present in openapi spec
openapiSchema, ok := reader.OpenapiSpec.Components.Schemas[schemaKey]
if !ok {
return jsonschema.Schema{}, fmt.Errorf("schema with path %s not found in openapi spec", path)
}
// convert openapi schema to the native schema struct
bytes, err := json.Marshal(*openapiSchema)
if err != nil {
return jsonschema.Schema{}, err
}
jsonSchema := jsonschema.Schema{}
err = json.Unmarshal(bytes, &jsonSchema)
if err != nil {
return jsonschema.Schema{}, err
}
// A hack to convert a map[string]interface{} to *Schema
// We rely on the type of AdditionalProperties in downstream functions
// to do reference interpolation
_, ok = jsonSchema.AdditionalProperties.(map[string]interface{})
if ok {
b, err := json.Marshal(jsonSchema.AdditionalProperties)
if err != nil {
return jsonschema.Schema{}, err
}
additionalProperties := &jsonschema.Schema{}
err = json.Unmarshal(b, additionalProperties)
if err != nil {
return jsonschema.Schema{}, err
}
jsonSchema.AdditionalProperties = additionalProperties
}
// store read schema into memo
reader.memo[schemaKey] = jsonSchema
return jsonSchema, nil
}
// Resolve all nested "$ref" references in the schema. This function unrolls a single
// level of "$ref" in the schema and calls into traverseSchema to resolve nested references.
// Thus this function and traverseSchema are mutually recursive.
//
// This function is safe against reference loops. If a reference loop is detected, an error
// is returned.
func (reader *OpenapiReader) safeResolveRefs(root *jsonschema.Schema, tracker *tracker) (*jsonschema.Schema, error) {
if root.Reference == nil {
return reader.traverseSchema(root, tracker)
}
key := *root.Reference
// HACK to unblock CLI release (13th Feb 2024). This is temporary until proper
// support for recursive types is added to the docs generator. PR: https://github.com/databricks/cli/pull/1204
if strings.Contains(key, "ForEachTask") {
return root, nil
}
if tracker.hasCycle(key) {
// self reference loops can be supported, however the logic is non-trivial because
// cross reference loops are not allowed (see: http://json-schema.org/understanding-json-schema/structuring.html#recursion)
return nil, fmt.Errorf("references loop detected")
}
ref := *root.Reference
description := root.Description
tracker.push(ref, ref)
// Mark reference nil, so we do not traverse this again. This is tracked
// in the memo
root.Reference = nil
// unroll one level of reference.
selfRef, err := reader.readOpenapiSchema(ref)
if err != nil {
return nil, err
}
root = &selfRef
root.Description = description
// traverse again to find new references
root, err = reader.traverseSchema(root, tracker)
if err != nil {
return nil, err
}
tracker.pop(ref)
return root, err
}
// Traverse the nested properties of the schema to resolve "$ref" references. This function
// and safeResolveRefs are mutually recursive.
func (reader *OpenapiReader) traverseSchema(root *jsonschema.Schema, tracker *tracker) (*jsonschema.Schema, error) {
// case primitive (or invalid)
if root.Type != jsonschema.ObjectType && root.Type != jsonschema.ArrayType {
return root, nil
}
// only root references are resolved
if root.Reference != nil {
return reader.safeResolveRefs(root, tracker)
}
// case struct
if len(root.Properties) > 0 {
for k, v := range root.Properties {
childSchema, err := reader.safeResolveRefs(v, tracker)
if err != nil {
return nil, err
}
root.Properties[k] = childSchema
}
}
// case array
if root.Items != nil {
itemsSchema, err := reader.safeResolveRefs(root.Items, tracker)
if err != nil {
return nil, err
}
root.Items = itemsSchema
}
// case map
additionalProperties, ok := root.AdditionalProperties.(*jsonschema.Schema)
if ok && additionalProperties != nil {
valueSchema, err := reader.safeResolveRefs(additionalProperties, tracker)
if err != nil {
return nil, err
}
root.AdditionalProperties = valueSchema
}
return root, nil
}
func (reader *OpenapiReader) readResolvedSchema(path string) (*jsonschema.Schema, error) {
root, err := reader.readOpenapiSchema(path)
if err != nil {
return nil, err
}
tracker := newTracker()
tracker.push(path, path)
resolvedRoot, err := reader.safeResolveRefs(&root, tracker)
if err != nil {
return nil, tracker.errWithTrace(err.Error(), "")
}
return resolvedRoot, nil
}
func (reader *OpenapiReader) jobsDocs() (*Docs, error) {
jobSettingsSchema, err := reader.readResolvedSchema(SchemaPathPrefix + "jobs.JobSettings")
if err != nil {
return nil, err
}
jobDocs := schemaToDocs(jobSettingsSchema)
// TODO: add description for id if needed.
// Tracked in https://github.com/databricks/cli/issues/242
jobsDocs := &Docs{
Description: "List of Databricks jobs",
AdditionalProperties: jobDocs,
}
return jobsDocs, nil
}
func (reader *OpenapiReader) pipelinesDocs() (*Docs, error) {
pipelineSpecSchema, err := reader.readResolvedSchema(SchemaPathPrefix + "pipelines.PipelineSpec")
if err != nil {
return nil, err
}
pipelineDocs := schemaToDocs(pipelineSpecSchema)
// TODO: Two fields in resources.Pipeline have the json tag id. Clarify the
// semantics and then add a description if needed. (https://github.com/databricks/cli/issues/242)
pipelinesDocs := &Docs{
Description: "List of DLT pipelines",
AdditionalProperties: pipelineDocs,
}
return pipelinesDocs, nil
}
func (reader *OpenapiReader) experimentsDocs() (*Docs, error) {
experimentSpecSchema, err := reader.readResolvedSchema(SchemaPathPrefix + "ml.Experiment")
if err != nil {
return nil, err
}
experimentDocs := schemaToDocs(experimentSpecSchema)
experimentsDocs := &Docs{
Description: "List of MLflow experiments",
AdditionalProperties: experimentDocs,
}
return experimentsDocs, nil
}
func (reader *OpenapiReader) modelsDocs() (*Docs, error) {
modelSpecSchema, err := reader.readResolvedSchema(SchemaPathPrefix + "ml.Model")
if err != nil {
return nil, err
}
modelDocs := schemaToDocs(modelSpecSchema)
modelsDocs := &Docs{
Description: "List of MLflow models",
AdditionalProperties: modelDocs,
}
return modelsDocs, nil
}
func (reader *OpenapiReader) modelServingEndpointsDocs() (*Docs, error) {
modelServingEndpointsSpecSchema, err := reader.readResolvedSchema(SchemaPathPrefix + "serving.CreateServingEndpoint")
if err != nil {
return nil, err
}
modelServingEndpointsDocs := schemaToDocs(modelServingEndpointsSpecSchema)
modelServingEndpointsAllDocs := &Docs{
Description: "List of Model Serving Endpoints",
AdditionalProperties: modelServingEndpointsDocs,
}
return modelServingEndpointsAllDocs, nil
}
func (reader *OpenapiReader) registeredModelDocs() (*Docs, error) {
registeredModelsSpecSchema, err := reader.readResolvedSchema(SchemaPathPrefix + "catalog.CreateRegisteredModelRequest")
if err != nil {
return nil, err
}
registeredModelsDocs := schemaToDocs(registeredModelsSpecSchema)
registeredModelsAllDocs := &Docs{
Description: "List of Registered Models",
AdditionalProperties: registeredModelsDocs,
}
return registeredModelsAllDocs, nil
}
func (reader *OpenapiReader) ResourcesDocs() (*Docs, error) {
jobsDocs, err := reader.jobsDocs()
if err != nil {
return nil, err
}
pipelinesDocs, err := reader.pipelinesDocs()
if err != nil {
return nil, err
}
experimentsDocs, err := reader.experimentsDocs()
if err != nil {
return nil, err
}
modelsDocs, err := reader.modelsDocs()
if err != nil {
return nil, err
}
modelServingEndpointsDocs, err := reader.modelServingEndpointsDocs()
if err != nil {
return nil, err
}
registeredModelsDocs, err := reader.registeredModelDocs()
if err != nil {
return nil, err
}
return &Docs{
Description: "Collection of Databricks resources to deploy.",
Properties: map[string]*Docs{
"jobs": jobsDocs,
"pipelines": pipelinesDocs,
"experiments": experimentsDocs,
"models": modelsDocs,
"model_serving_endpoints": modelServingEndpointsDocs,
"registered_models": registeredModelsDocs,
},
}, nil
}
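
Pulling the pieces of this (now deleted) reader together, the intended call flow was: parse the raw OpenAPI spec, construct the reader with an empty memo, and call ResourcesDocs. The helper below is a same-package sketch to make that flow explicit; resourcesDocsFromSpec and specBytes are illustrative names and are not part of this diff.

```go
package schema

import (
	"encoding/json"

	"github.com/databricks/cli/libs/jsonschema"
)

// resourcesDocsFromSpec is a hypothetical helper showing how the reader above
// was driven: unmarshal the OpenAPI spec, wrap it in an OpenapiReader with an
// empty memo, and resolve docs for every supported resource type.
func resourcesDocsFromSpec(specBytes []byte) (*Docs, error) {
	spec := &Specification{}
	if err := json.Unmarshal(specBytes, spec); err != nil {
		return nil, err
	}
	reader := &OpenapiReader{
		OpenapiSpec: spec,
		memo:        make(map[string]jsonschema.Schema),
	}
	// ResourcesDocs resolves "$ref" chains (with cycle detection via the
	// tracker) and returns Docs keyed by "jobs", "pipelines", "experiments",
	// "models", "model_serving_endpoints" and "registered_models".
	return reader.ResourcesDocs()
}
```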

View File

@@ -1,493 +0,0 @@
package schema
import (
"encoding/json"
"testing"
"github.com/databricks/cli/libs/jsonschema"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestReadSchemaForObject(t *testing.T) {
specString := `
{
"components": {
"schemas": {
"foo": {
"type": "number"
},
"fruits": {
"type": "object",
"description": "fruits that are cool",
"properties": {
"guava": {
"type": "string",
"description": "a guava for my schema"
},
"mango": {
"type": "object",
"description": "a mango for my schema",
"$ref": "#/components/schemas/mango"
}
}
},
"mango": {
"type": "object",
"properties": {
"foo": {
"$ref": "#/components/schemas/foo"
}
}
}
}
}
}
`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
fruitsSchema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(fruitsSchema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"description": "fruits that are cool",
"properties": {
"guava": {
"type": "string",
"description": "a guava for my schema"
},
"mango": {
"type": "object",
"description": "a mango for my schema",
"properties": {
"foo": {
"type": "number"
}
}
}
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}
func TestReadSchemaForArray(t *testing.T) {
specString := `
{
"components": {
"schemas": {
"fruits": {
"type": "object",
"description": "fruits that are cool",
"items": {
"description": "some papayas, because papayas are fruits too",
"$ref": "#/components/schemas/papaya"
}
},
"papaya": {
"type": "number"
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
fruitsSchema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(fruitsSchema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"description": "fruits that are cool",
"items": {
"type": "number",
"description": "some papayas, because papayas are fruits too"
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}
func TestReadSchemaForMap(t *testing.T) {
specString := `{
"components": {
"schemas": {
"fruits": {
"type": "object",
"description": "fruits that are meh",
"additionalProperties": {
"description": "watermelons. watermelons.",
"$ref": "#/components/schemas/watermelon"
}
},
"watermelon": {
"type": "number"
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
fruitsSchema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(fruitsSchema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"description": "fruits that are meh",
"additionalProperties": {
"type": "number",
"description": "watermelons. watermelons."
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}
func TestRootReferenceIsResolved(t *testing.T) {
specString := `{
"components": {
"schemas": {
"foo": {
"type": "object",
"description": "this description is ignored",
"properties": {
"abc": {
"type": "string"
}
}
},
"fruits": {
"type": "object",
"description": "foo fighters fighting fruits",
"$ref": "#/components/schemas/foo"
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
schema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(schema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"description": "foo fighters fighting fruits",
"properties": {
"abc": {
"type": "string"
}
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}
func TestSelfReferenceLoopErrors(t *testing.T) {
specString := `{
"components": {
"schemas": {
"foo": {
"type": "object",
"description": "this description is ignored",
"properties": {
"bar": {
"type": "object",
"$ref": "#/components/schemas/foo"
}
}
},
"fruits": {
"type": "object",
"description": "foo fighters fighting fruits",
"$ref": "#/components/schemas/foo"
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
_, err = reader.readResolvedSchema("#/components/schemas/fruits")
assert.ErrorContains(t, err, "references loop detected. traversal trace: -> #/components/schemas/fruits -> #/components/schemas/foo")
}
func TestCrossReferenceLoopErrors(t *testing.T) {
specString := `{
"components": {
"schemas": {
"foo": {
"type": "object",
"description": "this description is ignored",
"properties": {
"bar": {
"type": "object",
"$ref": "#/components/schemas/fruits"
}
}
},
"fruits": {
"type": "object",
"description": "foo fighters fighting fruits",
"$ref": "#/components/schemas/foo"
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
_, err = reader.readResolvedSchema("#/components/schemas/fruits")
assert.ErrorContains(t, err, "references loop detected. traversal trace: -> #/components/schemas/fruits -> #/components/schemas/foo")
}
func TestReferenceResolutionForMapInObject(t *testing.T) {
specString := `
{
"components": {
"schemas": {
"foo": {
"type": "number"
},
"fruits": {
"type": "object",
"description": "fruits that are cool",
"properties": {
"guava": {
"type": "string",
"description": "a guava for my schema"
},
"mangos": {
"type": "object",
"description": "multiple mangos",
"$ref": "#/components/schemas/mango"
}
}
},
"mango": {
"type": "object",
"additionalProperties": {
"description": "a single mango",
"$ref": "#/components/schemas/foo"
}
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
fruitsSchema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(fruitsSchema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"description": "fruits that are cool",
"properties": {
"guava": {
"type": "string",
"description": "a guava for my schema"
},
"mangos": {
"type": "object",
"description": "multiple mangos",
"additionalProperties": {
"type": "number",
"description": "a single mango"
}
}
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}
func TestReferenceResolutionForArrayInObject(t *testing.T) {
specString := `{
"components": {
"schemas": {
"foo": {
"type": "number"
},
"fruits": {
"type": "object",
"description": "fruits that are cool",
"properties": {
"guava": {
"type": "string",
"description": "a guava for my schema"
},
"mangos": {
"type": "object",
"description": "multiple mangos",
"$ref": "#/components/schemas/mango"
}
}
},
"mango": {
"type": "object",
"items": {
"description": "a single mango",
"$ref": "#/components/schemas/foo"
}
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
fruitsSchema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(fruitsSchema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"description": "fruits that are cool",
"properties": {
"guava": {
"type": "string",
"description": "a guava for my schema"
},
"mangos": {
"type": "object",
"description": "multiple mangos",
"items": {
"type": "number",
"description": "a single mango"
}
}
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}
func TestReferenceResolutionDoesNotOverwriteDescriptions(t *testing.T) {
specString := `{
"components": {
"schemas": {
"foo": {
"type": "number"
},
"fruits": {
"type": "object",
"properties": {
"guava": {
"type": "object",
"description": "Guava is a fruit",
"$ref": "#/components/schemas/foo"
},
"mango": {
"type": "object",
"description": "What is a mango?",
"$ref": "#/components/schemas/foo"
}
}
}
}
}
}`
spec := &Specification{}
reader := &OpenapiReader{
OpenapiSpec: spec,
memo: make(map[string]jsonschema.Schema),
}
err := json.Unmarshal([]byte(specString), spec)
require.NoError(t, err)
fruitsSchema, err := reader.readResolvedSchema("#/components/schemas/fruits")
require.NoError(t, err)
fruitsSchemaJson, err := json.MarshalIndent(fruitsSchema, " ", " ")
require.NoError(t, err)
expected := `{
"type": "object",
"properties": {
"guava": {
"type": "number",
"description": "Guava is a fruit"
},
"mango": {
"type": "number",
"description": "What is a mango?"
}
}
}`
t.Log("[DEBUG] actual: ", string(fruitsSchemaJson))
t.Log("[DEBUG] expected: ", expected)
assert.Equal(t, expected, string(fruitsSchemaJson))
}

View File

@@ -1,287 +0,0 @@
package schema
import (
"container/list"
"fmt"
"reflect"
"strings"
"github.com/databricks/cli/libs/dyn/dynvar"
"github.com/databricks/cli/libs/jsonschema"
)
// Fields tagged "readonly" should not be emitted in the schema as they are
// computed at runtime, and should not be assigned a value by the bundle author.
const readonlyTag = "readonly"
// Annotation for internal bundle fields that should not be exposed to customers.
// Fields can be tagged as "internal" to remove them from the generated schema.
const internalTag = "internal"
// Annotation for bundle fields that have been deprecated.
// Fields tagged as "deprecated" are removed/omitted from the generated schema.
const deprecatedTag = "deprecated"
// This function translates golang types into json schema. Here is the mapping
// between golang types and json schema types:
//
// - GolangType -> JSON schema type
//
// - bool -> boolean
//
// - string -> string
//
// - int (all variants) -> number
//
// - float (all variants) -> number
//
// - map[string]MyStruct -> { type: object, additionalProperties: {}}
// for details visit: https://json-schema.org/understanding-json-schema/reference/object.html#additional-properties
//
// - []MyStruct -> {type: array, items: {}}
// for details visit: https://json-schema.org/understanding-json-schema/reference/array.html#items
//
// - MyStruct -> {type: object, properties: {}, additionalProperties: false}
// for details visit: https://json-schema.org/understanding-json-schema/reference/object.html#properties
func New(golangType reflect.Type, docs *Docs) (*jsonschema.Schema, error) {
tracker := newTracker()
schema, err := safeToSchema(golangType, docs, "", tracker)
if err != nil {
return nil, tracker.errWithTrace(err.Error(), "root")
}
return schema, nil
}
func jsonSchemaType(golangType reflect.Type) (jsonschema.Type, error) {
switch golangType.Kind() {
case reflect.Bool:
return jsonschema.BooleanType, nil
case reflect.String:
return jsonschema.StringType, nil
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
reflect.Float32, reflect.Float64:
return jsonschema.NumberType, nil
case reflect.Struct:
return jsonschema.ObjectType, nil
case reflect.Map:
if golangType.Key().Kind() != reflect.String {
return jsonschema.InvalidType, fmt.Errorf("only string map keys are valid. key type: %v", golangType.Key().Kind())
}
return jsonschema.ObjectType, nil
case reflect.Array, reflect.Slice:
return jsonschema.ArrayType, nil
default:
return jsonschema.InvalidType, fmt.Errorf("unhandled golang type: %s", golangType)
}
}
// A wrapper over the toSchema function to:
// 1. Detect cycles in the bundle config struct.
// 2. Update tracker
//
// params:
//
// - golangType: Golang type to generate json schema for
//
// - docs: Contains documentation to be injected into the generated json schema
//
// - traceId: An identifier for the current type, used to trace the recursive traversal.
// Its value is the first JSON tag for struct fields, and "" in other cases
// (arrays, maps, or fields without JSON tags)
//
// - tracker: Keeps track of types / traceIds seen during recursive traversal
func safeToSchema(golangType reflect.Type, docs *Docs, traceId string, tracker *tracker) (*jsonschema.Schema, error) {
// HACK to unblock CLI release (13th Feb 2024). This is temporary until proper
// support for recursive types is added to the schema generator. PR: https://github.com/databricks/cli/pull/1204
if traceId == "for_each_task" {
return &jsonschema.Schema{
Type: jsonschema.ObjectType,
}, nil
}
// WE ERROR OUT IF THERE ARE CYCLES IN THE JSON SCHEMA
// There are mechanisms to deal with cycles through recursive identifiers in JSON
// schema. However, if we used them, we would need to make sure we can detect
// cycles where two properties (directly or indirectly) point to each other
//
// see: https://json-schema.org/understanding-json-schema/structuring.html#recursion
// for details
if tracker.hasCycle(golangType) {
return nil, fmt.Errorf("cycle detected")
}
tracker.push(golangType, traceId)
props, err := toSchema(golangType, docs, tracker)
if err != nil {
return nil, err
}
tracker.pop(golangType)
return props, nil
}
// This function returns all member fields of the provided type.
// If the type has embedded (aka anonymous) fields, this function traverses
// them in breadth-first order
func getStructFields(golangType reflect.Type) []reflect.StructField {
fields := []reflect.StructField{}
bfsQueue := list.New()
for i := 0; i < golangType.NumField(); i++ {
bfsQueue.PushBack(golangType.Field(i))
}
for bfsQueue.Len() > 0 {
front := bfsQueue.Front()
field := front.Value.(reflect.StructField)
bfsQueue.Remove(front)
if !field.Anonymous {
fields = append(fields, field)
continue
}
fieldType := field.Type
if fieldType.Kind() == reflect.Pointer {
fieldType = fieldType.Elem()
}
for i := 0; i < fieldType.NumField(); i++ {
bfsQueue.PushBack(fieldType.Field(i))
}
}
return fields
}
func toSchema(golangType reflect.Type, docs *Docs, tracker *tracker) (*jsonschema.Schema, error) {
// *Struct and Struct generate identical json schemas
if golangType.Kind() == reflect.Pointer {
return safeToSchema(golangType.Elem(), docs, "", tracker)
}
if golangType.Kind() == reflect.Interface {
return &jsonschema.Schema{}, nil
}
rootJavascriptType, err := jsonSchemaType(golangType)
if err != nil {
return nil, err
}
jsonSchema := &jsonschema.Schema{Type: rootJavascriptType}
// If the type is a non-string primitive, then we allow it to be a string,
// provided it is a pure variable reference (i.e. only a single variable reference).
if rootJavascriptType == jsonschema.BooleanType || rootJavascriptType == jsonschema.NumberType {
jsonSchema = &jsonschema.Schema{
AnyOf: []*jsonschema.Schema{
{
Type: rootJavascriptType,
},
{
Type: jsonschema.StringType,
Pattern: dynvar.VariableRegex,
},
},
}
}
if docs != nil {
jsonSchema.Description = docs.Description
}
// case array/slice
if golangType.Kind() == reflect.Array || golangType.Kind() == reflect.Slice {
elemGolangType := golangType.Elem()
elemJavascriptType, err := jsonSchemaType(elemGolangType)
if err != nil {
return nil, err
}
var childDocs *Docs
if docs != nil {
childDocs = docs.Items
}
elemProps, err := safeToSchema(elemGolangType, childDocs, "", tracker)
if err != nil {
return nil, err
}
jsonSchema.Items = &jsonschema.Schema{
Type: elemJavascriptType,
Properties: elemProps.Properties,
AdditionalProperties: elemProps.AdditionalProperties,
Items: elemProps.Items,
Required: elemProps.Required,
}
}
// case map
if golangType.Kind() == reflect.Map {
if golangType.Key().Kind() != reflect.String {
return nil, fmt.Errorf("only string keyed maps allowed")
}
var childDocs *Docs
if docs != nil {
childDocs = docs.AdditionalProperties
}
jsonSchema.AdditionalProperties, err = safeToSchema(golangType.Elem(), childDocs, "", tracker)
if err != nil {
return nil, err
}
}
// case struct
if golangType.Kind() == reflect.Struct {
children := getStructFields(golangType)
properties := map[string]*jsonschema.Schema{}
required := []string{}
for _, child := range children {
bundleTag := child.Tag.Get("bundle")
// Fields marked as "readonly", "internal" or "deprecated" are skipped
// while generating the schema
if bundleTag == readonlyTag || bundleTag == internalTag || bundleTag == deprecatedTag {
continue
}
// get child json tags
childJsonTag := strings.Split(child.Tag.Get("json"), ",")
childName := childJsonTag[0]
// Skip children that have no JSON tags (the first JSON tag is "")
// or that are explicitly excluded (the first JSON tag is "-")
if childName == "" || childName == "-" {
continue
}
// get docs for the child if they exist
var childDocs *Docs
if docs != nil {
if val, ok := docs.Properties[childName]; ok {
childDocs = val
}
}
// Determine whether the child is a required field: a field is required
// when its JSON tags do not include "omitempty"
hasOmitEmptyTag := false
for i := 1; i < len(childJsonTag); i++ {
if childJsonTag[i] == "omitempty" {
hasOmitEmptyTag = true
}
}
if !hasOmitEmptyTag {
required = append(required, childName)
}
// compute Schema.Properties for the child recursively
fieldProps, err := safeToSchema(child.Type, childDocs, childName, tracker)
if err != nil {
return nil, err
}
properties[childName] = fieldProps
}
jsonSchema.AdditionalProperties = false
jsonSchema.Properties = properties
jsonSchema.Required = required
}
return jsonSchema, nil
}
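
For readers unfamiliar with the generator being removed here, the following is a hedged sketch of what New produced for a small struct. exampleCluster and exampleSchema are illustrative names that do not appear in this diff; the behavior noted in the comments (required fields, anyOf for numeric fields, skipping bundle-tagged fields) follows from the code above.

```go
package schema

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// exampleCluster is a hypothetical struct used only to illustrate the mapping.
type exampleCluster struct {
	ClusterName string `json:"cluster_name"`          // no omitempty -> listed as required
	NumWorkers  int    `json:"num_workers,omitempty"` // number, or a "${...}" variable reference string
	ID          string `json:"id" bundle:"readonly"`  // readonly -> omitted from the generated schema
}

// exampleSchema prints the JSON schema generated for exampleCluster.
func exampleSchema() error {
	s, err := New(reflect.TypeOf(exampleCluster{}), nil) // nil docs: no descriptions injected
	if err != nil {
		return err
	}
	out, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return err
	}
	fmt.Println(string(out))
	return nil
}
```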

File diff suppressed because it is too large.

View File

@@ -1,11 +0,0 @@
package schema
import "github.com/databricks/cli/libs/jsonschema"
type Specification struct {
Components *Components `json:"components"`
}
type Components struct {
Schemas map[string]*jsonschema.Schema `json:"schemas,omitempty"`
}

View File

@@ -1,53 +0,0 @@
package schema
import (
"container/list"
"fmt"
)
type tracker struct {
// Nodes encountered in the current path during the recursive traversal. Used to
// check for cycles.
seenNodes map[interface{}]struct{}
// List of node names encountered, in order, in the current path during the recursive
// traversal. Used to hydrate errors with the path to the exact node where the error occurred.
//
// NOTE: node and node names can be the same
listOfNodes *list.List
}
func newTracker() *tracker {
return &tracker{
seenNodes: map[interface{}]struct{}{},
listOfNodes: list.New(),
}
}
func (t *tracker) errWithTrace(prefix string, initTrace string) error {
traceString := initTrace
curr := t.listOfNodes.Front()
for curr != nil {
if curr.Value.(string) != "" {
traceString += " -> " + curr.Value.(string)
}
curr = curr.Next()
}
return fmt.Errorf(prefix + ". traversal trace: " + traceString)
}
func (t *tracker) hasCycle(node interface{}) bool {
_, ok := t.seenNodes[node]
return ok
}
func (t *tracker) push(node interface{}, name string) {
t.seenNodes[node] = struct{}{}
t.listOfNodes.PushBack(name)
}
func (t *tracker) pop(nodeType interface{}) {
back := t.listOfNodes.Back()
t.listOfNodes.Remove(back)
delete(t.seenNodes, nodeType)
}
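
The tracker is the shared cycle-detection helper used by both the OpenAPI reader and the Go-to-schema generator above. The following is a same-package sketch of the push/hasCycle/pop discipline those callers follow; walk is an illustrative function and not part of this diff.

```go
package schema

// walk is a hypothetical recursive traversal showing how the tracker is used:
// check hasCycle before descending, push on entry, and pop on the way back out.
func walk(node string, children map[string][]string, t *tracker) error {
	if t.hasCycle(node) {
		// errWithTrace prefixes the error with the path of pushed node names.
		return t.errWithTrace("cycle detected", "")
	}
	t.push(node, node)
	for _, child := range children[node] {
		if err := walk(child, children, t); err != nil {
			return err
		}
	}
	t.pop(node)
	return nil
}
```

With children {"a": {"b"}, "b": {"a"}}, walk("a", children, newTracker()) fails with a trace of "-> a -> b", which is the same mechanism behind the loop errors asserted in the OpenAPI reader tests above.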

View File

@@ -0,0 +1,36 @@
bundle:
name: clusters
workspace:
host: https://acme.cloud.databricks.com/
resources:
clusters:
foo:
cluster_name: foo
num_workers: 2
node_type_id: "i3.xlarge"
autoscale:
min_workers: 2
max_workers: 7
spark_version: "13.3.x-scala2.12"
spark_conf:
"spark.executor.memory": "2g"
targets:
default:
development:
resources:
clusters:
foo:
cluster_name: foo-override
num_workers: 3
node_type_id: "m5.xlarge"
autoscale:
min_workers: 1
max_workers: 3
spark_version: "15.2.x-scala2.12"
spark_conf:
"spark.executor.memory": "4g"
"spark.executor.memory2": "4g"

View File

@@ -0,0 +1,36 @@
package config_tests
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestClusters(t *testing.T) {
b := load(t, "./clusters")
assert.Equal(t, "clusters", b.Config.Bundle.Name)
cluster := b.Config.Resources.Clusters["foo"]
assert.Equal(t, "foo", cluster.ClusterName)
assert.Equal(t, "13.3.x-scala2.12", cluster.SparkVersion)
assert.Equal(t, "i3.xlarge", cluster.NodeTypeId)
assert.Equal(t, 2, cluster.NumWorkers)
assert.Equal(t, "2g", cluster.SparkConf["spark.executor.memory"])
assert.Equal(t, 2, cluster.Autoscale.MinWorkers)
assert.Equal(t, 7, cluster.Autoscale.MaxWorkers)
}
func TestClustersOverride(t *testing.T) {
b := loadTarget(t, "./clusters", "development")
assert.Equal(t, "clusters", b.Config.Bundle.Name)
cluster := b.Config.Resources.Clusters["foo"]
assert.Equal(t, "foo-override", cluster.ClusterName)
assert.Equal(t, "15.2.x-scala2.12", cluster.SparkVersion)
assert.Equal(t, "m5.xlarge", cluster.NodeTypeId)
assert.Equal(t, 3, cluster.NumWorkers)
assert.Equal(t, "4g", cluster.SparkConf["spark.executor.memory"])
assert.Equal(t, "4g", cluster.SparkConf["spark.executor.memory2"])
assert.Equal(t, 1, cluster.Autoscale.MinWorkers)
assert.Equal(t, 3, cluster.Autoscale.MaxWorkers)
}

View File

@@ -88,3 +88,21 @@ func TestComplexVariablesOverrideWithMultipleFiles(t *testing.T) {
require.Equalf(t, "false", cluster.NewCluster.SparkConf["spark.speculation"], "cluster: %v", cluster.JobClusterKey)
}
}
func TestComplexVariablesOverrideWithFullSyntax(t *testing.T) {
b, diags := loadTargetWithDiags("variables/complex", "dev")
require.Empty(t, diags)
diags = bundle.Apply(context.Background(), b, bundle.Seq(
mutator.SetVariables(),
mutator.ResolveVariableReferencesInComplexVariables(),
mutator.ResolveVariableReferences(
"variables",
),
))
require.NoError(t, diags.Error())
require.Empty(t, diags)
complexvar := b.Config.Variables["complexvar"].Value
require.Equal(t, map[string]interface{}{"key1": "1", "key2": "2", "key3": "3"}, complexvar)
}

View File

@@ -35,6 +35,13 @@ variables:
- jar: "/path/to/jar"
- egg: "/path/to/egg"
- whl: "/path/to/whl"
complexvar:
type: complex
description: "A complex variable"
default:
key1: "value1"
key2: "value2"
key3: "value3"
targets:
@@ -49,3 +56,9 @@ targets:
spark_conf:
spark.speculation: false
spark.databricks.delta.retentionDurationCheck.enabled: false
complexvar:
type: complex
default:
key1: "1"
key2: "2"
key3: "3"

View File

@@ -10,6 +10,7 @@ import (
"github.com/databricks/cli/cmd/bundle/utils"
"github.com/databricks/cli/cmd/root"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/sync"
"github.com/spf13/cobra"
)
@@ -23,13 +24,19 @@ func newDeployCommand() *cobra.Command {
var force bool
var forceLock bool
var failOnActiveRuns bool
var computeID string
var clusterId string
var autoApprove bool
var verbose bool
cmd.Flags().BoolVar(&force, "force", false, "Force-override Git branch validation.")
cmd.Flags().BoolVar(&forceLock, "force-lock", false, "Force acquisition of deployment lock.")
cmd.Flags().BoolVar(&failOnActiveRuns, "fail-on-active-runs", false, "Fail if there are running jobs or pipelines in the deployment.")
cmd.Flags().StringVarP(&computeID, "compute-id", "c", "", "Override compute in the deployment with the given compute ID.")
cmd.Flags().StringVar(&clusterId, "compute-id", "", "Override cluster in the deployment with the given compute ID.")
cmd.Flags().StringVarP(&clusterId, "cluster-id", "c", "", "Override cluster in the deployment with the given cluster ID.")
cmd.Flags().BoolVar(&autoApprove, "auto-approve", false, "Skip interactive approvals that might be required for deployment.")
cmd.Flags().MarkDeprecated("compute-id", "use --cluster-id instead")
cmd.Flags().BoolVar(&verbose, "verbose", false, "Enable verbose output.")
// The verbose flag currently only affects file sync output; it's used by the VS Code extension
cmd.Flags().MarkHidden("verbose")
cmd.RunE = func(cmd *cobra.Command, args []string) error {
ctx := cmd.Context()
@@ -42,7 +49,10 @@ func newDeployCommand() *cobra.Command {
b.AutoApprove = autoApprove
if cmd.Flag("compute-id").Changed {
b.Config.Bundle.ComputeID = computeID
b.Config.Bundle.ClusterId = clusterId
}
if cmd.Flag("cluster-id").Changed {
b.Config.Bundle.ClusterId = clusterId
}
if cmd.Flag("fail-on-active-runs").Changed {
b.Config.Bundle.Deployment.FailOnActiveRuns = failOnActiveRuns
@@ -51,11 +61,18 @@ return nil
return nil
})
var outputHandler sync.OutputHandler
if verbose {
outputHandler = func(ctx context.Context, c <-chan sync.Event) {
sync.TextOutput(ctx, c, cmd.OutOrStdout())
}
}
diags = diags.Extend(
bundle.Apply(ctx, b, bundle.Seq(
phases.Initialize(),
phases.Build(),
phases.Deploy(),
phases.Deploy(outputHandler),
)),
)
}

View File

@@ -152,6 +152,12 @@ func TestGenerateJobCommand(t *testing.T) {
},
},
},
Parameters: []jobs.JobParameterDefinition{
{
Name: "empty",
Default: "",
},
},
},
}, nil)
@@ -198,6 +204,9 @@ - task_key: notebook_task
- task_key: notebook_task
notebook_task:
notebook_path: %s
parameters:
- name: empty
default: ""
`, filepath.Join("..", "src", "notebook.py")), string(data))
data, err = os.ReadFile(filepath.Join(srcDir, "notebook.py"))

View File

@@ -55,13 +55,7 @@ task or a Python wheel task, the second example applies.
return diags.Error()
}
diags = bundle.Apply(ctx, b, bundle.Seq(
phases.Initialize(),
terraform.Interpolate(),
terraform.Write(),
terraform.StatePull(),
terraform.Load(terraform.ErrorOnEmptyState),
))
diags = bundle.Apply(ctx, b, phases.Initialize())
if err := diags.Error(); err != nil {
return err
}
@@ -84,6 +78,16 @@ return fmt.Errorf("expected a KEY of the resource to run")
return fmt.Errorf("expected a KEY of the resource to run")
}
diags = bundle.Apply(ctx, b, bundle.Seq(
terraform.Interpolate(),
terraform.Write(),
terraform.StatePull(),
terraform.Load(terraform.ErrorOnEmptyState),
))
if err := diags.Error(); err != nil {
return err
}
runner, err := run.Find(b, args[0])
if err != nil {
return err

Some files were not shown because too many files have changed in this diff.