databricks-cli/bundle
shreyas-goenka b323703c1b
Add validation for single node clusters (#1909)
## Changes
This PR adds a warning when a single-node cluster is not configured
correctly. The check covers interactive, job, job-task, and pipeline
clusters.

Note: We skip the validation if a cluster policy is configured because
the policy is likely to configure `spark_conf` / `custom_tags` itself.

Note: Terraform originally only had this validation for interactive,
job, and job-task clusters. Validation for pipeline clusters is new in
this PR.

This PR follows the same logic as we used to have in Terraform. The
validation was removed from Terraform because we had no way to demote
the error to a warning:
https://github.com/databricks/terraform-provider-databricks/pull/4222

### Background
Single-node clusters require `spark_conf` and `custom_tags` to be
correctly set in the cluster definition for them to function optimally.
The cluster will still be created if it is incorrectly configured, but
it will perform poorly.

For example, if neither `spark_conf` nor `custom_tags` is set and
`num_workers` is 0, then only the driver process is launched on the
cluster compute instance. This leads to sub-optimal utilization of the
available compute resources and no parallelization across worker
processes when processing a Spark query.
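
The check described above can be sketched roughly as follows. This is a minimal illustration and not the code added by this PR: the `clusterSpec` type and the `needsSingleNodeWarning` function are hypothetical names, and the prefix match on `spark.master` is an assumption (the warning text only shows `local[*]`).

```go
package main

import (
	"fmt"
	"strings"
)

// clusterSpec is a hypothetical, simplified stand-in for a cluster
// definition. It is not the type used by the CLI.
type clusterSpec struct {
	NumWorkers int
	PolicyID   string
	SparkConf  map[string]string
	CustomTags map[string]string
}

// needsSingleNodeWarning reports whether a cluster with zero workers is
// missing the single-node settings listed in the warning message.
func needsSingleNodeWarning(c clusterSpec) bool {
	// Only clusters with num_workers == 0 are candidates.
	if c.NumWorkers != 0 {
		return false
	}
	// Skip when a cluster policy is configured, because the policy is
	// likely to set spark_conf / custom_tags itself.
	if c.PolicyID != "" {
		return false
	}
	// Reading from a nil map returns the zero value, so unset maps are safe.
	profile := c.SparkConf["spark.databricks.cluster.profile"]
	master := c.SparkConf["spark.master"]
	class := c.CustomTags["ResourceClass"]

	// Assumption: any "local..." master is accepted, not just "local[*]".
	valid := profile == "singleNode" &&
		strings.HasPrefix(master, "local") &&
		class == "SingleNode"
	return !valid
}

func main() {
	misconfigured := clusterSpec{NumWorkers: 0}
	configured := clusterSpec{
		NumWorkers: 0,
		SparkConf: map[string]string{
			"spark.databricks.cluster.profile": "singleNode",
			"spark.master":                     "local[*]",
		},
		CustomTags: map[string]string{"ResourceClass": "SingleNode"},
	}
	fmt.Println(needsSingleNodeWarning(misconfigured)) // true
	fmt.Println(needsSingleNodeWarning(configured))    // false
}
```

Returning a boolean rather than an error mirrors the intent of the PR: a misconfigured cluster produces a warning, not a failure.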

### Issue

This PR addresses some issues reported in
https://github.com/databricks/cli/issues/1546

## Tests
Unit tests and manual testing.

Example output of the warning:
```
➜  bundle-playground git:(master) ✗ cli bundle validate
Warning: Single node cluster is not correctly configured
  at resources.pipelines.bar.clusters[0]
  in databricks.yml:29:11

num_workers should be 0 only for single-node clusters. To create a
valid single node cluster please ensure that the following properties
are correctly set in the cluster specification:

  spark_conf:
    spark.databricks.cluster.profile: singleNode
    spark.master: local[*]

  custom_tags:
    ResourceClass: SingleNode
  

Name: foobar
Target: default
Workspace:
  User: shreyas.goenka@databricks.com
  Path: /Workspace/Users/shreyas.goenka@databricks.com/.bundle/foobar/default

Found 1 warning
```
2024-11-22 15:48:09 +00:00
artifacts Rename `RootPath` -> `BundleRootPath` (#1792) 2024-09-27 10:03:05 +00:00
config Add validation for single node clusters (#1909) 2024-11-22 15:48:09 +00:00
deploy Source-linked deployments for bundles in the workspace (#1884) 2024-11-20 13:22:27 +01:00
env Remove support for DATABRICKS_BUNDLE_INCLUDES (#1317) 2024-03-27 10:13:54 +00:00
internal Make `TableName` field part of quality monitor schema (#1903) 2024-11-14 17:39:38 +00:00
libraries Use SetPermissions instead of UpdatePermissions when setting folder permissions based on top-level ones (#1822) 2024-10-29 12:06:38 +00:00
metadata Make `file_path` and `artifact_path` fields consistent with json tag (#987) 2023-11-15 13:37:26 +00:00
paths Fixed adding /Workspace prefix for resource paths (#1866) 2024-10-30 17:34:11 +00:00
permissions Fixed adding /Workspace prefix for resource paths (#1866) 2024-10-30 17:34:11 +00:00
phases Add support for AI/BI dashboards (#1743) 2024-10-29 09:11:08 +00:00
render Add `bundle summary` to display URLs for deployed resources (#1731) 2024-10-18 06:45:47 +00:00
resources Reuse resource resolution code for the run command (#1858) 2024-10-24 13:24:30 +00:00
run Reuse resource resolution code for the run command (#1858) 2024-10-24 13:24:30 +00:00
schema Make `TableName` field part of quality monitor schema (#1903) 2024-11-14 17:39:38 +00:00
scripts Rename `RootPath` -> `BundleRootPath` (#1792) 2024-09-27 10:03:05 +00:00
tests Handle normalization of `dyn.KindTime` into an any type (#1836) 2024-10-17 10:00:40 +00:00
trampoline Source-linked deployments for bundles in the workspace (#1884) 2024-11-20 13:22:27 +01:00
bundle.go Rename `RootPath` -> `BundleRootPath` (#1792) 2024-09-27 10:03:05 +00:00
bundle_read_only.go Rename `RootPath` -> `BundleRootPath` (#1792) 2024-09-27 10:03:05 +00:00
bundle_test.go Rename `RootPath` -> `BundleRootPath` (#1792) 2024-09-27 10:03:05 +00:00
context.go Rename variable `bundle -> b` (#989) 2023-11-15 14:03:36 +00:00
context_test.go Add command that writes the materialized bundle configuration to stdout (#95) 2022-11-21 15:39:53 +01:00
deferred.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00
deferred_test.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00
if.go Return early in bundle destroy if no deployment exists (#1581) 2024-07-09 15:08:38 +00:00
if_test.go Return early in bundle destroy if no deployment exists (#1581) 2024-07-09 15:08:38 +00:00
log_string.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00
mutator.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00
mutator_read_only.go Added validate mutator to surface additional bundle warnings (#1352) 2024-04-18 15:13:16 +00:00
mutator_test.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00
parallel.go Added validate mutator to surface additional bundle warnings (#1352) 2024-04-18 15:13:16 +00:00
parallel_test.go Fix flaky tests for the parallel mutator (#1426) 2024-05-13 12:16:43 +00:00
root.go Move folders package into libs (#1184) 2024-02-07 16:33:18 +00:00
root_test.go Remove support for DATABRICKS_BUNDLE_INCLUDES (#1317) 2024-03-27 10:13:54 +00:00
seq.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00
seq_test.go Return `diag.Diagnostics` from mutators (#1305) 2024-03-25 14:18:47 +00:00