Commit Graph

269 Commits

Author SHA1 Message Date
shreyas-goenka b288b0701d
Update bundle/config/loader/process_include.go
Co-authored-by: Lennart Kats (databricks) <lennart.kats@databricks.com>
2024-10-02 23:57:20 +05:30
shreyas-goenka d064566fc9
Update bundle/config/loader/process_include.go
Co-authored-by: Lennart Kats (databricks) <lennart.kats@databricks.com>
2024-10-02 23:57:12 +05:30
Shreyas Goenka a737977d6f
address comments 2024-10-02 20:26:42 +02:00
Shreyas Goenka dd10eedb14
merge 2024-09-27 13:45:13 +02:00
Shreyas Goenka dd20a52743
terser recommendation message 2024-09-27 13:42:48 +02:00
Shreyas Goenka e22ba75062
remove todo 2024-09-27 12:58:33 +02:00
Shreyas Goenka 14e73acb11
- 2024-09-27 12:57:49 +02:00
Shreyas Goenka b4bd47f4ba
fail -> not match 2024-09-27 12:55:54 +02:00
Shreyas Goenka 65ad2f0557
add test for multiple resources 2024-09-27 12:04:33 +02:00
Pieter Noordhuis 1d1aa0a416
Rename `RootPath` -> `BundleRootPath` (#1792)
## Changes

After introducing the `SyncRootPath` field on the bundle (#1694), the
previous `RootPath` became ambiguous. Does it mean the bundle root path
or the sync root path? This PR renames to field to `BundleRootPath` to
remove the ambiguity.

## Tests

n/a

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-09-27 10:03:05 +00:00
Pieter Noordhuis 56cd96cb93
Move trampoline code into trampoline package (#1793)
## Changes

Doing this to make room for PyDABs under `bundle/python`.

## Tests

n/a
2024-09-27 09:32:54 +00:00
Pieter Noordhuis a1dca56abf
Trim trailing whitespace (#1794)
## Changes

Trailing whitespace is trimmed per the VS Code settings for this
repository.

## Tests

n/a
2024-09-27 09:30:39 +00:00
Shreyas Goenka 4373e7b513
make unit test package separate 2024-09-26 17:57:24 +02:00
Shreyas Goenka 28917ebdab
- 2024-09-26 17:55:09 +02:00
Shreyas Goenka 3203858032
add comment 2024-09-26 17:48:15 +02:00
Shreyas Goenka fec73db111
address comments 2024-09-26 17:47:07 +02:00
Shreyas Goenka 4cc0dd1fa1
Merge remote-tracking branch 'origin' into detect-convention 2024-09-26 17:33:44 +02:00
Shreyas Goenka d9cf582287
fix failing tests 2024-09-26 17:14:43 +02:00
Shreyas Goenka 4734249854
convert inline tests to files 2024-09-26 16:51:46 +02:00
Shreyas Goenka d14b1e2cca
directly pass config value 2024-09-26 16:07:32 +02:00
Shreyas Goenka 5308ad8bf3
split into summary and detail 2024-09-26 16:03:49 +02:00
Shreyas Goenka fd01824ee1
- 2024-09-26 15:39:41 +02:00
Andrew Nester 66f2ba64a8
Simplified isFullVariableOverrideDef implementation (#1791)
## Changes
Simplified isFullVariableOverrideDef implementation

Follow up on https://github.com/databricks/cli/pull/1787
2024-09-26 12:55:07 +00:00
shreyas-goenka 495040e4cd
Modify SetLocation test utility to take full locations as argument (#1788)
I plan to use this in https://github.com/databricks/cli/pull/1780, to
set the line and column numbers as well for the locations.

gopatch file used:
```
@@
var x expression
var y expression
var z expression
@@
-bundletest.SetLocation(x, y, z)
+bundletest.SetLocation(x, y, []dyn.Location{{File: z}})
```
2024-09-25 16:13:48 +00:00
Andrew Nester b3a3071086
Fixed full variable override detection (#1787)
## Changes
Fixes #1786 

## Tests
All valid override combinations are added as test cases
2024-09-25 12:35:16 +00:00
Gleb Kanterov 3d9decdda9
Add JobTaskClusterSpec validate mutator (#1784)
## Changes
Add JobTaskClusterSpec validate mutator. It catches the case when tasks
don't which cluster to use.

For example, we can get this error with minor modifications to
`default-python` template:

```yaml
      tasks:
        - task_key: python_file_task
          spark_python_task:
            python_file: ../src/my_project_10/main.py
```

```
 % databricks bundle validate
Error: Missing required cluster or environment settings
  at resources.jobs.my_project_10_job.tasks[0]
  in resources/my_project_10_job.yml:17:11

Task "print_github_stars" requires a cluster or an environment to run.
Specify one of the following fields: job_cluster_key, environment_key, existing_cluster_id, new_cluster.
```

We implicitly rely on "one of" validation, which does not exist. Many
bundle fields can't co-exist, for instance, specifying:
`JobTask.{existing_cluster_id,job_cluster_key}`, `Library.{whl,pypi}`,
`JobTask.{notebook_task,python_wheel_task}`, etc.

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
2024-09-25 11:30:14 +00:00
Shreyas Goenka 894f4aab4e
fix windows test 2024-09-24 17:54:13 +02:00
Shreyas Goenka e31dcc2724
fix logic for dedup 2024-09-24 17:47:48 +02:00
Shreyas Goenka 5bf916357a
return if diags has error 2024-09-24 17:42:03 +02:00
Shreyas Goenka f3d7f09150
fix failing tests 2024-09-24 17:39:20 +02:00
Shreyas Goenka 3571410209
cleanup todos 2024-09-24 17:31:50 +02:00
Shreyas Goenka ccbd199422
add info block 2024-09-24 17:29:10 +02:00
Shreyas Goenka fa6b68c410
add unit tests 2024-09-24 16:39:02 +02:00
Gleb Kanterov 490259a14a
Refactor jobs path translation (#1782)
## Changes
Extract package for other modules to transform different kinds of paths
in job resources.

## Tests
Unit tests
2024-09-24 13:51:54 +00:00
Shreyas Goenka 5c351d7ea3
more wip on making it work 2024-09-24 14:49:57 +02:00
Shreyas Goenka 458dbfb04d
- 2024-09-24 14:49:41 +02:00
Shreyas Goenka 667fe6bf38
Add detection and info diagnostic for convention for `.<resource-type>.yml` files 2024-09-24 14:49:41 +02:00
Shreyas Goenka aa51b63352
Modify SetLocation test utility to take full locations as argument 2024-09-24 14:45:14 +02:00
Andrew Nester 56ed9bebf3
Added support for creating all-purpose clusters (#1698)
## Changes
Added support for creating all-purpose clusters

Example of configuration

```
bundle:
  name: clusters

resources:
  clusters:
    test_cluster:
      cluster_name: "Test Cluster"
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"

  jobs:
    test_job:
      name: "Test Job"
      tasks:
        - task_key: test_task
          existing_cluster_id: ${resources.clusters.test_cluster.id}
          notebook_task:
            notebook_path: "./src/test.py"

targets:
    development:
      mode: development
      compute_id: ${resources.clusters.test_cluster.id}

```

## Tests
Added unit, config and E2E tests
2024-09-23 10:42:34 +00:00
Andrew Nester bcab6ca37b
Fixed detecting full syntax variable override which includes type field (#1775)
## Changes
Fixes #1773 

## Tests
Confirmed manually
2024-09-18 10:23:07 +00:00
Lennart Kats (databricks) e220f9ddd6
Use the friendly name of service principals when shortening their name (#1770)
## Summary

Use the friendly name of service principals when shortening their name.

This change is helpful for the prefix in development mode. Instead of
adding a prefix like `[dev 1706906c-c0a2-4c25-9f57-3a7aa3cb8123]`, we'll
prefix like `[dev my_principal]`.
2024-09-16 18:35:07 +00:00
Andrew Nester 66307134c1
Fixed generated YAML missing 'default' for empty values (#1765)
## Changes
Fixed generated YAML missing 'default' for empty values

## Tests
Added unit test
2024-09-11 09:49:58 +00:00
shreyas-goenka 5d2c0e3885
Alias variables block in the `Target` struct (#1748)
## Changes
This PR aliases and overrides the schema associated with the variables
block in `target` to allow for directly specifying a variable value in
the JSON schema (without an levels of nesting). This is needed because
this direct value is resolved by dynamically parsing the configuration
tree.

ca6332a5a4/bundle/config/root.go (L424)

## Tests
Existing unit tests.
2024-09-10 14:49:34 +00:00
Andrew Nester 02e83877f4
Added listing cluster filtering for cluster lookups (#1754)
## Changes
We added a custom resolver for the cluster to add filtering for the
cluster source when we list all clusters.

Without the filtering listing could take a very long time (5-10 mins)
which leads to lookup timeouts.

## Tests
Existing unit tests passing
2024-09-06 11:34:57 +00:00
Pieter Noordhuis ceefa80d72
Pass copy of `dyn.Path` to callback function (#1747)
## Changes

Some call sites hold on to the `dyn.Path` provided to them by the
callback. It must therefore never be mutated after the callback returns,
or these mutations leak out into unknown scope.

This change means it is no longer possible for this failure mode to
happen.

## Tests

Unit test.
2024-09-05 11:05:16 +00:00
Andrew Nester 72030844c5
Fixed variable override in target with full variable syntax (#1749)
## Changes
This PR makes sure that both of this override syntax for variables work
correctly
```
targets:
  dev:
    variables:
      cluster1:
        spark_version: "14.2.x-scala2.11"
        node_type_id: "Standard_DS3_v2"
        num_workers: 4
        spark_conf:
          spark.speculation: false
          spark.databricks.delta.retentionDurationCheck.enabled: false
      cluster2:
        default:
          spark_version: "14.2.x-scala2.11"
          node_type_id: "Standard_DS3_v2"
          num_workers: 4
          spark_conf:
            spark.speculation: false
            spark.databricks.delta.retentionDurationCheck.enabled: false
```
## Tests
Added regression test

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-09-04 17:16:40 +00:00
Andrew Nester ca6332a5a4
Fixed complex variables are not being correctly merged from include files (#1746)
## Changes
Fixes an `Error: no value assigned to required variable <variable>.`
when the main complex variable definition is defined in one file but
target override is defined in separate file which is included in the
main one.

## Tests
Added regression test
2024-09-04 11:24:55 +00:00
Gleb Kanterov ed448815b4
PythonMutator: explain missing package error (#1736)
## Changes
Explain the error when the `databricks-pydabs` package is not installed
or the Python environment isn't correctly activated.

Example output:

```
Error: python mutator process failed: ".venv/bin/python3 -m databricks.bundles.build --phase load --input .../input.json --output .../output.json --diagnostics .../diagnostics.json: exit status 1", use --debug to enable logging

.../.venv/bin/python3: Error while finding module specification for 'databricks.bundles.build' (ModuleNotFoundError: No module named 'databricks')

Explanation: 'databricks-pydabs' library is not installed in the Python environment.

If using Python wheels, ensure that 'databricks-pydabs' is included in the dependencies, 
and that the wheel is installed in the Python environment:

  $ .venv/bin/pip install -e .

If using a virtual environment, ensure it is specified as the venv_path property in databricks.yml, 
or activate the environment before running CLI commands:

  experimental:
    pydabs:
      venv_path: .venv
```

## Tests
Unit tests
2024-09-02 09:49:30 +00:00
Andrew Nester 582558cac2
Do not suppress normalisation diagnostics for resolving variables (#1740)
## Changes

Tested on the following bundle configuration

```
bundle:
  name: clusters
  mode: development

variables:
  webhook_notifications:
    description: Webhook URL for notifications
    type: complex
    default:
      on_failure:
        id: 6a6c04c1-389c-4534-95af-b68b62a9dbe6

resources:
  jobs:
    test_job:
      name: "Andrew Nester Test Job"
      tasks:
        - task_key: test_task
          notebook_task:
            notebook_path: "./src/test.py"
          new_cluster:
            num_workers: 2
            node_type_id: "i3.xlarge"
            autoscale:
              min_workers: 2
              max_workers: 7
            spark_version: "12.2.x-scala2.12"
            spark_conf:
              "spark.executor.memory": "2g"
      webhook_notifications: ${var.webhook_notifications}

```

bundle validate output is below

```
andrew.nester@HFW9Y94129 wheel % databricks bundle validate
Warning: expected sequence, found map
  at resources.jobs.test_job.webhook_notifications.on_failure
  in bundle.yml:11:9

Name: clusters
Target: default
Workspace:
  User: andrew.nester@databricks.com
  Path: /Users/andrew.nester@databricks.com/.bundle/clusters/default
```

**Note** that error correctly points to the variable
2024-09-02 09:17:18 +00:00
shreyas-goenka 5d9910c8e0
Make lock optional in the JSON schema (#1738)
Fixes https://github.com/databricks/cli/issues/1561
2024-09-02 08:39:08 +00:00