Commit Graph

273 Commits

Author SHA1 Message Date
shreyas-goenka 5d036ab6b8
Fix locker unlock for destroy (#492)
## Changes
Allows unlock to succeed even if the deploy file is
missing.
 
## Tests
Using integration tests and manually
2023-06-19 15:57:25 +02:00
shreyas-goenka de47cf19f1
Use better error assertions and clean up locker API (#490)
## Changes
Some cleanup work

## Tests
Locker integration test passes
2023-06-16 16:29:04 +02:00
Pieter Noordhuis 1875908b59
Pass through proxy related environment variables (#465)
## Changes

If set on the host, we must pass them through to Terraform.
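A minimal sketch of such a pass-through, assuming the conventional proxy variable names in upper and lower case. The list of names and the `proxyEnv` helper are illustrative assumptions, not the code from this change:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// proxyVars lists commonly used proxy variables. The exact set is an assumption.
var proxyVars = []string{"HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"}

// proxyEnv collects the proxy variables that are set on the host,
// checking both the upper case and lower case spelling of each name.
// The lookup function is injected so the logic can be tested without
// touching the real process environment.
func proxyEnv(lookup func(string) (string, bool)) map[string]string {
	out := map[string]string{}
	for _, name := range proxyVars {
		for _, key := range []string{name, strings.ToLower(name)} {
			if value, ok := lookup(key); ok {
				out[key] = value
			}
		}
	}
	return out
}

func main() {
	// Print whatever proxy settings exist on this host.
	for key, value := range proxyEnv(os.LookupEnv) {
		fmt.Printf("%s=%s\n", key, value)
	}
}
```

The resulting map could then be merged into the environment handed to the Terraform process.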

## Tests

Unit tests pass.
2023-06-14 21:58:26 +02:00
Serge Smertin 2aa61a7c1b
Update with the latest Go SDK (#457)
## Changes
- removed deprecated methods
- regenerated with the latest OpenAPI spec
- picked up the latest go SDK version

## Tests
`make test`
2023-06-12 14:23:21 +02:00
Pieter Noordhuis 894d25e434
Check for nil environment before accessing it (#453) 2023-06-08 20:55:49 +00:00
stikkireddy 402fcdd62c
Skip path translation of job task for jobs with a Git source (#404)
## Changes

Added skipping of translating paths for notebook path in notebook tasks
and python file path in spark python tasks if the git source is not null.

Resolves: #402

## Tests

There is a unit test and also tested with a sample bundle:

```
resources:
  jobs:
    demo:
      git_source:
        git_branch: master
        git_provider: github
        git_url: https://github.com/test/dummy
   ....
```

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-06-07 12:34:59 +02:00
Pieter Noordhuis 349e2aff40
Allow equivalence checking of filer errors to fs errors (#416)
## Changes

The pattern `errors.Is(err, fs.ErrNotExist)` is common to check for an
error type.

Errors can implement `Is(error) bool` with a custom equivalence checker.

## Tests

New asserts all pass in the integration test.
2023-05-31 20:47:00 +02:00
Pieter Noordhuis e4ab455ea1
Don't pass synthesized TMPDIR if not already set (#409)
## Changes

On Unix systems, the default of `/tmp` always works. No need to
synthesize a path for it.

The custom TMPDIR was causing issues when used from GitHub Actions
runners.

## Tests

Confirmed manually this fixes the issue on GitHub Actions runners.
2023-05-26 13:05:30 +02:00
Andrew Nester 6141476ca2
Added support for bundle.Seq, simplified Mutator.Apply interface (#403)
## Changes
Added support for `bundle.Seq` and simplified the `Mutator.Apply` interface by
removing the list of mutators from its return values.

## Tests
1. Ran `cli bundle deploy` and interrupted it with Cmd + C mid execution
so the lock is not released
2. Ran `cli bundle deploy` to make sure that the CLI does not try to
release the lock when it fails to acquire it
```
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

^C
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-24 12:10:23.050343 +0200 CEST. Use --force to override
```
2023-05-24 14:45:19 +02:00
Andrew Nester 273271bc59
Regenerated internal schema structs based on Terraform provider schemas (#401)
## Changes
Regenerated internal schema structs based on Terraform provider schemas

Allows use of the `serverless` flag in bundle config.

## Tests
Ran `cli bundle deploy` with a bundle that contains a pipeline with
the serverless key set to true
2023-05-23 19:33:24 +02:00
shreyas-goenka c53ad860e6
Create tmp files in the cache dir in terraform command runs (#395)
## Changes
Passes through tmp dir related env vars to the terraform process. In case
any of them are not set, we assign a temp dir inside the bundle cache dir as
the location terraform should use.

## Tests
Manually checked that these env vars do override location where
os.CreateTemp files are created
2023-05-23 13:51:15 +02:00
Fabian Jakobs 055e528173
Rename: bricks -> databricks (#393)
## Changes
related to https://github.com/databricks/databricks-vscode/pull/721

## Rename env vars

`BRICKS_CLI_PATH` -> `DATABRICKS_CLI_PATH`
`BRICKS_OUTPUT_FORMAT` -> `DATABRICKS_OUTPUT_FORMAT`
`BRICKS_LOG_FILE` -> `DATABRICKS_LOG_FILE`
`BRICKS_LOG_LEVEL` -> `DATABRICKS_LOG_LEVEL`
`BRICKS_LOG_FORMAT` -> `DATABRICKS_LOG_FORMAT`
`BRICKS_PROGRESS_FORMAT` -> `DATABRICKS_CLI_PROGRESS_FORMAT`
`BRICKS_UPSTREAM` -> `DATABRICKS_CLI_UPSTREAM`
`BRICKS_UPSTREAM_VERSION` -> `DATABRICKS_CLI_UPSTREAM_VERSION`
2023-05-22 16:40:50 +02:00
Pieter Noordhuis 8979ed1394
Fix tests for new repository name (#390) 2023-05-16 19:02:07 +02:00
Pieter Noordhuis 98ebb78c9b
Rename bricks -> databricks (#389)
## Changes

Rename all instances of "bricks" to "databricks".

## Tests

* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
2023-05-16 18:35:39 +02:00
Andrew Nester 180dfc9a40
Added ability for deferred mutator execution (#380)
## Changes
Added `DeferredMutator` and the `bundle.Defer` function, which allow
some mutators to always be executed, either at the end of the execution chain or
after an error occurs in the middle of the execution chain.

Usage as follows:

```
deferredMutator := bundle.Defer([]bundle.Mutator{
    lock.Acquire(),
    transform.DoSomething(),
    //...
}, []bundle.Mutator{
    lock.Release(),
})
```
In this case `lock.Release()` will always be executed: either when all
operations above succeed or when any of them fails.

## Tests
Before the change

```
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-10 16:41:22.902659 +0200 CEST. Use --force to override

```

After the change
```
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy 
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
```
2023-05-16 18:01:50 +02:00
Andrew Nester 33fb0b3c40
Do not truncate local state file when pulling remote changes (#382)
## Changes
When a local state file exists, it will not be overridden by the remote state file

## Tests
Running `bricks bundle deploy` after state push failed does not override
local state file

Use cases verified:
1. Local state file is newer than remote
2. Local state file is older than remote
3. Local state file does not exist
4. Local state file corrupted
2023-05-16 17:02:33 +02:00
shreyas-goenka dd04875ee9
Add config environment support for variable overriding (#383)
## Changes
Allows overriding the default value of a variable definition from the
environment block in a bundle config. See bundle.yml for example usage

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-05-15 14:07:18 +02:00
shreyas-goenka c5e940f664
Add support for variables in bundle config (#359)
## Changes
This PR now allows you to define variables in the bundle config and set
them in three ways
1. command line args
2. process environment variable
3. in the bundle config itself

## Tests
manually, unit, and black box tests

---------

Co-authored-by: Miles Yucht <miles@databricks.com>
2023-05-15 11:34:05 +02:00
Andrew Nester 473d2bf503
Improved error message when 'bricks bundle run' is executed before 'bricks bundle deploy' (#378)
## Changes
Improved error message when 'bricks bundle run' is executed before
'bricks bundle deploy'

The error happens when we attempt to load terraform state when it does
not exist.

The best way to check if terraform state actually exists is to call
`terraform show -json` and that's what already happens here

https://github.com/databricks/bricks/compare/main...error-before-deploy#diff-8c50f8c04e568397bc865b7e02d1f4ec5b18379d8d32daddfeb041035d804f5fL28

Absence of `state.Values` indicates that there is no state and likely
bundle was just never deployed.

## Tests
Ran `bricks bundle run test_job` on a new non-deployed bundle.

**Output:**

`Error: terraform show: No state. Did you forget to run 'bricks bundle
deploy'?`

Running `bricks bundle deploy && bricks bundle run test_job` succeeds.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-05-10 11:02:25 +02:00
Andrew Nester 1916bc9d68
Fixed printing the tasks in job output in DAG execution order (#377)
Fixes #259

## Changes
Sort task output in execution order based on task end time
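A minimal sketch of this ordering, assuming a hypothetical `taskOutput` struct with an end time in epoch milliseconds (the type and field names are illustrative):

```go
package main

import (
	"fmt"
	"sort"
)

// taskOutput is a hypothetical struct; the field names are assumptions.
type taskOutput struct {
	TaskKey string
	EndTime int64 // epoch milliseconds
}

// sortByEndTime orders tasks by the time they finished.
// A stable sort keeps the original order for tasks with equal end times.
func sortByEndTime(tasks []taskOutput) {
	sort.SliceStable(tasks, func(i, j int) bool {
		return tasks[i].EndTime < tasks[j].EndTime
	})
}

func main() {
	tasks := []taskOutput{
		{TaskKey: "process", EndTime: 300},
		{TaskKey: "input", EndTime: 100},
		{TaskKey: "report", EndTime: 200},
	}
	sortByEndTime(tasks)
	for _, t := range tasks {
		fmt.Println(t.TaskKey)
	}
}
```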

## Tests
Added `TestTaskJobOutputOrderToString` unit test.
2023-05-08 16:35:47 +02:00
shreyas-goenka 37af3d5c4f
Add omitempty tag to bundle git details (#372)
## Changes
Add the `omitempty` tag to git details. Otherwise this field becomes a
required field in the config json schema

## Tests
Tested by regenerating the json schema and checking that the git field
is now optional
2023-05-01 14:34:12 +02:00
shreyas-goenka 9e16140b6e
Add git config block to bundle config (#356)
## Changes
This config block contains commit, branch and remote_url which will be
automatically loaded if specified in the repo, and can also be specified
by the user

## Tests
Unit and black-box tests
2023-04-26 16:54:36 +02:00
Serge Smertin 9581187c9e
Update to Go SDK v0.8.0 (#351)
## Changes

- Update to Go SDK v0.8.0
- Fix all breaking changes

## Tests

- make test
2023-04-21 10:30:20 +02:00
shreyas-goenka 9b06095e47
Add support for multiple level string variable interpolation (#342)
## Changes
Traverses the referenced variables depth first to resolve
string fields.
Errors out if a cycle is detected.
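A rough sketch of depth-first resolution with cycle detection, assuming references take the form `${name}` and values live in a flat map. The `resolve` function and the reference pattern are simplifications for illustration, not the interpolation code from this change:

```go
package main

import (
	"fmt"
	"regexp"
)

// refPattern matches references of the form ${name} (a simplified assumption).
var refPattern = regexp.MustCompile(`\$\{([a-zA-Z0-9_.]+)\}`)

// resolve expands all references in vars[key] depth first.
// inProgress holds the keys on the current path and is used to detect cycles.
// done caches values that are already fully resolved.
func resolve(key string, vars map[string]string, inProgress map[string]bool, done map[string]string) (string, error) {
	if v, ok := done[key]; ok {
		return v, nil
	}
	raw, ok := vars[key]
	if !ok {
		return "", fmt.Errorf("undefined variable: %s", key)
	}
	if inProgress[key] {
		return "", fmt.Errorf("cycle detected at variable: %s", key)
	}
	inProgress[key] = true
	defer delete(inProgress, key)

	var firstErr error
	out := refPattern.ReplaceAllStringFunc(raw, func(match string) string {
		name := refPattern.FindStringSubmatch(match)[1]
		v, err := resolve(name, vars, inProgress, done)
		if err != nil && firstErr == nil {
			firstErr = err
		}
		return v
	})
	if firstErr != nil {
		return "", firstErr
	}
	done[key] = out
	return out, nil
}

func main() {
	vars := map[string]string{
		"a": "${b}-suffix",
		"b": "${c}",
		"c": "value",
	}
	out, err := resolve("a", vars, map[string]bool{}, map[string]string{})
	fmt.Println(out, err)
}
```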

## Tests
Manually and unit/blackbox tests
2023-04-20 01:13:33 +02:00
shreyas-goenka 089bebc92f
Do not print exceptions for non ERROR events (#347)
## Changes
Adds a check to not print exception traces for DLT events with a level <
ERROR

## Tests
Unit test
2023-04-19 22:11:05 +02:00
shreyas-goenka 598ad62688
Log mutator messages using progress logger (#312)
This PR uses progress logger to log messages inside mutators
2023-04-18 16:55:06 +02:00
shreyas-goenka d0872b45e2
Log pipeline update errors using progress logger (#338)
## Changes
Logs error message for all exceptions

## Tests
Manually and using unit tests
2023-04-18 15:00:34 +02:00
shreyas-goenka 59eee11989
Log job errors using progress logger (#337)
## Changes
This PR logs job errors using the progress logger

## Tests
Manually
2023-04-18 14:58:20 +02:00
shreyas-goenka 1a7b3eef18
Log job run url using progress logger (#336)
## Changes
Logs the job url using the progress logger

## Tests
Manually
2023-04-18 14:40:45 +02:00
shreyas-goenka 85889dffb1
Move state to event for whether they support inplace progress logging (#339)
## Changes
Adds an IsInplaceSupported() function to the event interface. Any event
that uses the progress logger now has to declare whether it supports
in-place logging

## Tests
Manually
2023-04-18 14:20:35 +02:00
shreyas-goenka 93d57dd00f
Detect duplicate identifiers in bundle config (#332)
## Changes
This PR adds checks during bundle config load and merge to error out if
there are duplicate keys for resource definitions
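The check can be sketched as follows, assuming each loaded configuration file contributes a list of resource keys. The `configFile` type and `checkUnique` function are hypothetical and only illustrate the idea:

```go
package main

import "fmt"

// configFile is a hypothetical, reduced view of a loaded bundle config file.
type configFile struct {
	Path    string
	JobKeys []string
}

// checkUnique returns an error if the same resource key is defined twice,
// either within one file or across files that are merged together.
func checkUnique(files []configFile) error {
	seen := map[string]string{} // resource key -> path of the defining file
	for _, f := range files {
		for _, key := range f.JobKeys {
			if prev, ok := seen[key]; ok {
				return fmt.Errorf("duplicate resource identifier %q in %s and %s", key, prev, f.Path)
			}
			seen[key] = f.Path
		}
	}
	return nil
}

func main() {
	err := checkUnique([]configFile{
		{Path: "bundle.yml", JobKeys: []string{"foo"}},
		{Path: "resources/jobs.yml", JobKeys: []string{"bar", "foo"}},
	})
	fmt.Println(err)
}
```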

## Tests
Using unit tests and manually
2023-04-17 12:21:21 +02:00
Shreyas Goenka eab29603fc
Revert "Log job errors using progress logger"
This reverts commit a2e20f5206.
2023-04-15 15:19:32 +02:00
Shreyas Goenka a2e20f5206
Log job errors using progress logger 2023-04-15 15:18:38 +02:00
shreyas-goenka e8018a7209
Refactor output and progress into separate packages in run (#335)
Tested manually that output and progress logging still works
2023-04-14 14:40:34 +02:00
shreyas-goenka df0293510e
Fixes for pipeline progress logging (#330)
## Changes
1. Events are now printed in chronological order
2. Simplify events rendering by removing update/flow name. This makes it
more consistent with the web UI too
3. Switch to server side filtering on update_id

## Tests
Manually

Happy run:
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
2023-04-12T20:00:22.879Z update_progress INFO "Update e1becc is INITIALIZING."
2023-04-12T20:00:22.906Z update_progress INFO "Update e1becc is SETTING_UP_TABLES."
2023-04-12T20:00:24.496Z update_progress INFO "Update e1becc is RUNNING."
2023-04-12T20:00:24.497Z flow_progress   INFO "Flow 'sales_orders_raw' is QUEUED."
2023-04-12T20:00:24.586Z flow_progress   INFO "Flow 'sales_orders_raw' is STARTING."
2023-04-12T20:00:24.748Z flow_progress   INFO "Flow 'sales_orders_raw' is RUNNING."
2023-04-12T20:00:26.672Z flow_progress   INFO "Flow 'sales_orders_raw' has COMPLETED."
2023-04-12T20:00:27.753Z update_progress INFO "Update e1becc is COMPLETED."
```

Sad run:
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
2023-04-12T20:02:07.764Z update_progress INFO "Update 04b80e is INITIALIZING."
2023-04-12T20:02:07.870Z update_progress ERROR "Update 04b80e is FAILED."
Error: update failed
```
2023-04-14 12:21:44 +02:00
shreyas-goenka 3894d5796d
Add progress logging event for pipeline update URLs (#331)
## Changes
Output now: 
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
The update can be found at https://e2-dogfood.staging.cloud.databricks.com/#joblist/pipelines/1cc605db-daab-4218-b38a-a63030e3eb03/updates/f92f2159-1141-47de-b1e2-1ca854b7238f

2023-04-12T20:41:19.813Z update_progress INFO "Update f92f21 is INITIALIZING."
2023-04-12T20:41:19.841Z update_progress INFO "Update f92f21 is SETTING_UP_TABLES."
2023-04-12T20:41:21.270Z update_progress INFO "Update f92f21 is RUNNING."
2023-04-12T20:41:21.271Z flow_progress   INFO "Flow 'sales_orders_raw' is QUEUED."
2023-04-12T20:41:21.349Z flow_progress   INFO "Flow 'sales_orders_raw' is STARTING."
2023-04-12T20:41:21.480Z flow_progress   INFO "Flow 'sales_orders_raw' is RUNNING."
2023-04-12T20:41:23.493Z flow_progress   INFO "Flow 'sales_orders_raw' has COMPLETED."
2023-04-12T20:41:25.484Z update_progress INFO "Update f92f21 is COMPLETED."
```

## Tests
<!-- How is this tested? -->
2023-04-14 11:11:30 +02:00
shreyas-goenka 417839021b
Add top level docs for bundle json schema (#313)
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
2023-04-12 21:43:53 +02:00
Pieter Noordhuis b388f4a0dc
Make all workspace paths string fields (#327)
## Changes

These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify.

**Note:** this is a breaking change. Downstream usage of these fields must be updated.

## Tests

Existing tests pass.
2023-04-12 16:54:36 +02:00
Pieter Noordhuis 31ccebd62a
Store relative path to configuration file for every resource (#322)
## Changes

If a configuration file is located in a subdirectory of the bundle root,
files referenced from that configuration file should be relative to its
configuration file's directory instead of the bundle root.

## Tests

* New tests in `bundle/config/mutator/translate_paths_test.go`.
* Existing tests under `bundle/tests` pass and are augmented to assert
on paths.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-04-12 16:17:13 +02:00
Miles Yucht 946906221d
Delete sync snapshots file when destroying a bundle (#323)
## Changes
This PR changes the files.Delete() mutator to delete the sync snapshots
file on destroy. This ensures that files will be uploaded when the
bundle is uploaded again.

## Tests
- [x] Manual test: Ran `bricks bundle destroy`, observed that the sync
snapshots file was deleted.
2023-04-11 16:57:01 +02:00
Pieter Noordhuis 42d29f92c9
Pass through $HOME when invoking Terraform (#319)
## Changes

This is useful when developing the Databricks Terraform provider where
you keep a local-only build of the provider and refer to it using $HOME
from `~/.terraformrc`, for example like this:

```
plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"
```

## Tests

That $HOME is passed through cannot be tested as is because the
`tfexec.Terraform` struct doesn't expose it through public fields or
methods. What can be tested is a successful run of the initialize
mutator and this is included in this commit.
2023-04-11 13:11:31 +02:00
shreyas-goenka 4871f7bc8a
Add bundle destroy command (#300)
Adds bundle destroy capability to bricks
2023-04-06 12:54:58 +02:00
shreyas-goenka 6feaed4990
Fix host based auth conflicting with DEFAULT profile (#309)
## Changes
Consider the following host based configuration:
```
bundle:
  name: job_with_file_task

workspace:
  host: https://e2-dogfood.staging.cloud.databricks.com/
```

If you have a DEFAULT profile, then this host is ignored. The solution
proposed here is to remove the profile config loader if host is
explicitly specified in the bundle config.

This does come with a cost, namely that a `DATABRICKS_CONFIG_PROFILE`
env var will be ignored, which may go against the unified auth spec

The ideal solution here is probably to make a change to go-SDK to not
select DEFAULT profile if host is not empty

## Tests
<!-- How is this tested? -->
2023-04-05 18:12:11 +02:00
Pieter Noordhuis d7ac265536
Allow use of file library in pipeline (#308)
## Changes

This requires databricks/databricks-sdk-go#359.

## Tests

Tests pass and ran manual verification of deployment with files.
2023-04-05 16:29:42 +02:00
Pieter Noordhuis 4e4c0658db
Interpolate paths for job tasks that reference files (#306)
## Changes

This change also swaps the order of mutators such that interpolation
happens before path translation. This means that it is possible to use
variables (e.g. `${bundle.environment}`) in notebook or file paths.

## Tests

New tests pass and verified manually.
2023-04-05 16:02:17 +02:00
shreyas-goenka 7427ceba6c
Fix output panic (#311)
## Changes

Output now:
```
{
  "run_page_url": "https://e2-dogfood.staging.cloud.databricks.com/?o=6051921418418893#job/6199333392110/run/1088443776202122",
  "task_outputs": {
    "input": null,
    "process": {
      "logs": "[Row(max(id)=9)]\n",
      "logs_truncated": false
    }
  }
}
```

## Tests
<!-- How is this tested? -->
2023-04-05 15:55:24 +02:00
shreyas-goenka 8de7d32ed1
Add readonly bundle tag for internal fields (#302)
This PR adds a bundle: "readonly" struct tag to the json schema
generator. This allows us to skip generating json schema for internal
readonly fields

Tested using unit test
2023-04-04 12:16:07 +02:00
shreyas-goenka ddbb17b0d9
Regenerate generated empty json schema docs (#301)
## Changes
<!-- Summary of your changes that are easy to understand -->

## Tests
<!-- How is this tested? -->
2023-04-04 12:07:30 +02:00
dependabot[bot] 57cf66d3a8
Bump github.com/databricks/databricks-sdk-go from 0.5.0 to 0.6.0 (#299) 2023-04-03 21:33:21 +02:00
Pieter Noordhuis f26806be8f
Set BRICKS_CLI_PATH only if it cannot be derived from $PATH (#298)
## Changes

Related to #237.

Output of `bricks auth env` now doesn't include `BRICKS_CLI_PATH` if it
can be found in $PATH.

## Tests

Verified manually.
2023-04-03 16:23:53 +02:00
shreyas-goenka b4a30c641c
Add progress logging for pipeline runs (#283)
Add progress logging for pipeline runs
2023-03-31 17:04:12 +02:00
Pieter Noordhuis 04e77102c9
Add mutators to pull and push Terraform state (#288)
## Changes

Pull state before deploying and push state after deploying.

Note: the run command was missing mutators to initialize Terraform. This
is necessary if the cache directory is removed between running "deploy"
and "run" (which is valid now that we synchronize state).

## Tests

Manually.
2023-03-30 12:01:09 +02:00
Pieter Noordhuis 0ea0e81c8a
Ignore databricks_permissions resource when loading Terraform state (#291)
## Changes

The databricks_permissions resource may be generated if a bundle
resource includes a `permissions` block. There's no need to incorporate
details from the materialization into the bundle configuration struct.

## Tests

Confirmed that this fixes `bricks bundle run` when dealing with a bundle
with permission configuration.
2023-03-29 21:14:52 +02:00
Pieter Noordhuis 87207bba78
Configure Terraform provider auth through env vars (#290)
## Changes

Auth relied on setting a profile. In this change we enumerate all
configuration properties and export all non-empty ones as a map with
environment variables. We then pass this map to the Terraform execution
wrapper.
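A sketch of this approach with a reduced, hypothetical `authConfig` struct. Only three properties are shown, whereas the change enumerates all configuration properties; the struct and its `env` method are assumptions for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// authConfig is a hypothetical, reduced configuration struct.
type authConfig struct {
	Host    string
	Token   string
	Profile string
}

// env returns a map of environment variables for all non-empty properties.
func (c authConfig) env() map[string]string {
	candidates := map[string]string{
		"DATABRICKS_HOST":           c.Host,
		"DATABRICKS_TOKEN":          c.Token,
		"DATABRICKS_CONFIG_PROFILE": c.Profile,
	}
	out := map[string]string{}
	for name, value := range candidates {
		if value != "" {
			out[name] = value
		}
	}
	return out
}

func main() {
	env := authConfig{Host: "https://example.cloud.databricks.com"}.env()

	// Sort the keys so the output is deterministic.
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, env[k])
	}
}
```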

This results in Terraform using the bundle's authentication
configuration.

This change is needed to make #287 work.

## Tests

Manually.
2023-03-29 20:46:09 +02:00
Pieter Noordhuis cfd32c9602
Try to resolve a profile if only the host is specified (#287)
## Changes

This improves out of the box usability where a user who already
configured a `.databrickscfg` file will be able to reference the
workspace host in their `bundle.yml` and it will automatically pick up
the right profile.

## Tests

* Newly added tests pass.
* Manual testing confirms intended behavior.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-03-29 20:44:19 +02:00
Pieter Noordhuis 8af934bbbb
Function to find the Git repository containing a bundle (#289)
## Changes

Useful functions from #277.

## Tests

Tests pass.
2023-03-29 16:36:35 +02:00
shreyas-goenka 8fd3dccca9
Add progress logs for job runs (#276) 2023-03-29 14:58:09 +02:00
Pieter Noordhuis edd8630f71
Log mutator phase at info level (#272) 2023-03-22 17:02:22 +01:00
Pieter Noordhuis 123a5e15e9
Acquire lock prior to deploy (#270)
Add configuration:

```
bundle:
  lock:
    enabled: true
    force: false
```

The force field can be set by passing the `--force` argument to `bricks
bundle deploy`. Doing so means the deployment lock is acquired even if
it is currently held. This should only be used in exceptional cases
(e.g. a previous deployment has failed to release the lock).
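The effect of the force field can be sketched like this. The `lock` type and `acquire` function are hypothetical and ignore how the lock is actually stored; the error text follows the message shown elsewhere in this log:

```go
package main

import "fmt"

// lock is a hypothetical record of who holds the deployment lock.
type lock struct {
	Holder string
}

// acquire returns the new lock state. If the lock is already held,
// it fails unless force is set, in which case the lock is taken over.
func acquire(current *lock, user string, force bool) (*lock, error) {
	if current != nil && !force {
		return current, fmt.Errorf("deploy lock acquired by %s. Use --force to override", current.Holder)
	}
	return &lock{Holder: user}, nil
}

func main() {
	held := &lock{Holder: "alice"}

	_, err := acquire(held, "bob", false)
	fmt.Println("Error:", err)

	taken, _ := acquire(held, "bob", true)
	fmt.Println("Lock now held by", taken.Holder)
}
```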
2023-03-22 16:37:26 +01:00
Pieter Noordhuis 6850caf2a2
Include mutator name in logging context (#271) 2023-03-22 15:54:10 +01:00
shreyas-goenka bfa20cdec9
Add json tags to output fields (#269)
output now:
```
{
  "run_page_url": "https://adb-309687753508875.15.azuredatabricks.net/?o=309687753508875#job/1077573342009637/run/19099317",
  "task_outputs": {
    "my_notebook_task": {
      "result": "computed results from notebook."
    }
  }
}
```
2023-03-21 18:38:11 +01:00
shreyas-goenka 75d516939b
Error out if notebook file does not exist locally (#261)
Adds check for whether file exists locally

case 1: local (relative) file does not exist
```
    foo:
      name: "[job-output] test-job by shreyas"

      tasks:
        - task_key: my_notebook_task
          existing_cluster_id: ***
          notebook_task:
            notebook_path: "./doesnotexist"
```
output:
```
shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy
Error: notebook ./doesnotexist not found. Error: open /Users/shreyas.goenka/projects/job-output/doesnotexist: no such file or directory
```


case 2: remote (absolute) file does not exist
```
    foo:
      name: "[job-output] test-job by shreyas"

      tasks:
        - task_key: my_notebook_task
          existing_cluster_id: ***
          notebook_task:
            notebook_path: "/Users/shreyas.goenka@databricks.com/doesnotexist"
```

output:
```
shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy
shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo
Error: failed to reach TERMINATED or SKIPPED, got INTERNAL_ERROR: Task my_notebook_task failed with message: Notebook not found: /Users/shreyas.goenka@databricks.com/doesnotexist. This caused all downstream tasks to get skipped.
```

case 3: remote exists
Successful deploy and run
2023-03-21 18:13:16 +01:00
shreyas-goenka 047a189c1e
Add job run output logging (#260)
This PR adds output logging for job runs

Tested using unit tests and manually
2023-03-21 16:25:18 +01:00
shreyas-goenka 4ac2e33def
Throw error when job run is skipped due to max_concurrent_runs (#257)
Tested manually:

Before this change we did not get any errors/logs and silently failed in this
case

```
shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo
Error: run skipped: Skipping this run because the limit of 1 maximum concurrent runs has been reached.
```
2023-03-21 13:17:15 +01:00
Pieter Noordhuis 66ca9ec266
Add permissions block to each resource (#264)
Example:

```yaml
resources:
  jobs:
    my_job:
      name: "[${bundle.environment}] My job"
      permissions:
        - level: CAN_VIEW
          group_name: users
```
2023-03-21 10:58:16 +01:00
Pieter Noordhuis 58563b1ea9
Add resources for mlflow models and experiments (#263)
Manually confirmed that both can be deployed.
2023-03-20 21:28:43 +01:00
Pieter Noordhuis 077ab8b864
Update Terraform provider schema structs (#265)
Generated from provider version 1.13.0.
2023-03-20 17:22:55 +01:00
Pieter Noordhuis ad666ff796
Use new logger throughout codebase (#256) 2023-03-17 15:17:31 +01:00
shreyas-goenka 7faa9dea9b
Use tracker for reference loop tracking (#252)
We incorrectly relied on map key iteration order to print debug trace.
This PR switches over to using the tracker struct to allow more reliable
json schema reference loop detection and logging

This also fixes the failing TestSelfReferenceLoopErrors and
TestCrossReferenceLoopErrors tests
2023-03-16 12:57:57 +01:00
shreyas-goenka 207777849b
Log latest error event on pipeline run fail (#239)
DAB config used to test this:

bundle.yml
```
workspace:
  host: <deco-azure-prod>

bundle:
  name: deco-538

resources:
  pipelines:
    foo:
      name: "[${bundle.name}] log pipeline errors"
      libraries:
        - notebook:
            path: ./myNb.py
      development: true
```

myNb.py
```
# Databricks notebook source
print(1/0)
```

Before:
```
2023/03/09 01:28:44 [INFO] [pipelines.foo] Update available at ***
2023/03/09 01:28:44 [INFO] [pipelines.foo] Update status: CREATED
2023/03/09 01:28:46 [INFO] [pipelines.foo] Update status: INITIALIZING
2023/03/09 01:28:52 [INFO] [pipelines.foo] Update status: FAILED
2023/03/09 01:28:52 [INFO] [pipelines.foo] Update has failed!
Error: update failed
```

Now:
```
2023/03/09 01:29:31 [INFO] [pipelines.foo] Update available at ***
2023/03/09 01:29:31 [INFO] [pipelines.foo] Update status: CREATED
2023/03/09 01:29:33 [INFO] [pipelines.foo] Update status: INITIALIZING
2023/03/09 01:29:40 [INFO] [pipelines.foo] Update status: FAILED
2023/03/09 01:29:40 [INFO] [pipelines.foo] Update has failed!
2023/03/09 01:29:40 [ERROR] [pipelines.foo] Update 27bc77 is FAILED.
trace for most recent exception:
Failed to execute python command for notebook '/Users/shreyas.goenka@databricks.com/.bundle/deco-538/default/files/myNb' with id RunnableCommandId(9070319781942164851) and error AnsiResult(---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<command--1> in <cell line: 1>()
----> 1 print(1/0)

ZeroDivisionError: division by zero,Map(),Map(),List(),List(),Map())
Error: update failed
```
2023-03-16 12:23:46 +01:00
shreyas-goenka c40e428469
skip flaky cross reference test (#251) 2023-03-15 17:09:52 +01:00
shreyas-goenka 92d1dd7e48
skip failing test for now (#249) 2023-03-15 16:57:41 +01:00
shreyas-goenka 18a216bf97
Add openapi descriptions to bundle resources (#229)
This PR:
1. Adds autogeneration of descriptions for `resources` field
2. Autogenerates empty descriptions for any properties in DABs
3. Defines SOPs for how to refresh these descriptions
4. Adds a command to generate this documentation
5. Automatically copies any descriptions over to the `environments`
property

Basically it provides a framework for adding descriptions to the
generated JSON schema

Tested manually and using unit tests
2023-03-15 03:18:51 +01:00
Fabian Jakobs f0c35a2b27
Initialize BRICKS_CLI_PATH and increase default OAuth timeout (#237)
related to https://github.com/databricks/databricks-sdk-go/pull/330
2023-03-08 16:14:24 +01:00
shreyas-goenka f93b541b63
Show detailed error logs for jobs (#209)
PR for how to render errors on console for jobs. 
Here is the bundle used for the logs below:
```
bundle:
  name: deco-438

workspace:
  host: https://adb-309687753508875.15.azuredatabricks.net

resources:
  jobs:
    foo:
      name: "[${bundle.name}][${bundle.environment}] a test notebook"

      tasks:
        - task_key: alpha
          existing_cluster_id: 1109-115254-ox7poobk
          notebook_task:
            notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] invalid notebook"
        - task_key: beta
          existing_cluster_id: 1109-115254-ox7poobk
          notebook_task:
            notebook_path: "/does-not-exist"
        - task_key: gamma
          existing_cluster_id: 1109-115254-ox7poobk
          notebook_task:
            notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] valid notebook"
```

And this is a screenshot of the logs from the console:
<img width="1057" alt="Screenshot 2023-02-17 at 7 12 29 PM"
src="https://user-images.githubusercontent.com/88374338/219744768-ab7f1e79-db8f-466a-ad6d-f2b6f85ed17c.png">

Here are the logs when only task gamma is executed (successfully):
<img width="1059" alt="Screenshot 2023-02-17 at 7 13 04 PM"
src="https://user-images.githubusercontent.com/88374338/219744992-011d8b91-ec1d-44f0-a849-83c81816dd9f.png">


TODO: Investigate more possible job errors, and make sure state for them
is handled in a robust way here
2023-02-20 23:40:14 +01:00
Pieter Noordhuis dd95668474
Complete positional argument to bundle run (#220)
Command completion can be configured through `bricks completion`.
2023-02-20 21:55:06 +01:00
Pieter Noordhuis 9912ee1f92
Materialize glob expansion in configuration struct (#217)
This is needed to figure out which files should adhere to the schema.
2023-02-20 21:01:28 +01:00
Pieter Noordhuis a0ed02281d
Execute file synchronization on deploy (#211)
1. Perform file synchronization on deploy
2. Update notebook file path translation logic to point to the
synchronization target rather than treating the notebook as an artifact
and uploading it separately.
2023-02-20 19:42:55 +01:00
Pieter Noordhuis 414ea4f891
Bump databricks-sdk-go to 0.3.2 (#215) 2023-02-20 16:00:20 +01:00
Pieter Noordhuis 6c93c96bd1
Update deps for internal-only tree (#214)
Fixes dependabot warnings.
2023-02-20 14:30:42 +01:00
Pieter Noordhuis 1715a987cf
Make sync command work in bundle context; reorder args (#207)
Invoke with `bricks sync SRC DST`.

In bundle context `SRC` and `DST` arguments are taken from bundle configuration.

This PR adds `bricks bundle sync` to disambiguate between the two.
Once the VS Code extension is bundle aware they can again be consolidated.
Consolidating them today would regress the VS Code experience if a
`bundle.yml` file is present in the file tree.
2023-02-20 11:33:30 +01:00
shreyas-goenka 0ab2aa1bfa
Make file, artifact and state path optional (#204)
This PR makes bundle name required, and a few fields with defined
defaults optional, to generate a better json schema
2023-02-17 02:49:39 +01:00
Pieter Noordhuis 9a1d908f79
Add function to opportunistically load a bundle (#180)
It is not an error if a bundle cannot be found for this category.
This sets the stage for using bundle configuration in non-bundle
commands.
2023-01-27 16:57:39 +01:00
Pieter Noordhuis 35c3d9fa4e
Add workspace paths (#179)
The workspace root path is a base path for bundle storage. If not
specified, it defaults to `~/.bundle/name/environment`. This default, or
other paths starting with `~` are expanded to the current user's home
directory. The configuration also includes fields for the files path,
artifacts path, and state path. By default, these are nested under the
root path, but can be overridden if needed.
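
The defaulting rules above can be sketched roughly as follows. This is an illustration only: the `Workspace` struct, the `ResolvePaths` function, and the subdirectory names `files`, `artifacts`, and `state` are assumptions made for the example, not names confirmed by this commit.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Workspace holds the four paths described above.
// The type and field names are illustrative, not taken from the codebase.
type Workspace struct {
	Root, Files, Artifacts, State string
}

// ResolvePaths fills in defaults and expands a leading "~" in the root path
// to the given home directory. The subdirectory names used for the nested
// defaults are assumptions for this sketch.
func ResolvePaths(w Workspace, home, name, environment string) Workspace {
	if w.Root == "" {
		w.Root = path.Join("~/.bundle", name, environment)
	}
	if w.Root == "~" || strings.HasPrefix(w.Root, "~/") {
		w.Root = path.Join(home, w.Root[1:])
	}
	if w.Files == "" {
		w.Files = path.Join(w.Root, "files")
	}
	if w.Artifacts == "" {
		w.Artifacts = path.Join(w.Root, "artifacts")
	}
	if w.State == "" {
		w.State = path.Join(w.Root, "state")
	}
	return w
}

func main() {
	// Only the state path is overridden; everything else uses defaults.
	w := ResolvePaths(Workspace{State: "/shared/state"}, "/Users/me", "demo", "dev")
	fmt.Println(w.Root)
	fmt.Println(w.Files)
	fmt.Println(w.State)
}
```

An explicitly set path is left untouched, so an override of a single field does not affect the others.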
2023-01-26 19:55:38 +01:00
shreyas-goenka 83fb89ad3b
Add command for generating JSON schema for DABs bundle config (#171)
In the future we can add a path flag to generate subschemas. This might be
useful depending on how config splits are supported.
2023-01-23 15:00:11 +01:00
shreyas-goenka b3a30166f6
JSON Schema generator for golang types (#167)
This PR contains a struct for generating JSON schemas from Go types and
a struct for injecting documentation into the JSON schema. This will
support autocomplete for DABs.
2023-01-20 16:55:44 +01:00
Pieter Noordhuis 3582037be6
Add nil check for retries.Info.Info (#166) 2023-01-12 18:58:36 +01:00
Pieter Noordhuis 8f4461904b
Define flags for running jobs and pipelines (#146) 2022-12-23 15:17:16 +01:00
Pieter Noordhuis 49aa858b89
Run command must always take a single argument (#156) 2022-12-22 16:19:38 +01:00
Pieter Noordhuis 61ef0ba8c6
Handle nil environment (#154) 2022-12-22 15:31:32 +01:00
Pieter Noordhuis 7f83463ca3
Bump SDK to latest (#151) 2022-12-22 09:46:17 +01:00
Pieter Noordhuis 4026b2cda2
Mutator to convert paths to local notebooks files into artifacts (#144)
This lets you write:
```yaml
libraries:
  - notebook:
      path: ./events.sql
```

Instead of:
```yaml
artifacts:
  events_sql:
    notebook:
      path: ./events.sql

libraries:
  - notebook:
      path: "${artifacts.events_sql.notebook.remote_path}"
```
2022-12-16 14:49:23 +01:00
Pieter Noordhuis 1a9a431b97
No need for nil check on map (#143) 2022-12-15 21:28:27 +01:00
Pieter Noordhuis 24a3b90713
Add "default" flag to environment block (#142)
If the environment is not set through command line argument or
environment variable, the bundle loads either 1) the only environment,
2) the only environment with the default flag set.
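
A minimal sketch of this selection rule, assuming a hypothetical `Environment` struct with a `Default` field and a hypothetical `SelectEnvironment` helper (neither name is confirmed by the commit):

```go
package main

import (
	"errors"
	"fmt"
)

// Environment is a stand-in for the environment block; only the
// "default" flag matters for this sketch.
type Environment struct {
	Default bool
}

// SelectEnvironment returns the environment to load when none was given
// on the command line or through an environment variable.
func SelectEnvironment(envs map[string]*Environment) (string, error) {
	if len(envs) == 0 {
		return "", errors.New("no environments defined")
	}
	// Case 1: there is exactly one environment.
	if len(envs) == 1 {
		for name := range envs {
			return name, nil
		}
	}
	// Case 2: exactly one environment has the default flag set.
	var defaults []string
	for name, env := range envs {
		if env != nil && env.Default {
			defaults = append(defaults, name)
		}
	}
	if len(defaults) != 1 {
		return "", fmt.Errorf("expected exactly one default environment, found %d", len(defaults))
	}
	return defaults[0], nil
}

func main() {
	name, err := SelectEnvironment(map[string]*Environment{
		"development": {Default: true},
		"production":  {},
	})
	fmt.Println(name, err)
}
```

With zero or several flagged environments the sketch returns an error rather than guessing; how the real command reports that case is not stated here.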
2022-12-15 21:28:14 +01:00
Pieter Noordhuis 35243db33c
Automatically install Terraform if needed (#141)
Users can opt out and use the system-installed version with the
following configuration:

```yaml
bundle:
  terraform:
    exec_path: terraform
```

This looks up the binary in `$PATH` and replaces `exec_path` with the resolved path.

If this is not set, the initialize phase will install Terraform in the
bundle's cache directory.
2022-12-15 17:30:33 +01:00
Pieter Noordhuis 32a37c1b83
Use filer.Filer in bundle/deployer/locker (#136)
Summary:
* All remote path arguments for deployer and locker are now relative to
root specified at initialization
* The workspace client is now a struct field so it doesn't have to be
passed around
2022-12-15 17:16:07 +01:00
Pieter Noordhuis b111416fe5
Add `bricks bundle run` command (#134) 2022-12-15 15:12:47 +01:00
Pieter Noordhuis 72e89bf33c
Use pointers to resources in bundle configuration (#140)
Avoid copy-by-value when iterating over these maps.
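
As a general Go illustration of the difference (the `Job` type and both helper functions are made up for this example and are not the bundle's resource types): ranging over a map of struct values yields copies, while ranging over a map of pointers yields the stored elements themselves.

```go
package main

import "fmt"

// Job is a stand-in for a resource struct.
type Job struct {
	Name string
}

// tagValues ranges over struct values. Each j is a copy, so the
// assignment never reaches the element stored in the map.
func tagValues(jobs map[string]Job) {
	for _, j := range jobs {
		j.Name = j.Name + "-tagged"
	}
}

// tagPointers ranges over pointers. Each j points at the stored element,
// so the assignment is visible to the caller and no struct is copied.
func tagPointers(jobs map[string]*Job) {
	for _, j := range jobs {
		j.Name = j.Name + "-tagged"
	}
}

func main() {
	byValue := map[string]Job{"a": {Name: "job"}}
	byPointer := map[string]*Job{"a": {Name: "job"}}

	tagValues(byValue)
	tagPointers(byPointer)

	fmt.Println(byValue["a"].Name)   // job
	fmt.Println(byPointer["a"].Name) // job-tagged
}
```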
2022-12-15 13:00:41 +01:00
Pieter Noordhuis d0bd74c116
Run Go formatting with 1.19 (#137)
See https://tip.golang.org/doc/go1.19#go-doc.
2022-12-14 15:59:47 +01:00
Pieter Noordhuis d713521d63
Convert job task libraries to TF JSON (#132) 2022-12-12 16:36:59 +01:00
Pieter Noordhuis c255bd686a
Define deploy command as sequence of build phases (#129) 2022-12-12 12:49:25 +01:00
Pieter Noordhuis 8640696b4b
Add minimal test for conversion to TF JSON format (#130) 2022-12-12 11:31:28 +01:00
Pieter Noordhuis 94a86972e5
Allow multiple lookup functions for interpolation (#128) 2022-12-12 10:48:52 +01:00
Pieter Noordhuis 3f8e233a18
Function to limit interpolation to specific path (#127)
The new function `IncludeLookupsInPath` is the counterpart to
`ExcludeLookupsInPath`.
2022-12-12 10:30:17 +01:00
Pieter Noordhuis 4f668fc58b
Mutators to work with Terraform (#124)
This includes 3 mutators:
* Interpolate resource references to a TF-compatible format
* Convert resources struct to TF JSON format and write it to disk
* Run TF apply
2022-12-09 08:57:30 +01:00
Pieter Noordhuis ff89c9d06f
Generate equivalent Go types from Terraform provider schema (#122)
It contains:
* `codegen` -- this turns the schema of the Databricks Terraform provider into Go types.
* `schema` -- the output of the above.
2022-12-06 16:26:19 +01:00
shreyas-goenka d9d295f2a9
Implement Terraform state synchronization and deploy (#98)
https://user-images.githubusercontent.com/88374338/203669797-abebf99e-8fa6-4d6e-b57a-abd172d8020d.mov
2022-12-06 00:40:45 +01:00
Pieter Noordhuis d5474c9673
Revert "Rename jobs -> workflows" (#118)
This reverts PR #111.

This reverts commit 230811031f.
2022-12-01 22:39:15 +01:00
Pieter Noordhuis cdc776d89e
Parameterize interpolation function (#117)
By specifying a function typed `LookupFunction` the caller can customize
which path expressions to interpolate and which ones to skip. When we
express dependencies between resources their values are known by
Terraform at deploy time. Therefore, we have to skip interpolation for
`${resources.jobs.my_job.id}` and instead rewrite it to
`${databricks_job.my_job.id}` before passing it along to Terraform.
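
A rough sketch of the idea follows. Only the name `LookupFunction` comes from this commit; its signature, the `Interpolate` helper, and the `TerraformLookup` constructor are assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// LookupFunction resolves a path expression to its replacement text.
// The signature is an assumption for this sketch.
type LookupFunction func(path string) (string, error)

var refPattern = regexp.MustCompile(`\$\{([a-zA-Z0-9_.]+)\}`)

// Interpolate replaces every ${path} in s with the result of lookup(path).
// It returns the first lookup error, leaving failed references in place.
func Interpolate(s string, lookup LookupFunction) (string, error) {
	var firstErr error
	out := refPattern.ReplaceAllStringFunc(s, func(m string) string {
		v, err := lookup(m[2 : len(m)-1])
		if err != nil {
			if firstErr == nil {
				firstErr = err
			}
			return m
		}
		return v
	})
	return out, firstErr
}

// TerraformLookup (hypothetical) defers job references to Terraform by
// rewriting them, and resolves every other path from values.
func TerraformLookup(values map[string]string) LookupFunction {
	const prefix = "resources.jobs."
	return func(path string) (string, error) {
		if strings.HasPrefix(path, prefix) {
			return "${databricks_job." + strings.TrimPrefix(path, prefix) + "}", nil
		}
		v, ok := values[path]
		if !ok {
			return "", fmt.Errorf("no value for %q", path)
		}
		return v, nil
	}
}

func main() {
	lookup := TerraformLookup(map[string]string{"bundle.name": "demo"})
	out, err := Interpolate("${bundle.name} triggers ${resources.jobs.my_job.id}", lookup)
	fmt.Println(out, err)
}
```

Here `${bundle.name}` is resolved immediately, while the job reference is rewritten to `${databricks_job.my_job.id}` and left for Terraform to resolve at deploy time.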
2022-12-01 22:38:49 +01:00
Pieter Noordhuis 34af98a8c3
Mutators to define current user and default artifact path (#112) 2022-12-01 11:17:29 +01:00
Pieter Noordhuis 230811031f
Rename jobs -> workflows (#111) 2022-12-01 09:35:21 +01:00
Pieter Noordhuis c4d63eac70
Rudimentary interpolation support (#108)
Performs interpolation on string fields.

It looks for patterns `${foo.bar}` where `foo.bar` points to a string
field in the configuration data model.

It does not support traversal (e.g. `${foo}` with `foo` equal
to `${bar}`), hence "rudimentary".
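
The single-pass behavior can be illustrated with a small sketch. It uses a flat map in place of the configuration data model, and `interpolateOnce` is a hypothetical name, so treat it as an approximation of the idea rather than the actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

var ref = regexp.MustCompile(`\$\{([a-zA-Z0-9_.]+)\}`)

// interpolateOnce substitutes each ${path} with values[path] in one pass.
// Unknown paths are left untouched, and substituted text is not scanned
// again, so a value that itself contains a reference stays unresolved.
func interpolateOnce(s string, values map[string]string) string {
	return ref.ReplaceAllStringFunc(s, func(m string) string {
		if v, ok := values[m[2:len(m)-1]]; ok {
			return v
		}
		return m
	})
}

func main() {
	values := map[string]string{
		"bar": "hello",
		"foo": "${bar}", // refers to another field
	}
	fmt.Println(interpolateOnce("${bar}", values)) // hello
	fmt.Println(interpolateOnce("${foo}", values)) // ${bar}
}
```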
2022-12-01 09:33:42 +01:00
Pieter Noordhuis 4064a21797
Function to return bundle's cache directory (#109)
Parallel of `project.CacheDir()` introduced in
https://github.com/databricks/bricks/pull/82.
2022-11-30 14:40:41 +01:00
Pieter Noordhuis e1669b0352
Model code artifacts (#107)
This adds:
* Top level "artifacts" configuration key
* Support for notebooks (does language detection and upload)
* Merge of per-environment artifacts (or artifact overrides) into top level
2022-11-30 14:15:22 +01:00
shreyas-goenka 2ebfa5f369
Run unit tests on windows and macos (#103)
Unit tests now run on all three major operating systems.

Some of the changes make the tests pass on Windows, while other tests are
skipped on Windows and macOS for now. This is a temporary measure: we
will incrementally migrate these tests so that unit testing has parity
across all three environments.
2022-11-28 11:34:25 +01:00
Pieter Noordhuis b88b35a510
Move mutator interface to top level bundle package (#105)
While working on artifact upload and workspace interrogation I realized
this mutator interface needs to:
1. Operate at the whole bundle level so it can apply to both
configuration and internal state
2. Include a `context.Context` parameter for a) long running operations
and b) progress reporting

Previous interface:
```
Apply(*config.Root) ([]Mutator, error)
```

New interface:
```
Apply(context.Context, *Bundle) ([]Mutator, error)
```
2022-11-28 10:59:43 +01:00
Pieter Noordhuis 5c916a6fb4
Store specified environment in configuration for reference (#104) 2022-11-28 10:10:13 +01:00
Pieter Noordhuis 8e786d76a9
Update databricks-sdk-go to latest (#102) 2022-11-24 21:41:57 +01:00
Pieter Noordhuis 07f07694a4
Function to return workspace client on bundle.Bundle (#100)
Adds a complementary command to check the identity in the context of a
bundle environment. For example:
```
bricks bundle debug whoami -e development
```
2022-11-23 15:20:03 +01:00
Pieter Noordhuis ab1df558a2
Test that YAML anchors work (#96) 2022-11-21 15:40:27 +01:00
Pieter Noordhuis 3b351d3b00
Add command that writes the materialized bundle configuration to stdout (#95)
Used to inspect the bundle configuration after loading and merging all
files.

Once we add variable interpolation this command could show the result
after interpolation as well.

Each of the mutations to this configuration is observable, so we could
add a mode that writes each of the intermediate versions to disk for
even more fine grained introspection.
2022-11-21 15:39:53 +01:00
Pieter Noordhuis 195eb7f0f9
Add job and pipeline structs (#94) 2022-11-18 11:12:24 +01:00
Pieter Noordhuis e47fa61951
Skeleton for configuration loading and mutation (#92)
Load a tree of configuration files anchored at `bundle.yml` into the
`config.Root` struct.

All mutations (from setting defaults to merging files) are observable
through the `mutator.Mutator` interface.
2022-11-18 10:57:31 +01:00