Commit Graph

118 Commits

Author SHA1 Message Date
Shreyas Goenka 2303f20f95
sync options redefine better 2023-06-02 00:04:18 +02:00
Pieter Noordhuis 349e2aff40
Allow equivalence checking of filer errors to fs errors (#416)
## Changes

The pattern `errors.Is(err, fs.ErrNotExist)` is common to check for an
error type.

Errors can implement `Is(error) bool` with a custom equivalence checker.

## Tests

New asserts all pass in the integration test.
2023-05-31 20:47:00 +02:00
Pieter Noordhuis e4ab455ea1
Don't pass synthesized TMPDIR if not already set (#409)
## Changes

On Unix systems, the default of `/tmp` always works. No need to
synthesize a path for it.

The custom TMPDIR was causing issues when used from GitHub Actions
runners.

## Tests

Confirmed manually this fixes the issue on GitHub Actions runners.
2023-05-26 13:05:30 +02:00
Andrew Nester 6141476ca2
Added support for bundle.Seq, simplified Mutator.Apply interface (#403)
## Changes
Added support for `bundle.Seq`, simplified `Mutator.Apply` interface by
removing list of mutators from return values/

## Tests
1. Ran `cli bundle deploy` and interrupted it with Cmd + C mid execution
so lock is not released
2. Ran `cli bundle deploy` top make sure that CLI is not trying to
release lock when it fail to acquire it
```
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

^C
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-24 12:10:23.050343 +0200 CEST. Use --force to override
```
2023-05-24 14:45:19 +02:00
Andrew Nester 273271bc59
Regenerated internal schema structs based on Terraform provider schemas (#401)
## Changes
Regenerated internal schema structs based on Terraform provider schemas

Allows to use `serverless` flag in bundle config.

## Tests
Ran `cli bundle deploy` with bundle which contains pipeline with
serverless key true
2023-05-23 19:33:24 +02:00
shreyas-goenka c53ad860e6
Create tmp files in the cache dir in terraform command runs (#395)
## Changes
Passes through tmp dir related env vars to the terraform process. Incase
any of them are not set, we assign temp dir inside bundle cache dir as
the location terraform should use.

## Tests
Manually checked that these env vars do override location where
os.CreateTemp files are created
2023-05-23 13:51:15 +02:00
Fabian Jakobs 055e528173
Rename: bricks -> databricks (#393)
## Changes
related to https://github.com/databricks/databricks-vscode/pull/721

## Rename env vars

`BRICKS_CLI_PATH` -> `DATABRICKS_CLI_PATH`
`BRICKS_OUTPUT_FORMAT` -> `DATABRICKS_OUTPUT_FORMAT`
`BRICKS_LOG_FILE` -> `DATABRICKS_LOG_FILE`
`BRICKS_LOG_LEVEL` -> `DATABRICKS_LOG_LEVEL`
`BRICKS_LOG_FORMAT` -> `DATABRICKS_LOG_FORMAT`
`BRICKS_PROGRESS_FORMAT` -> `DATABRICKS_CLI_PROGRESS_FORMAT`
`BRICKS_UPSTREAM` -> `DATABRICKS_CLI_UPSTREAM`
`BRICKS_UPSTREAM_VERSION` -> `DATABRICKS_CLI_UPSTREAM_VERSION`
2023-05-22 16:40:50 +02:00
Pieter Noordhuis 8979ed1394
Fix tests for new repository name (#390) 2023-05-16 19:02:07 +02:00
Pieter Noordhuis 98ebb78c9b
Rename bricks -> databricks (#389)
## Changes

Rename all instances of "bricks" to "databricks".

## Tests

* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
2023-05-16 18:35:39 +02:00
Andrew Nester 180dfc9a40
Added ability for deferred mutator execution (#380)
## Changes
Added `DeferredMutator` and `bundle.Defer` function which allows to
always execute some mutators either in the end of execution chain or
after error occurs in the middle of execution chain.

Usage as follows:

```
deferredMutator := bundle.Defer([]bundle.Mutator{
    lock.Acquire()
    transform.DoSomething(),
    //...
}, []bundle.Mutator{
    lock.Release(),
})
```
In such case `lock.Release()` will always be executed: either when all
operations above succeed or when any of them fails

## Tests
Before the change

```
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-10 16:41:22.902659 +0200 CEST. Use --force to override

```

After the change
```
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy 
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
```
2023-05-16 18:01:50 +02:00
Andrew Nester 33fb0b3c40
Do not truncate local state file when pulling remote changes (#382)
## Changes
When local state file exists it won't be override by remote state file

## Tests
Running `bricks bundle deploy` after state push failed does not override
local state file

Use cases verified:
1. Local state file is newer than remote
2. Local state file is older than remote
3. Local state file does not exist
4. Local state file corrupted
2023-05-16 17:02:33 +02:00
shreyas-goenka dd04875ee9
Add config environment support for variable overriding (#383)
## Changes
Allows to override default value for a variable definition from the
environment block in a bundle config. See bundle.yml for example usage

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-05-15 14:07:18 +02:00
shreyas-goenka c5e940f664
Add support for variables in bundle config (#359)
## Changes
This PR now allows you to define variables in the bundle config and set
them in three ways
1. command line args
2. process environment variable
3. in the bundle config itself

## Tests
manually, unit, and black box tests

---------

Co-authored-by: Miles Yucht <miles@databricks.com>
2023-05-15 11:34:05 +02:00
Andrew Nester 473d2bf503
Improved error message when 'bricks bundle run' is executed before 'bricks bundle deploy' (#378)
## Changes
Improved error message when 'bricks bundle run' is executed before
'bricks bundle deploy'

The error happens when we attempt to load terraform state when it does
not exist.

The best way to check if terraform state actually exists is to call
`terraform show -json` and that's what already happens here

https://github.com/databricks/bricks/compare/main...error-before-deploy#diff-8c50f8c04e568397bc865b7e02d1f4ec5b18379d8d32daddfeb041035d804f5fL28

Absence of `state.Values` indicates that there is no state and likely
bundle was just never deployed.

## Tests
Ran `bricks bundle run test_job` on a new non-deployed bundle.

**Output:**

`Error: terraform show: No state. Did you forget to run 'bricks bundle
deploy'?`

Running `bricks bundle deploy && bricks bundle run test_job` succeeds.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-05-10 11:02:25 +02:00
Andrew Nester 1916bc9d68
Fixed printing the tasks in job output in DAG execution order (#377)
Fixes #259

## Changes
Sort task output in an execution order based on task end time

## Tests
Added `TestTaskJobOutputOrderToString` unit test.
2023-05-08 16:35:47 +02:00
shreyas-goenka 37af3d5c4f
Add omitempty tag to bundle git details (#372)
## Changes
Add omit empty tag to git details. Otherwise this field becomes a
required field in the config json schema

## Tests
Tested by regenerating the json schema and checking that the git field
is now optional
2023-05-01 14:34:12 +02:00
shreyas-goenka 9e16140b6e
Add git config block to bundle config (#356)
## Changes
This config block contains commit, branch and remote_url which will be
automatically loaded if specified in the repo, and can also be specified
by the user

## Tests
Unit and black-box tests
2023-04-26 16:54:36 +02:00
Serge Smertin 9581187c9e
Update to Go SDK v0.8.0 (#351)
## Changes

- Update to Go SDK v0.8.0
- Fix all breaking changes

## Tests

- make test
2023-04-21 10:30:20 +02:00
shreyas-goenka 9b06095e47
Add support for multiple level string variable interpolation (#342)
## Changes
Traverses the variables referred in a depth first manner to resolve
string fields.
Errors out if a cycle is detected

## Tests
Manually and unit/blackbox tests
2023-04-20 01:13:33 +02:00
shreyas-goenka 089bebc92f
Do not print exceptions for non ERROR events (#347)
## Changes
Adds a check to not print exceptions trace for dlt events with a level <
ERROR

## Tests
Unit test
2023-04-19 22:11:05 +02:00
shreyas-goenka 598ad62688
Log mutator messages using progress logger (#312)
This PR uses progress logger to log messages inside mutators
2023-04-18 16:55:06 +02:00
shreyas-goenka d0872b45e2
Log pipeline update errors using progress logger (#338)
## Changes
Logs error message for all exceptions

## Tests
Manually and using unit tests
2023-04-18 15:00:34 +02:00
shreyas-goenka 59eee11989
Log job errors using progress logger (#337)
## Changes
This PR logs job errors using the progress logger

## Tests
Manually
2023-04-18 14:58:20 +02:00
shreyas-goenka 1a7b3eef18
Log job run url using progress logger (#336)
## Changes
Logs the job url using the progress logger

## Tests
Manually
2023-04-18 14:40:45 +02:00
shreyas-goenka 85889dffb1
Move state to event for whether they support inplace progress logging (#339)
## Changes
Adds a IsInplaceSupported() function to the event interface. Any event
that now uses the progress logger has to declare whether they support in
place logging

## Tests
Manually
2023-04-18 14:20:35 +02:00
shreyas-goenka 93d57dd00f
Detect duplicate identifiers in bundle config (#332)
## Changes
This PR adds checks during bundle config load and merge to error out if
there are duplicate keys for resource definitions

## Tests
Using unit tests and manually
2023-04-17 12:21:21 +02:00
Shreyas Goenka eab29603fc
Revert "Log job errors using progress logger"
This reverts commit a2e20f5206.
2023-04-15 15:19:32 +02:00
Shreyas Goenka a2e20f5206
Log job errors using progress logger 2023-04-15 15:18:38 +02:00
shreyas-goenka e8018a7209
Refactor output and progress into separate packages in run (#335)
Tested manually that output and progress logging still works
2023-04-14 14:40:34 +02:00
shreyas-goenka df0293510e
Fixes for pipeline progress logging (#330)
## Changes
1. Events are now printed in chronological order
2. Simplify events rendering by removing update/flow name. This makes it
more consistent with the web UI too
3. Switch to server side filtering on update_id

## Tests
Manually

Happy run:
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
2023-04-12T20:00:22.879Z update_progress INFO "Update e1becc is INITIALIZING."
2023-04-12T20:00:22.906Z update_progress INFO "Update e1becc is SETTING_UP_TABLES."
2023-04-12T20:00:24.496Z update_progress INFO "Update e1becc is RUNNING."
2023-04-12T20:00:24.497Z flow_progress   INFO "Flow 'sales_orders_raw' is QUEUED."
2023-04-12T20:00:24.586Z flow_progress   INFO "Flow 'sales_orders_raw' is STARTING."
2023-04-12T20:00:24.748Z flow_progress   INFO "Flow 'sales_orders_raw' is RUNNING."
2023-04-12T20:00:26.672Z flow_progress   INFO "Flow 'sales_orders_raw' has COMPLETED."
2023-04-12T20:00:27.753Z update_progress INFO "Update e1becc is COMPLETED."
```

Sad run:
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
2023-04-12T20:02:07.764Z update_progress INFO "Update 04b80e is INITIALIZING."
2023-04-12T20:02:07.870Z update_progress ERROR "Update 04b80e is FAILED."
Error: update failed
```
2023-04-14 12:21:44 +02:00
shreyas-goenka 3894d5796d
Add progress logging event for pipeline update URLs (#331)
## Changes
<!-- Summary of your changes that are easy to understand -->
Output now: 
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
The update can be found at https://e2-dogfood.staging.cloud.databricks.com/#joblist/pipelines/1cc605db-daab-4218-b38a-a63030e3eb03/updates/f92f2159-1141-47de-b1e2-1ca854b7238f

2023-04-12T20:41:19.813Z update_progress INFO "Update f92f21 is INITIALIZING."
2023-04-12T20:41:19.841Z update_progress INFO "Update f92f21 is SETTING_UP_TABLES."
2023-04-12T20:41:21.270Z update_progress INFO "Update f92f21 is RUNNING."
2023-04-12T20:41:21.271Z flow_progress   INFO "Flow 'sales_orders_raw' is QUEUED."
2023-04-12T20:41:21.349Z flow_progress   INFO "Flow 'sales_orders_raw' is STARTING."
2023-04-12T20:41:21.480Z flow_progress   INFO "Flow 'sales_orders_raw' is RUNNING."
2023-04-12T20:41:23.493Z flow_progress   INFO "Flow 'sales_orders_raw' has COMPLETED."
2023-04-12T20:41:25.484Z update_progress INFO "Update f92f21 is COMPLETED."
```

## Tests
<!-- How is this tested? -->
2023-04-14 11:11:30 +02:00
shreyas-goenka 417839021b
Add top level docs for bundle json schema (#313)
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
2023-04-12 21:43:53 +02:00
Pieter Noordhuis b388f4a0dc
Make all workspace paths string fields (#327)
## Changes

These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify.

**Note:** this is a breaking change. Downstream usage of these fields must be updated.

## Tests

Existing tests pass.
2023-04-12 16:54:36 +02:00
Pieter Noordhuis 31ccebd62a
Store relative path to configuration file for every resource (#322)
## Changes

If a configuration file is located in a subdirectory of the bundle root,
files referenced from that configuration file should be relative to its
configuration file's directory instead of the bundle root.

## Tests

* New tests in `bundle/config/mutator/translate_paths_test.go`.
* Existing tests under `bundle/tests` pass and are augmented to assert
on paths.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-04-12 16:17:13 +02:00
Miles Yucht 946906221d
Delete sync snapshots file when destroying a bundle (#323)
## Changes
This PR changes the files.Delete() mutator to delete the sync snapshots
file on destroy. This ensures that files will be uploaded when the
bundle is uploaded again.

## Tests
- [x] Manual test: Ran `bricks bundle destroy`, observed that the sync
snapshots file was deleted.
2023-04-11 16:57:01 +02:00
Pieter Noordhuis 42d29f92c9
Pass through $HOME when invoking Terraform (#319)
## Changes

This is useful when developing the Databricks Terraform provider where
you keep a local-only build of the provider and refer to it using $HOME
from `~/.terraformrc`, for example like this:

```
plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"
```

## Tests

That $HOME is passed through cannot be tested as is because the
`tfexec.Terraform` struct doesn't expose it through public fields or
methods. What can be tested is a successful run of the initialize
mutator and this is included in this commit.
2023-04-11 13:11:31 +02:00
shreyas-goenka 4871f7bc8a
Add bundle destroy command (#300)
Adds bundle destroy capability to bricks
2023-04-06 12:54:58 +02:00
shreyas-goenka 6feaed4990
Fix host based auth conflicting with DEFAULT profile (#309)
## Changes
Consider the following host based configuration:
```
bundle:
  name: job_with_file_task

workspace:
  host: https://e2-dogfood.staging.cloud.databricks.com/
```

If you have a DEFAULT profile, then this host is ignored. The solution
proposed here is to remove the profile config loader if host is
explicitly specified in the bundle config.

This does come with a cost, namely that if a `DATABRICKS_CONFIG_PROFILE`
env var will be ignored, which maybe goes against unified auth spec

The ideal solution here is probably to make a change to go-SDK to not
select DEFAULT profile if host is not empty

## Tests
<!-- How is this tested? -->
2023-04-05 18:12:11 +02:00
Pieter Noordhuis d7ac265536
Allow use of file library in pipeline (#308)
## Changes

This requires databricks/databricks-sdk-go#359.

## Tests

Tests pass and ran manual verification of deployment with files.
2023-04-05 16:29:42 +02:00
Pieter Noordhuis 4e4c0658db
Interpolate paths for job tasks that reference files (#306)
## Changes

This change also swaps the order of mutators such that interpolation
happens before path translation. This means that is is possible to use
variables (e.g. `${bundle.environment}`) in notebook or file paths.

## Tests

New tests pass and verified manually.
2023-04-05 16:02:17 +02:00
shreyas-goenka 7427ceba6c
Fix output panic (#311)
## Changes
<!-- Summary of your changes that are easy to understand -->

Output now:
```
{
  "run_page_url": "https://e2-dogfood.staging.cloud.databricks.com/?o=6051921418418893#job/6199333392110/run/1088443776202122",
  "task_outputs": {
    "input": null,
    "process": {
      "logs": "[Row(max(id)=9)]\n",
      "logs_truncated": false
    }
  }
}
```

## Tests
<!-- How is this tested? -->
2023-04-05 15:55:24 +02:00
shreyas-goenka 8de7d32ed1
Add readonly bundle tag for internal fields (#302)
This PR adds a bundle: "readonly" struct tag to the json schema
generator. This allows us to skip generating json schema for internal
readonly fields

Tested using unit test
2023-04-04 12:16:07 +02:00
shreyas-goenka ddbb17b0d9
Regenerate generated empty json schema docs (#301)
## Changes
<!-- Summary of your changes that are easy to understand -->

## Tests
<!-- How is this tested? -->
2023-04-04 12:07:30 +02:00
dependabot[bot] 57cf66d3a8
Bump github.com/databricks/databricks-sdk-go from 0.5.0 to 0.6.0 (#299) 2023-04-03 21:33:21 +02:00
Pieter Noordhuis f26806be8f
Set BRICKS_CLI_PATH only if it cannot be derived from $PATH (#298)
## Changes

Related to #237.

Output of `bricks auth env` now doesn't include `BRICKS_CLI_PATH` if it
can be found in $PATH.

## Tests

Verified manually.
2023-04-03 16:23:53 +02:00
shreyas-goenka b4a30c641c
Add progress logging for pipeline runs (#283)
Add progress logging for pipeline runs
2023-03-31 17:04:12 +02:00
Pieter Noordhuis 04e77102c9
Add mutators to pull and push Terraform state (#288)
## Changes

Pull state before deploying and push state after deploying.

Note: the run command was missing mutators to initialize Terraform. This
is necessary if the cache directory is removed between running "deploy"
and "run" (which is valid now that we synchronize state).

## Tests

Manually.
2023-03-30 12:01:09 +02:00
Pieter Noordhuis 0ea0e81c8a
Ignore databricks_permissions resource when loading Terraform state (#291)
## Changes

The databricks_permissions resource may be generated if a bundle
resource includes a `permissions` block. There's no need to incorporate
details from the materialization into the bundle configuration struct.

## Tests

Confirmed that this fixes `bricks bundle run` when dealing with a bundle
with permission configuration.
2023-03-29 21:14:52 +02:00
Pieter Noordhuis 87207bba78
Configure Terraform provider auth through env vars (#290)
## Changes

Auth relied on setting a profile. In this change we enumerate all
configuration properties and export all non-empty ones as a map with
environment variables. We then pass this map to the Terraform execution
wrapper.

This results in Terraform using the bundle's authentication
configuration.

This change is needed to make #287 work.

## Tests

Manually.
2023-03-29 20:46:09 +02:00
Pieter Noordhuis cfd32c9602
Try to resolve a profile if only the host is specified (#287)
## Changes

This improves out of the box usability where a user who already
configured a `.databrickscfg` file will be able to reference the
workspace host in their `bundle.yml` and it will automatically pick up
the right profile.

## Tests

* Newly added tests pass.
* Manual testing confirms intended behavior.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-03-29 20:44:19 +02:00