Commit Graph

129 Commits

Author SHA1 Message Date
Pieter Noordhuis ad8183d7a9
Bump Go SDK to v0.12.0 (#540)
## Changes

* Regenerate CLI commands
* Ignore `account-access-control-proxy` (see #505)

## Tests

Unit and integration tests pass.
2023-07-03 11:46:45 +02:00
Pieter Noordhuis cf92698eb3
Application -> Asset (#529) 2023-06-27 01:31:20 +02:00
stikkireddy 3c1e69a064
Update variable regex to support hyphens (#503)
## Changes

Modified interpolation logic to use:
`\$\{([a-zA-Z]+([-_]*[a-zA-Z0-9]+)*(\.[a-zA-Z]+([-_]*[a-zA-Z0-9]+)*)*)\}`

**Edit**: Suggested by @pietern
`\$\{([a-zA-Z]+([-_]?[a-zA-Z0-9]+)*(\.[a-zA-Z]+([-_]?[a-zA-Z0-9]+)*)*)\}`
to be more selective and not allow consequent hyphens or underscores to
make the keys more readable.

Explanation:
1. All interpolation starts with `${` and ends with `}`
2. All interpolated locations are split by by `.`
3. All sections are expected to start with a alphabet `[a-zA-Z]`; no
numbers, hyphens or underscores.
4. All sections are expected to end with an alphanumeric `[a-zA-Z0-9]`
no hyphens or underscores

This change allows the current interpolation to be more permissive.

**Note** it does break backwards compatibility because `[a-zA-Z] !=
[\w]`. `\w` includes alphanumeric and underscores. `\w = [a-zA-Z0-9_]`

## Tests
There are two tests with examples of valid and invalid interpolation and
a test to validate expansion.
2023-06-23 12:56:54 +02:00
Gleb Kanterov ccbcccd929
Add DATABRICKS_BUNDLE_TMP env variable (#462)
## Changes
Add DATABRICKS_BUNDLE_TMP env variable. It allows using a temporary
directory instead of writing to `$CWD/.databricks/bundle`

## Tests
I added unit tests

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-06-21 09:53:54 +02:00
dependabot[bot] 7fb34e4767
Bump github.com/databricks/databricks-sdk-go from 0.9.1-0.20230614092458-b5bbc1c8dabb to 0.10.0 (#497) 2023-06-21 07:43:07 +00:00
stikkireddy ddeb7487b3
Update Terraform provider schema structs (#504)
## Changes

Generated from provider version 1.19.0.

## Tests

Ran the existing unit tests and they seemed to have checked out.
2023-06-21 08:40:11 +02:00
shreyas-goenka 5d036ab6b8
Fix locker unlock for destroy (#492)
## Changes
Adds ability for allowing unlock to succeed even if the deploy file is
missing.
 
## Tests
Using integration tests and manually
2023-06-19 15:57:25 +02:00
shreyas-goenka de47cf19f1
Use better error assertions and clean up locker API (#490)
## Changes
Some cleanup work

## Tests
Locker integration test passes
2023-06-16 16:29:04 +02:00
Pieter Noordhuis 1875908b59
Pass through proxy related environment variables (#465)
## Changes

If set on the host, we must pass them through to Terraform.

## Tests

Unit tests pass.
2023-06-14 21:58:26 +02:00
Serge Smertin 2aa61a7c1b
Update with the latest Go SDK (#457)
## Changes
- removed deprecated methods
- regenerated with the latest OpenAPI spec
- picked up the latest go SDK version

## Tests
`make test`
2023-06-12 14:23:21 +02:00
Pieter Noordhuis 894d25e434
Check for nil environment before accessing it (#453) 2023-06-08 20:55:49 +00:00
stikkireddy 402fcdd62c
Skip path translation of job task for jobs with a Git source (#404)
## Changes

Added skipping of translating paths for notebook path in notebook tasks
and python file path in spark python tasks if the git source is not null.

Resolves: #402

## Tests

There is a unit test and also tested with a sample bundle:

```
resources:
  jobs:
    demo:
      git_source:
        git_branch: master
        git_provider: github
        git_url: https://github.com/test/dummy
   ....
```

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-06-07 12:34:59 +02:00
Pieter Noordhuis 349e2aff40
Allow equivalence checking of filer errors to fs errors (#416)
## Changes

The pattern `errors.Is(err, fs.ErrNotExist)` is common to check for an
error type.

Errors can implement `Is(error) bool` with a custom equivalence checker.

## Tests

New asserts all pass in the integration test.
2023-05-31 20:47:00 +02:00
Pieter Noordhuis e4ab455ea1
Don't pass synthesized TMPDIR if not already set (#409)
## Changes

On Unix systems, the default of `/tmp` always works. No need to
synthesize a path for it.

The custom TMPDIR was causing issues when used from GitHub Actions
runners.

## Tests

Confirmed manually this fixes the issue on GitHub Actions runners.
2023-05-26 13:05:30 +02:00
Andrew Nester 6141476ca2
Added support for bundle.Seq, simplified Mutator.Apply interface (#403)
## Changes
Added support for `bundle.Seq`, simplified `Mutator.Apply` interface by
removing list of mutators from return values/

## Tests
1. Ran `cli bundle deploy` and interrupted it with Cmd + C mid execution
so lock is not released
2. Ran `cli bundle deploy` top make sure that CLI is not trying to
release lock when it fail to acquire it
```
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

^C
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-24 12:10:23.050343 +0200 CEST. Use --force to override
```
2023-05-24 14:45:19 +02:00
Andrew Nester 273271bc59
Regenerated internal schema structs based on Terraform provider schemas (#401)
## Changes
Regenerated internal schema structs based on Terraform provider schemas

Allows to use `serverless` flag in bundle config.

## Tests
Ran `cli bundle deploy` with bundle which contains pipeline with
serverless key true
2023-05-23 19:33:24 +02:00
shreyas-goenka c53ad860e6
Create tmp files in the cache dir in terraform command runs (#395)
## Changes
Passes through tmp dir related env vars to the terraform process. Incase
any of them are not set, we assign temp dir inside bundle cache dir as
the location terraform should use.

## Tests
Manually checked that these env vars do override location where
os.CreateTemp files are created
2023-05-23 13:51:15 +02:00
Fabian Jakobs 055e528173
Rename: bricks -> databricks (#393)
## Changes
related to https://github.com/databricks/databricks-vscode/pull/721

## Rename env vars

`BRICKS_CLI_PATH` -> `DATABRICKS_CLI_PATH`
`BRICKS_OUTPUT_FORMAT` -> `DATABRICKS_OUTPUT_FORMAT`
`BRICKS_LOG_FILE` -> `DATABRICKS_LOG_FILE`
`BRICKS_LOG_LEVEL` -> `DATABRICKS_LOG_LEVEL`
`BRICKS_LOG_FORMAT` -> `DATABRICKS_LOG_FORMAT`
`BRICKS_PROGRESS_FORMAT` -> `DATABRICKS_CLI_PROGRESS_FORMAT`
`BRICKS_UPSTREAM` -> `DATABRICKS_CLI_UPSTREAM`
`BRICKS_UPSTREAM_VERSION` -> `DATABRICKS_CLI_UPSTREAM_VERSION`
2023-05-22 16:40:50 +02:00
Pieter Noordhuis 8979ed1394
Fix tests for new repository name (#390) 2023-05-16 19:02:07 +02:00
Pieter Noordhuis 98ebb78c9b
Rename bricks -> databricks (#389)
## Changes

Rename all instances of "bricks" to "databricks".

## Tests

* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
2023-05-16 18:35:39 +02:00
Andrew Nester 180dfc9a40
Added ability for deferred mutator execution (#380)
## Changes
Added `DeferredMutator` and `bundle.Defer` function which allows to
always execute some mutators either in the end of execution chain or
after error occurs in the middle of execution chain.

Usage as follows:

```
deferredMutator := bundle.Defer([]bundle.Mutator{
    lock.Acquire()
    transform.DoSomething(),
    //...
}, []bundle.Mutator{
    lock.Release(),
})
```
In such case `lock.Release()` will always be executed: either when all
operations above succeed or when any of them fails

## Tests
Before the change

```
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-10 16:41:22.902659 +0200 CEST. Use --force to override

```

After the change
```
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy 
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!

Error: terraform not initialized
```
2023-05-16 18:01:50 +02:00
Andrew Nester 33fb0b3c40
Do not truncate local state file when pulling remote changes (#382)
## Changes
When local state file exists it won't be override by remote state file

## Tests
Running `bricks bundle deploy` after state push failed does not override
local state file

Use cases verified:
1. Local state file is newer than remote
2. Local state file is older than remote
3. Local state file does not exist
4. Local state file corrupted
2023-05-16 17:02:33 +02:00
shreyas-goenka dd04875ee9
Add config environment support for variable overriding (#383)
## Changes
Allows to override default value for a variable definition from the
environment block in a bundle config. See bundle.yml for example usage

## Tests
Unit tests

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-05-15 14:07:18 +02:00
shreyas-goenka c5e940f664
Add support for variables in bundle config (#359)
## Changes
This PR now allows you to define variables in the bundle config and set
them in three ways
1. command line args
2. process environment variable
3. in the bundle config itself

## Tests
manually, unit, and black box tests

---------

Co-authored-by: Miles Yucht <miles@databricks.com>
2023-05-15 11:34:05 +02:00
Andrew Nester 473d2bf503
Improved error message when 'bricks bundle run' is executed before 'bricks bundle deploy' (#378)
## Changes
Improved error message when 'bricks bundle run' is executed before
'bricks bundle deploy'

The error happens when we attempt to load terraform state when it does
not exist.

The best way to check if terraform state actually exists is to call
`terraform show -json` and that's what already happens here

https://github.com/databricks/bricks/compare/main...error-before-deploy#diff-8c50f8c04e568397bc865b7e02d1f4ec5b18379d8d32daddfeb041035d804f5fL28

Absence of `state.Values` indicates that there is no state and likely
bundle was just never deployed.

## Tests
Ran `bricks bundle run test_job` on a new non-deployed bundle.

**Output:**

`Error: terraform show: No state. Did you forget to run 'bricks bundle
deploy'?`

Running `bricks bundle deploy && bricks bundle run test_job` succeeds.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-05-10 11:02:25 +02:00
Andrew Nester 1916bc9d68
Fixed printing the tasks in job output in DAG execution order (#377)
Fixes #259

## Changes
Sort task output in an execution order based on task end time

## Tests
Added `TestTaskJobOutputOrderToString` unit test.
2023-05-08 16:35:47 +02:00
shreyas-goenka 37af3d5c4f
Add omitempty tag to bundle git details (#372)
## Changes
Add omit empty tag to git details. Otherwise this field becomes a
required field in the config json schema

## Tests
Tested by regenerating the json schema and checking that the git field
is now optional
2023-05-01 14:34:12 +02:00
shreyas-goenka 9e16140b6e
Add git config block to bundle config (#356)
## Changes
This config block contains commit, branch and remote_url which will be
automatically loaded if specified in the repo, and can also be specified
by the user

## Tests
Unit and black-box tests
2023-04-26 16:54:36 +02:00
Serge Smertin 9581187c9e
Update to Go SDK v0.8.0 (#351)
## Changes

- Update to Go SDK v0.8.0
- Fix all breaking changes

## Tests

- make test
2023-04-21 10:30:20 +02:00
shreyas-goenka 9b06095e47
Add support for multiple level string variable interpolation (#342)
## Changes
Traverses the variables referred in a depth first manner to resolve
string fields.
Errors out if a cycle is detected

## Tests
Manually and unit/blackbox tests
2023-04-20 01:13:33 +02:00
shreyas-goenka 089bebc92f
Do not print exceptions for non ERROR events (#347)
## Changes
Adds a check to not print exceptions trace for dlt events with a level <
ERROR

## Tests
Unit test
2023-04-19 22:11:05 +02:00
shreyas-goenka 598ad62688
Log mutator messages using progress logger (#312)
This PR uses progress logger to log messages inside mutators
2023-04-18 16:55:06 +02:00
shreyas-goenka d0872b45e2
Log pipeline update errors using progress logger (#338)
## Changes
Logs error message for all exceptions

## Tests
Manually and using unit tests
2023-04-18 15:00:34 +02:00
shreyas-goenka 59eee11989
Log job errors using progress logger (#337)
## Changes
This PR logs job errors using the progress logger

## Tests
Manually
2023-04-18 14:58:20 +02:00
shreyas-goenka 1a7b3eef18
Log job run url using progress logger (#336)
## Changes
Logs the job url using the progress logger

## Tests
Manually
2023-04-18 14:40:45 +02:00
shreyas-goenka 85889dffb1
Move state to event for whether they support inplace progress logging (#339)
## Changes
Adds a IsInplaceSupported() function to the event interface. Any event
that now uses the progress logger has to declare whether they support in
place logging

## Tests
Manually
2023-04-18 14:20:35 +02:00
shreyas-goenka 93d57dd00f
Detect duplicate identifiers in bundle config (#332)
## Changes
This PR adds checks during bundle config load and merge to error out if
there are duplicate keys for resource definitions

## Tests
Using unit tests and manually
2023-04-17 12:21:21 +02:00
Shreyas Goenka eab29603fc
Revert "Log job errors using progress logger"
This reverts commit a2e20f5206.
2023-04-15 15:19:32 +02:00
Shreyas Goenka a2e20f5206
Log job errors using progress logger 2023-04-15 15:18:38 +02:00
shreyas-goenka e8018a7209
Refactor output and progress into separate packages in run (#335)
Tested manually that output and progress logging still works
2023-04-14 14:40:34 +02:00
shreyas-goenka df0293510e
Fixes for pipeline progress logging (#330)
## Changes
1. Events are now printed in chronological order
2. Simplify events rendering by removing update/flow name. This makes it
more consistent with the web UI too
3. Switch to server side filtering on update_id

## Tests
Manually

Happy run:
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
2023-04-12T20:00:22.879Z update_progress INFO "Update e1becc is INITIALIZING."
2023-04-12T20:00:22.906Z update_progress INFO "Update e1becc is SETTING_UP_TABLES."
2023-04-12T20:00:24.496Z update_progress INFO "Update e1becc is RUNNING."
2023-04-12T20:00:24.497Z flow_progress   INFO "Flow 'sales_orders_raw' is QUEUED."
2023-04-12T20:00:24.586Z flow_progress   INFO "Flow 'sales_orders_raw' is STARTING."
2023-04-12T20:00:24.748Z flow_progress   INFO "Flow 'sales_orders_raw' is RUNNING."
2023-04-12T20:00:26.672Z flow_progress   INFO "Flow 'sales_orders_raw' has COMPLETED."
2023-04-12T20:00:27.753Z update_progress INFO "Update e1becc is COMPLETED."
```

Sad run:
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
2023-04-12T20:02:07.764Z update_progress INFO "Update 04b80e is INITIALIZING."
2023-04-12T20:02:07.870Z update_progress ERROR "Update 04b80e is FAILED."
Error: update failed
```
2023-04-14 12:21:44 +02:00
shreyas-goenka 3894d5796d
Add progress logging event for pipeline update URLs (#331)
## Changes
<!-- Summary of your changes that are easy to understand -->
Output now: 
```
shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo
The update can be found at https://e2-dogfood.staging.cloud.databricks.com/#joblist/pipelines/1cc605db-daab-4218-b38a-a63030e3eb03/updates/f92f2159-1141-47de-b1e2-1ca854b7238f

2023-04-12T20:41:19.813Z update_progress INFO "Update f92f21 is INITIALIZING."
2023-04-12T20:41:19.841Z update_progress INFO "Update f92f21 is SETTING_UP_TABLES."
2023-04-12T20:41:21.270Z update_progress INFO "Update f92f21 is RUNNING."
2023-04-12T20:41:21.271Z flow_progress   INFO "Flow 'sales_orders_raw' is QUEUED."
2023-04-12T20:41:21.349Z flow_progress   INFO "Flow 'sales_orders_raw' is STARTING."
2023-04-12T20:41:21.480Z flow_progress   INFO "Flow 'sales_orders_raw' is RUNNING."
2023-04-12T20:41:23.493Z flow_progress   INFO "Flow 'sales_orders_raw' has COMPLETED."
2023-04-12T20:41:25.484Z update_progress INFO "Update f92f21 is COMPLETED."
```

## Tests
<!-- How is this tested? -->
2023-04-14 11:11:30 +02:00
shreyas-goenka 417839021b
Add top level docs for bundle json schema (#313)
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
2023-04-12 21:43:53 +02:00
Pieter Noordhuis b388f4a0dc
Make all workspace paths string fields (#327)
## Changes

These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify.

**Note:** this is a breaking change. Downstream usage of these fields must be updated.

## Tests

Existing tests pass.
2023-04-12 16:54:36 +02:00
Pieter Noordhuis 31ccebd62a
Store relative path to configuration file for every resource (#322)
## Changes

If a configuration file is located in a subdirectory of the bundle root,
files referenced from that configuration file should be relative to its
configuration file's directory instead of the bundle root.

## Tests

* New tests in `bundle/config/mutator/translate_paths_test.go`.
* Existing tests under `bundle/tests` pass and are augmented to assert
on paths.

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-04-12 16:17:13 +02:00
Miles Yucht 946906221d
Delete sync snapshots file when destroying a bundle (#323)
## Changes
This PR changes the files.Delete() mutator to delete the sync snapshots
file on destroy. This ensures that files will be uploaded when the
bundle is uploaded again.

## Tests
- [x] Manual test: Ran `bricks bundle destroy`, observed that the sync
snapshots file was deleted.
2023-04-11 16:57:01 +02:00
Pieter Noordhuis 42d29f92c9
Pass through $HOME when invoking Terraform (#319)
## Changes

This is useful when developing the Databricks Terraform provider where
you keep a local-only build of the provider and refer to it using $HOME
from `~/.terraformrc`, for example like this:

```
plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"
```

## Tests

That $HOME is passed through cannot be tested as is because the
`tfexec.Terraform` struct doesn't expose it through public fields or
methods. What can be tested is a successful run of the initialize
mutator and this is included in this commit.
2023-04-11 13:11:31 +02:00
shreyas-goenka 4871f7bc8a
Add bundle destroy command (#300)
Adds bundle destroy capability to bricks
2023-04-06 12:54:58 +02:00
shreyas-goenka 6feaed4990
Fix host based auth conflicting with DEFAULT profile (#309)
## Changes
Consider the following host based configuration:
```
bundle:
  name: job_with_file_task

workspace:
  host: https://e2-dogfood.staging.cloud.databricks.com/
```

If you have a DEFAULT profile, then this host is ignored. The solution
proposed here is to remove the profile config loader if host is
explicitly specified in the bundle config.

This does come with a cost, namely that if a `DATABRICKS_CONFIG_PROFILE`
env var will be ignored, which maybe goes against unified auth spec

The ideal solution here is probably to make a change to go-SDK to not
select DEFAULT profile if host is not empty

## Tests
<!-- How is this tested? -->
2023-04-05 18:12:11 +02:00
Pieter Noordhuis d7ac265536
Allow use of file library in pipeline (#308)
## Changes

This requires databricks/databricks-sdk-go#359.

## Tests

Tests pass and ran manual verification of deployment with files.
2023-04-05 16:29:42 +02:00