## Changes
- Instead of constructing chains of mutators and then executing them,
execute them directly.
- Remove functionality related to chain-building: Seq, If, Defer,
newPhase, logString.
- Phases become functions that apply the changes directly rather than
construct mutator chains that will be called later.
- Add a helper ApplySeq to call multiple mutators, use it where
Apply+Seq were used before.
This is intended to be a refactoring without functional changes, but
there are a few behaviour changes:
- Since defer() is used to call unlock instead of bundle.Defer()
unlocking will now happen even in case of panics.
- In --debug, the phase names are are still logged once before start of
the phase but each entry no longer has 'seq' or phase name in it.
- The message "Deployment complete!" was printed even if
terraform.Apply() mutator had an error. It no longer does that.
## Motivation
The use of the chains was necessary when mutators were returning a list
of other mutators instead of calling them directly. But that has since
been removed, so now the chain machinery have no purpose anymore.
Use of direct functions simplifies the logic and makes bugs more
apparent and easy to fix.
Other improvements that this unlocks:
- Simpler stacktraces/debugging (breakpoints).
- Use of functions with narrowly scoped API: instead of mutators that
receive full bundle config, we can use focused functions that only deal
with sections they care about prepareGitSettings(currentGitSection) ->
updatedGitSection. This makes the data flow more apparent.
- Parallel computations across mutators (within phase): launch
goroutines fetching data from APIs at the beggining, process them once
they are ready.
## Tests
Existing tests.
## Changes
This PR:
1. Incrementally improves the error messages shown to the user when the
volume they are referring to in `workspace.artifact_path` does not
exist.
2. Performs this validation in both `bundle validate` and `bundle
deploy` compared to before on just deployments.
3. It runs "fast" validations on `bundle deploy`, which earlier were
only run on `bundle validate`.
## Tests
Unit tests and manually. Also, existing integration tests provide
coverage (`TestUploadArtifactToVolumeNotYetDeployed`,
`TestUploadArtifactFileToVolumeThatDoesNotExist`)
Examples:
```
.venv➜ bundle-playground git:(master) ✗ cli bundle validate
Error: cannot access volume capital.whatever.my_volume: User does not have READ VOLUME on Volume 'capital.whatever.my_volume'.
at workspace.artifact_path
in databricks.yml:7:18
```
and
```
.venv➜ bundle-playground git:(master) ✗ cli bundle validate
Error: volume capital.whatever.foobar does not exist
at workspace.artifact_path
resources.volumes.foo
in databricks.yml:7:18
databricks.yml:12:7
You are using a volume in your artifact_path that is managed by
this bundle but which has not been deployed yet. Please first deploy
the volume using 'bundle deploy' and then switch over to using it in
the artifact_path.
```
## Changes
This PR adds support for UC volumes to DABs.
### Can I use a UC volume managed by DABs in `artifact_path`?
Yes, but we require the volume to exist before being referenced in
`artifact_path`. Otherwise you'll see an error that the volume does not
exist. For this case, this PR also adds a warning if we detect that the
UC volume is defined in the DAB itself, which informs the user to deploy
the UC volume in a separate deployment first before using it in
`artifact_path`.
We cannot create the UC volume and then upload the artifacts to it in
the same `bundle deploy` because `bundle deploy` always uploads the
artifacts to `artifact_path` before materializing any resources defined
in the bundle. Supporting this in a single deployment requires us to
migrate away from our dependency on the Databricks Terraform provider to
manage the CRUD lifecycle of DABs resources.
### Why do we not support `preset.name_prefix` for UC volumes?
UC volumes will not have a `dev_shreyas_goenka` prefix added in `mode:
development`. Configuring `presets.name_prefix` will be a no-op for UC
volumes. We have decided not to support prefixing for UC resources. This
is because:
1. UC provides its own namespace hierarchy that is independent of DABs.
2. Users can always manually use `${workspace.current_user.short_name}`
to configure the prefixes manually.
Customers often manually set up a UC hierarchy for dev and prod,
including a schema or catalog per developer. Thus, it's often
unnecessary for us to add prefixing in `mode: development` by default
for UC resources.
In retrospect, supporting prefixing for UC schemas and registered models
was a mistake and will be removed in a future release of DABs.
## Tests
Unit, integration test, and manually.
### Manual Testing cases:
1. UC volume does not exist:
```
➜ bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/my_volume that is configured in the artifact_path: Not Found
```
2. UC Volume does not exist, but is defined in the DAB
```
➜ bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/managed_by_dab that is configured in the artifact_path: Not Found
Warning: You might be using a UC volume in your artifact_path that is managed by this bundle but which has not been deployed yet. Please deploy the UC volume in a separate bundle deploy before using it in the artifact_path.
at resources.volumes.bar
in databricks.yml:24:7
```
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Library glob expansion happens during deployment. Before that, all
entries that refer to local paths in resource definitions are made
relative to the _sync root_. Before #1694, they were made relative to
the _bundle root_. This PR didn't update the library glob expansion code
to use the sync root path.
If you were using the sync paths setting with library globs, the CLI
would fail to expand the globs because the code was using the wrong path
to anchor those globs.
This change fixes the issue.
## Tests
Manually confirmed that this fixes the issue reported in #1755.
## Changes
Previously for all the libraries referenced in configuration DABs made
sure that there is corresponding artifact section.
But this is not really necessary and flexible, because local libraries
might be built outside of dabs context.
It also created difficult to follow logic in code where we back
referenced libraries to artifacts which was difficult to fllow
This PR does 3 things:
1. Allows all local libraries referenced in DABs config to be uploaded
to remote
2. Simplifies upload and glob references expand logic by doing this in
single place
3. Speed things up by uploading library only once and doing this in
parallel
## Tests
Added unit + integration tests + made sure that change is backward
compatible (no changes in existing tests)
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>