## Changes
Make `pydabs/venv_path` optional. When not specified, CLI detects the
Python interpreter using `python.DetectExecutable`, the same way as for
`artifacts`. `python.DetectExecutable` works correctly if a virtual
environment is activated or `python3` is available on PATH through other
means.
Extract the venv detection code from PyDABs into `libs/python/detect`.
This code will be used when we implement the `python/venv_path` section
in `databricks.yml`.
## Tests
Unit tests and manually
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
## Changes
Some diagnostics can have multiple paths associated with them. For
instance, ensuring that unique resource keys are used across all
resources. This PR extends `diag.Diagnostic` to accept multiple paths.
This PR is symmetrical to
https://github.com/databricks/cli/pull/1610/files
## Tests
Unit tests
## Changes
This PR changes `diag.Diagnostics` to allow including multiple locations
associated with the diagnostic message. The diagnostics that now return
multiple locations with this PR are:
1. Warning for unknown keys in config.
2. Use of experimental.run_as
3. Accidental sync.exludes that exclude all files.
## Tests
Existing unit tests pass. New unit test case to assert on error message
when multiple locations are included.
Example output:
```
➜ bundle-playground-2 ~/cli2/cli/cli bundle validate
Warning: You are using the legacy mode of run_as. The support for this mode is experimental and might be removed in a future release of the CLI. In order to run the DLT pipelines in your DAB as the run_as user this mode changes the owners of the pipelines to the run_as identity, which requires the user deploying the bundle to be a workspace admin, and also a Metastore admin if the pipeline target is in UC.
at experimental.use_legacy_run_as
in resources.yml:10:22
databricks.yml:13:22
Name: fix run_if
Target: default
Workspace:
User: shreyas.goenka@databricks.com
Path: /Users/shreyas.goenka@databricks.com/.bundle/fix run_if/default
Found 1 warning
```
## Changes
This PR changes the location metadata associated with a `dyn.Value` to a
slice of locations. This will allow us to keep track of location
metadata across merges and overrides.
The convention is to treat the first location in the slice as the
primary location. Also, the semantics are the same as before if there's
only one location associated with a value, that is:
1. For complex values (maps, sequences) the location of the v1 is
primary in Merge(v1, v2)
2. For primitive values the location of v2 is primary in Merge(v1, v2)
## Tests
Modifying existing merge unit tests. Other existing unit tests and
integration tests pass.
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
PyDABs output can omit empty sequences/mappings because we don't track
them as optional. There is no semantic difference between empty and
missing, which makes omitting correct. CLI detects that we falsely
modify input resources by deleting all empty collections.
To handle that, we extend `dyn.Override` to allow visitors to ignore
certain deletes. If we see that an empty sequence or mapping is deleted,
we revert such delete.
## Tests
Unit tests
---------
Co-authored-by: Pieter Noordhuis <pcnoordhuis@gmail.com>
## Changes
Allow PyDABs to report `dyn.Diagnostics` by writing to
`diagnostics.json` supplied as an argument, similar to `input.json` and
`output.json`
Such errors are not yet properly printed in `databricks bundle
validate`, which will be fixed in a follow-up PR.
## Tests
Unit tests
## Changes
Replace stdin/stdout with files in `PythonMutator`. Files are created in
a temporary directory.
Rename `ApplyPythonMutator` to `PythonMutator`.
Add test for `dyn.Location` behavior during the "load" stage.
## Tests
Unit tests
## Changes
Add ApplyPythonMutator, which will fork the Python subprocess and
process pipe bundle configuration through it.
It's enabled through `experimental` section, for example:
```yaml
experimental:
pydabs:
enable: true
venv_path: .venv
```
For now, it's limited to two phases in the mutator pipeline:
- `load`: adds new jobs
- `init`: adds new jobs, or modifies existing ones
It's enforced that no jobs are modified in `load` and not jobs are
deleted in `load/init`, because, otherwise, it will break existing
assumptions.
## Tests
Unit tests