These fields (key and values) needs to be double quoted in order for
yaml loader to read, parse and unmarshal it into Go struct correctly
because these fields are `map[string]string` type.
## Tests
Added regression unit and E2E tests
## Changes
Before this change, any error in a subtree would cause the entire
subtree to be dropped from the output.
This is not ideal when debugging, so instead we drop only the values
that cannot be normalized. Note that this doesn't change behavior if the
caller is properly checking the returned diagnostics for errors.
Note: this includes a change to use `dyn.InvalidValue` as opposed to
`dyn.NilValue` when returning errors.
## Tests
Added unit tests for the case where nested struct, map, or slice
elements contain an error.
## Changes
`executor.Exec` now uses `cmd.CombinedOutput`. Previous implementation
was hanging on my windows VM during `bundle deploy` on the
`ReadAll(MultiReader(stdout, stderr))` line.
The problem is related to the fact the MultiReader reads sequentially,
and the `stdout` is the first in line. Even simple `io.ReadAll(stdout)`
hangs on me, as it seems like the command that we spawn (python wheel
build) waits for the error stream to be finished before closing stdout
on its own side? Reading `stderr` (or `out`) in a separate go-routine
fixes the deadlock, but `cmd.CombinedOutput` feels like a simpler
solution.
Also noticed that Exec was not removing `scriptFile` after itself, fixed
that too.
## Tests
Unit tests and manually
## Changes
Not doing this means that the output struct is not a true representation
of the `dyn.Value` and unrepresentable state (e.g. unexported fields)
can be carried over across `convert.ToTyped` calls.
## Tests
Unit tests.
## Changes
This was an issue in cases where the typed structure contains a non-nil
pointer to an empty struct. After conversion to a `dyn.Value` and back
to the typed structure, the pointer became nil.
## Tests
Unit tests.
## Changes
References to keys that themselves are also variable references were
shortcircuited in the previous approach. This meant that certain fields
were resolved even if the lookup function would have instructed to skip
resolution.
To fix this we separate the memoization of resolved variable references
from the memoization of lookups. Now, every variable reference is passed
through the lookup function.
## Tests
Before this change, the new test failed with:
```
=== RUN TestResolveWithSkipEverything
[...]/libs/dyn/dynvar/resolve_test.go:208:
Error Trace: [...]/libs/dyn/dynvar/resolve_test.go:208
Error: Not equal:
expected: "${d} ${c} ${c} ${d}"
actual : "${b} ${a} ${a} ${b}"
Diff:
--- Expected
+++ Actual
@@ -1 +1 @@
-${d} ${c} ${c} ${d}
+${b} ${a} ${a} ${b}
Test: TestResolveWithSkipEverything
```
## Changes
Group bundle run flags by job and pipeline types
## Tests
```
Run a resource (e.g. a job or a pipeline)
Usage:
databricks bundle run [flags] KEY
Job Flags:
--dbt-commands strings A list of commands to execute for jobs with DBT tasks.
--jar-params strings A list of parameters for jobs with Spark JAR tasks.
--notebook-params stringToString A map from keys to values for jobs with notebook tasks. (default [])
--params stringToString comma separated k=v pairs for job parameters (default [])
--pipeline-params stringToString A map from keys to values for jobs with pipeline tasks. (default [])
--python-named-params stringToString A map from keys to values for jobs with Python wheel tasks. (default [])
--python-params strings A list of parameters for jobs with Python tasks.
--spark-submit-params strings A list of parameters for jobs with Spark submit tasks.
--sql-params stringToString A map from keys to values for jobs with SQL tasks. (default [])
Pipeline Flags:
--full-refresh strings List of tables to reset and recompute.
--full-refresh-all Perform a full graph reset and recompute.
--refresh strings List of tables to update.
--refresh-all Perform a full graph update.
Flags:
-h, --help help for run
--no-wait Don't wait for the run to complete.
Global Flags:
--debug enable debug logging
-o, --output type output type: text or json (default text)
-p, --profile string ~/.databrickscfg profile
-t, --target string bundle target to use (if applicable)
--var strings set values for variables defined in bundle config. Example: --var="foo=bar"
```
## Changes
This function could panic when either side of the comparison is a nil or
empty slice. This logic is triggered when comparing the input value to
the output value when calling `dyn.Map`.
## Tests
Unit tests.
## Changes
Adds the short_name helper function. short_name is useful when templates
do not want to print the full userName (typically email or service
principal application-id) of the current user.
## Tests
Integration test. Also adds integration tests for other helper functions
that interact with the Databricks API.
## Changes
Allow specifying executable in artifact section
```
artifacts:
test:
type: whl
executable: bash
...
```
We also skip bash found on Windows if it's from WSL because it won't be
correctly executed, see the issue above
Fixes#1159
## Changes
In the dynamic configuration, the nil value (dyn.NilValue) denotes a
value that should not be serialized, ie a value being nil is the same as
it not existing in the first place.
This is not true for zero values in maps and slices. This PR fixes the
conversion from typed values to dyn.Value, to treat zero values in maps
and slices as zero and not nil.
## Tests
Unit tests
## Changes
This PR:
Introduces `anyOf` to `skip_prompt_if`. This allows you to make OR
conditionals for skipping prompts during template initialization.
## Tests
Added unit test and confirmed existing ones still work. Also tested
manually.
---------
Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
## Changes
This is the `dyn` counterpart to the `bundle/config/interpolation`
package.
It relies on the paths in `${foo.bar}` being valid `dyn.Path` instances.
It leverages `dyn.Walk` to get a complete picture of all variable
references and uses `dyn.Get` to retrieve values pointed to by variable
references.
Depends on #1142.
## Tests
Unit test coverage. I tried to mirror the tests from
`bundle/config/interpolation` and added new ones where applicable (for
example to test type retention of referenced values).
## Changes
This change adds the following functions:
* `dyn.Get(value, "foo.bar") -> (dyn.Value, error)`
* `dyn.Set(value, "foo.bar", newValue) -> (dyn.Value, error)`
* `dyn.Map(value, "foo.bar", func) -> (dyn.Value, error)`
And equivalent functions that take a previously constructed `dyn.Path`:
* `dyn.GetByPath(value, dyn.Path) -> (dyn.Value, error)`
* `dyn.SetByPath(value, dyn.Path, newValue) -> (dyn.Value, error)`
* `dyn.MapByPath(value, dyn.Path, func) -> (dyn.Value, error)`
Changes made by the "set" and "map" functions are never reflected in the
input argument; they return new `dyn.Value` instances for all nodes in
the path leading up to the changed value.
## Tests
New unit tests cover all critical paths.
## Changes
Now it's possible to generate bundle configuration for existing job.
For now it only supports jobs with notebook tasks.
It will download notebooks referenced in the job tasks and generate
bundle YAML config for this job which can be included in larger bundle.
## Tests
Running command manually
Example of generated config
```
resources:
jobs:
job_128737545467921:
name: Notebook job
format: MULTI_TASK
tasks:
- task_key: as_notebook
existing_cluster_id: 0704-xxxxxx-yyyyyyy
notebook_task:
base_parameters:
bundle_root: /Users/andrew.nester@databricks.com/.bundle/job_with_module_imports/development/files
notebook_path: ./entry_notebook.py
source: WORKSPACE
run_if: ALL_SUCCESS
max_concurrent_runs: 1
```
## Tests
Manual (on our last 100 jobs) + added end-to-end test
```
--- PASS: TestAccGenerateFromExistingJobAndDeploy (50.91s)
PASS
coverage: 61.5% of statements in ./...
ok github.com/databricks/cli/internal/bundle 51.209s coverage: 61.5% of
statements in ./...
```
## Changes
The nil value is a real valid value that we need to represent. To
accommodate this we introduced `dyn.KindInvalid` as the zero-value for
`dyn.Kind` (see #904), but did not yet update the comments on
`dyn.NilValue` or add tests for `kind.go`.
This also moves `KindNil` to be last in the definition order (least
likely to care about it).
## Tests
Tests pass.
## Changes
The file `value.go` had a couple `AsZZZ` and `MustZZZ` functions.
This change backfills missing versions and moves all of them to a
separate file.
## Tests
Tests pass; full coverage.
## Changes
This PR changes the default and `mode: production` recommendation to
target `/Users` for deployment. Previously, we used `/Shared`, but
because of a lack of POSIX-like permissions in WorkspaceFS this meant
that files inside would be readable and writable by other users in the
workspace.
Detailed change:
* `default-python` no longer uses a path that starts with `/Shared`
* `mode: production` no longer requires a path that starts with
`/Shared`
## Related PRs
Docs: https://github.com/databricks/docs/pull/14585
Examples: https://github.com/databricks/bundle-examples/pull/17
## Tests
* Manual tests
* Template unit tests (with an extra check to avoid /Shared)
## Changes
This PR adds retry logic to user input prompts, prompting users again if
the value does not match the requirements specified in the bundle
template schema.
## Tests
Manually. Here's an example UX. The first prompt expects an integer and
the second one a string made only from the letters "defg"
```
shreyas.goenka@THW32HFW6T cli % cli bundle init ~/mlops-stack
Please enter an integer [123]: abc
Validation failed: "abc" is not a integer
Please enter an integer [123]: 123
Please enter a string [dddd]: apple
Validation failed: invalid value for input_root_dir: "apple". Only characters the 'd', 'e', 'f', 'g' are allowed
```
## Changes
The name "dynamic value", or "dyn" for short, is more descriptive than
the opaque "config". Also, it conveniently does not alias with other
packages in the repository, or (popular ones) elsewhere.
(discussed with @andrewnester)
## Tests
n/a
## Changes
This change adds:
* A `config.Walk` function to walk a configuration tree
* A `config.Path` type to represent a value's path inside a tree
* Functions to create a `config.Path` from a string, or convert one to a
string
## Tests
Additional unit tests with full coverage.
## Changes
Instead of handling command chaining ourselves, we execute passed
commands as-is by storing them, in temp file and passing to correct
interpreter (bash or cmd) based on OS.
Fixes#1065
## Tests
Added unit tests
## Changes
Fixes nightly test `TestAccBundleInitErrorOnUnknownFields`.
`TestAccBundleInitErrorOnUnknownFields` has an interactive shell by
default so the test fails on waiting for prompt.
This was introduced in #1069.
## Tests
Nightly test succeed.
## Changes
If a user configures a workspace host in a bundle and wants to use the
"azure-cli" authentication type, we would still run profile resolution.
If the databrickscfg has a matching profile, we still load it, even
though it should be a fallback.
## Tests
* Unit test.
* Manually confirmed that setting `DATABRICKS_AUTH_TYPE=azure-cli` now
works as expected.
## Changes
- Tweak strings, documentation in template
- Extend requirements-dev.txt with setuptools/wheel for building whl
files
- Clarify what the "_job.yml" file is for for users who are only
interested in DLT pipelines (answering a question that came up recently)
## Tests
Existing tests exercise this template
## Changes
It wasn't working because it deferred to the regular `slog.TextHandler`
for the `WithAttr` and `WithGroup` functions. Both of these functions
don't mutate the handler but return a new one. When the top-level logger
called one of these, log records in that context used the standard
handler instead of ours.
To implement tracking of attributes and groups, I followed the guide at
https://github.com/golang/example/blob/master/slog-handler-guide/README.md
for writing custom handlers.
## Tests
The new tests demonstrate formatting through `t.Log` and look good.
## Changes
This PR introduces the `skip_prompt_if` extension to the jsonschema
library. If the inputs provided by the user match the JSON schema then
the prompt for that property is skipped.
Right now only constant checks are supported, but if in the future more
complicated conditionals are required, this can be extended to support
`allOf`, `oneOf`, `anyOf` etc allowing template authors to specify
conditionals of arbitary complexity.
## Tests
Unit tests and manually.
## Changes
This PR adds versioning for bundle templates. Right now there's only
logic for the maximum version of templates supported. At some point in
the future if we make a breaking template change we can also include a
minimum version of template supported by the CLI.
## Tests
Unit tests.
## Changes
Only clusters with their source attribute equal to `UI` or `API` should
be presented in the dropdown.
## Tests
Unit test and manual confirmation.
## Changes
If a struct has a field of type `config.Value`, then we set it to the
source value while converting a `config.Value` instance to a struct as
part of a call to `convert.ToTyped`.
This is convenient when dealing with deeply nested structs where
functions on inner structs need access to the metadata provided by their
corresponding `config.Value` (e.g. where they were defined).
## Tests
Added unit tests pass.
## Changes
A bug in the code that pulls the remote state could cause the local
state to be empty instead of a copy of the remote state. This happened
only if the local state was present and stale when compared to the
remote version.
We correctly checked for the state serial to see if the local state had
to be replaced but didn't seek back on the remote state before writing
it out. Because the staleness check would read the remote state in full,
copying from the same reader would immediately yield an EOF.
## Tests
* Unit tests for state pull and push mutators that rely on a mocked
filer.
* An integration test that deploys the same bundle from multiple paths,
triggering the staleness logic.
Both failed prior to the fix and now pass.
Adds better error message when input path is not a bundle template
before:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init ~/bricks
Error: open /Users/shreyas.goenka/bricks/databricks_template_schema.json: no such file or directory
```
after:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init ~/bricks
Error: expected to find a template schema file at /Users/shreyas.goenka/bricks/databricks_template_schema.json
```
## Changes
DLT currently doesn't always set `$PYTHONPATH` correctly (ES-947370).
This restores the original workaround to make new pipelines work while
that issue is being addressed. The workaround was removed in #832.
Manually tested.
## Changes
This PR is the counterpart to #904. With this change, we are able to
convert a `config.Value` into a Go struct, make modifications to the Go
struct, and reflect those changes in a new `config.Value`.
This functionality allows us to incrementally introduce this
configuration representation to existing bundle mutators. Bundle
mutators expect a `*bundle.Bundle` argument and mutate its configuration
directly. These mutations are not reflected in the corresponding
`config.Value` (once introduced), which means we cannot use the
`config.Value` as source of truth until we update _all_ mutators. To
address this, we can run `convert.ToTyped` and `convert.FromTyped` at
the mutator boundary (from `bundle.Apply`) and capture changes made to
the Go struct. Then we can incrementally make mutators aware of the
`config.Value` configuration and have them mutate that structure
directly.
## Tests
New unit tests pass.
Manual spot checks against the bundle configuration type.
## Changes
If args[0] == "." was provided to bundle init command, it would try to
resolve it as a built in template and error out.
## Tests
Manually
before:
```
shreyas.goenka@THW32HFW6T mlops-stack % cli bundle init .
Error: open /var/folders/lg/njll3hjx7pjcgxs6n7b290bw0000gp/T/templates3934264356/templates/databricks_template_schema.json: no such file or directory
```
after:
```
shreyas.goenka@THW32HFW6T mlops-stack % cli bundle init .
Welcome to MLOps Stacks. For detailed information on project generation, see the README at https://github.com/databricks/mlops-stacks/blob/main/README.md.
Project Name [my-mlops-project]: ^C
```