Commit Graph

149 Commits

Author SHA1 Message Date
Gleb Kanterov 25737bbb5d
Add regression tests for CLI error output (#1566)
## Changes
Add regression tests for https://github.com/databricks/cli/issues/1563

We test 2 code paths:
- if there is an error, we can print to stderr
- if there is a valid output, we can print to stdout

We should also consider adding black-box tests that will run the CLI
binary as a black box and inspect its output to stderr/stdout.

## Tests
Unit tests
2024-07-10 06:38:06 +00:00
shreyas-goenka 553fdd1e81
Serialize dynamic value for `bundle validate` output (#1499)
## Changes
Using dynamic values allows us to retain references like
`${resources.jobs...}` even when the type of field is not integer, eg:
`run_job_task`, or in general values that do not map to the Go types for
a field.

## Tests
Integration test
2024-06-18 15:04:20 +00:00
Pieter Noordhuis 448d41027d
Fix listing notebooks in a subdirectory (#1468)
## Changes

This worked fine if the notebooks are located in the filer's root and
didn't if they are nested in a directory.

This change adds test coverage and fixes the underlying issue.

## Tests

Ran integration test manually.
2024-06-04 09:53:14 +00:00
shreyas-goenka ec33a7c059
Add `filer.Filer` to read notebooks from WSFS without omitting their extension (#1457)
## Changes
This PR adds a filer that'll allow us to read notebooks from the WSFS
using their full paths (with the extension included). The filer relies
on the existing workspace filer (and consequently the workspace
import/export/list APIs).

Using this filer along with a virtual filesystem layer
(https://github.com/databricks/cli/pull/1452/files) will allow us to use
our custom implementation (which preserves the notebook extensions)
rather than the default mount available via DBR when the CLI is run from
DBR.

## Tests
Integration tests.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-05-30 11:59:27 +00:00
Pieter Noordhuis 424499ec1d
Abstract over filesystem interaction with libs/vfs (#1452)
## Changes

Introduce `libs/vfs` for an implementation of `fs.FS` and friends that
_includes_ the absolute path it is anchored to.

This is needed for:
1. Intercepting file operations to inject custom logic (e.g., logging,
access control).
2. Traversing directories to find specific leaf directories (e.g.,
`.git`).
3. Converting virtual paths to OS-native paths.

Options 2 and 3 are not possible with the standard `fs.FS` interface.
They are needed such that we can provide an instance to the sync package
and still detect the containing `.git` directory and convert paths to
native paths.

This change focuses on making the following packages use `vfs.Path`:
* libs/fileset
* libs/git
* libs/sync

All entries returned by `fileset.All` are now slash-separated. This has
2 consequences:
* The sync snapshot now always uses slash-separated paths
* We don't need to call `filepath.FromSlash` as much as we did

## Tests

* All unit tests pass
* All integration tests pass
* Manually confirmed that a deployment made on Windows by a previous
version of the CLI can be deployed by a new version of the CLI while
retaining the validity of the local sync snapshot as well as the remote
deployment state.
2024-05-30 07:41:50 +00:00
Ilia Babanov 157877a152
Fix bundle destroy integration test (#1435)
I've updated the `deploy_then_remove_resources` test template in the
previous PR, but didn't notice that it was used in the destroy test too.
Now destroy test also checks deletion of jobs
2024-05-16 09:32:55 +00:00
Ilia Babanov 2035516fde
Don't merge-in remote resources during depolyments (#1432)
## Changes
`check_running_resources` now pulls the remote state without modifying
the bundle state, similar to how it was doing before. This avoids a
problem when we fail to compute deployment metadata for a deleted job
(which we shouldn't do in the first place)

`deploy_then_remove_resources_test` now also deploys and deletes a job
(in addition to a pipeline), which catches the error that this PR fixes.

## Tests
Unit and integ tests
2024-05-15 12:41:44 +00:00
Andrew Nester 1872aa12b3
Added support for job environments (#1379)
## Changes
The main changes are:
1. Don't link artifacts to libraries anymore and instead just iterate
over all jobs and tasks when uploading artifacts and update local path
to remote
2. Iterating over `jobs.environments` to check if there are any local
libraries and checking that they exist locally
3. Added tests to check environments are handled correctly

End-to-end test will follow up

## Tests
Added regression test, existing tests (including integration one) pass
2024-04-22 11:44:34 +00:00
Pieter Noordhuis cd675ded9a
Update `testutil` helpers to return path (#1383)
## Changes

I spotted a few call sites where the path of a test file was synthesized
multiple times. It is easier to capture the path as a variable and reuse
it.
2024-04-19 15:05:36 +00:00
shreyas-goenka e008c2bd8c
Cleanup remote file path on bundle destroy (#1374)
## Changes
The sync struct initialization would recreate the deleted `file_path`.
This PR moves to not initializing the sync object to delete the
snapshot, thus fixing the lingering `file_path` after `bundle destroy`.

## Tests
Manually, and a integration test to prevent regression.
2024-04-19 11:48:04 +00:00
Pieter Noordhuis 6e59b13452
Update Spark version in integration tests to 13.3 (#1375)
## Tests

Integration test run pending.
2024-04-19 11:31:54 +00:00
Andrew Nester 60a4a347f9
Fixed typo in error template for auth describe (#1341)
## Changes
Fixed typo in error template for auth describe

## Tests
Manually + added integration test
2024-04-08 11:19:13 +00:00
Andrew Nester 56e393c743
Allow specifying CLI version constraints required to run the bundle (#1320)
## Changes
Allow specifying CLI version constraints required to run the bundle

Example of configuration:

#### only allow specific version
```
bundle:
  name: my-bundle
  databricks_cli_version: "0.210.0"
```

#### allow all patch releases
```
bundle:
  name: my-bundle
  databricks_cli_version: "0.210.*"
```

#### constrain minimum version
```
bundle:
  name: my-bundle
  databricks_cli_version: ">= 0.210.0"
```

#### constrain range
```
bundle:
  name: my-bundle
  databricks_cli_version: ">= 0.210.0, <= 1.0.0"
```

For other examples see:
https://github.com/Masterminds/semver?tab=readme-ov-file#checking-version-constraints

Example error
```
sh-3.2$ databricks bundle validate
Error: Databricks CLI version constraint not satisfied. Required: >= 1.0.0, current: 0.216.0
```
## Tests
Added unit test cover all possible configuration permutations

---------

Co-authored-by: Lennart Kats (databricks) <lennart.kats@databricks.com>
2024-04-02 12:55:21 +00:00
Pieter Noordhuis 00d76d5afa
Move path field to bundle type (#1316)
## Changes

The bundle path was previously stored on the `config.Root` type under
the assumption that the first configuration file being loaded would set
it. This is slightly counterintuitive and we know what the path is upon
construction of the bundle. The new location for this property reflects
this.

## Tests

Unit tests pass.
2024-03-27 09:03:24 +00:00
Pieter Noordhuis ed194668db
Return `diag.Diagnostics` from mutators (#1305)
## Changes

This diagnostics type allows us to capture multiple warnings as well as
errors in the return value. This is a preparation for returning
additional warnings from mutators in case we detect non-fatal problems.

* All return statements that previously returned an error now return
`diag.FromErr`
* All return statements that previously returned `fmt.Errorf` now return
`diag.Errorf`
* All `err != nil` checks now use `diags.HasError()` or `diags.Error()`

## Tests

* Existing tests pass.
* I confirmed no call site under `./bundle` or `./cmd/bundle` uses
`errors.Is` on the return value from mutators. This is relevant because
we cannot wrap errors with `%w` when calling `diag.Errorf` (like
`fmt.Errorf`; context in https://github.com/golang/go/issues/47641).
2024-03-25 14:18:47 +00:00
Andrew Nester 9cf3dbe686
Use UserName field to identify if service principal is used (#1310)
## Changes
Use UserName field to identify if service principal is used

## Tests
Integration test passed
2024-03-25 11:32:45 +00:00
Andrew Nester d216404f27
Do CheckRunningResource only after terraform.Write (#1292)
## Changes
CheckRunningResource does `terraform.Show` which (I believe) expects
valid `bundle.tf.json` which is only written as part of
`terraform.Write` later.

With this PR order is changed.

Fixes #1286 

## Tests
Added regression E2E test
2024-03-18 15:39:18 +00:00
Andrew Nester 1b0ac61093
Added deployment state for bundles (#1267)
## Changes
This PR introduces new structure (and a file) being used locally and
synced remotely to Databricks workspace to track bundle deployment
related metadata.

The state is pulled from remote, updated and pushed back remotely as
part of `bundle deploy` command.

This state can be used for deployment sequencing as it's `Version` field
is monotonically increasing on each deployment.

Currently, it only tracks files being synced as part of the deployment.

This helps fix the issue with files not being removed during deployments
on CI/CD as sync snapshot was never present there.

Fixes #943 

## Tests
Added E2E (regression) test for files removal on CI/CD

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-03-18 14:41:58 +00:00
Andrew Nester e22dd8af7d
Enabled incorrectly skipped tests (#1280)
## Changes
Some integration tests were missing `TestAcc` prefix therefore were
skipped.

## Tests
Running integration tests
2024-03-14 12:56:21 +00:00
shreyas-goenka 5f29b5ecd9
Fix TestAccBundleInitOnMlopsStacks to work on aws-prod-ucws (#1283)
## Changes
aws-prod-ucws has CLOUD_ENV set to "ucws" which was failing the
validation checks in the template itself. This PR fixes the test.

## Tests
The tests pass now
2024-03-13 12:59:49 +00:00
shreyas-goenka d4329f470f
Add integration test for mlops-stacks initialization (#1155)
## Changes
This PR:
1. Adds an integration test for mlops-stacks that checks the
initialization and deployment of the project was successful.
2. Fixes a bug in the initialization of templates from non-tty. We need
to process the input parameters in order since their descriptions can
refer to input parameters that came before in the interactive UX.

## Tests
The integration test passes in CI.
2024-03-12 14:15:54 +00:00
Andrew Nester c7818560ca
Add usage string when command fails with incorrect arguments (#1276)
## Changes
Add usage string when command fails with incorrect arguments

Fixes #1119

## Tests
Example output

```
> databricks libraries cluster-status 
Error: accepts 1 arg(s), received 0

Usage:
  databricks libraries cluster-status CLUSTER_ID [flags]

Flags:
  -h, --help   help for cluster-status

Global Flags:
      --debug            enable debug logging
  -o, --output type      output type: text or json (default text)
  -p, --profile string   ~/.databrickscfg profile
  -t, --target string    bundle target to use (if applicable)
```
2024-03-12 14:12:34 +00:00
Miles Yucht b65ce75c1f
Use Go SDK Iterators when listing resources with the CLI (#1202)
## Changes
Currently, when the CLI run a list API call (like list jobs), it uses
the `List*All` methods from the SDK, which list all resources in the
collection. This is very slow for large collections: if you need to list
all jobs from a workspace that has 10,000+ jobs, you'll be waiting for
at least 100 RPCs to complete before seeing any output.

Instead of using List*All() methods, the SDK recently added an iterator
data structure that allows traversing the collection without needing to
completely list it first. New pages are fetched lazily if the next
requested item belongs to the next page. Using the List() methods that
return these iterators, the CLI can proactively print out some of the
response before the complete collection has been fetched.

This involves a pretty major rewrite of the rendering logic in `cmdio`.
The idea there is to define custom rendering logic based on the type of
the provided resource. There are three renderer interfaces:

1. textRenderer: supports printing something in a textual format (i.e.
not JSON, and not templated).
2. jsonRenderer: supports printing something in a pretty-printed JSON
format.
3. templateRenderer: supports printing something using a text template.

There are also three renderer implementations:

1. readerRenderer: supports printing a reader. This only implements the
textRenderer interface.
2. iteratorRenderer: supports printing a `listing.Iterator` from the Go
SDK. This implements jsonRenderer and templateRenderer, buffering 20
resources at a time before writing them to the output.
3. defaultRenderer: supports printing arbitrary resources (the previous
implementation).

Callers will either use `cmdio.Render()` for rendering individual
resources or `io.Reader` or `cmdio.RenderIterator()` for rendering an
iterator. This separate method is needed to safely be able to match on
the type of the iterator, since Go does not allow runtime type matches
on generic types with an existential type parameter.

One other change that needs to happen is to split the templates used for
text representation of list resources into a header template and a row
template. The template is now executed multiple times for List API
calls, but the header should only be printed once. To support this, I
have added `headerTemplate` to `cmdIO`, and I have also changed
`RenderWithTemplate` to include a `headerTemplate` parameter everywhere.

## Tests
- [x] Unit tests for text rendering logic
- [x] Unit test for reflection-based iterator construction.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
2024-02-21 14:16:36 +00:00
shreyas-goenka 4ac1c1655b
Fix CLI nightlies on our UC workspaces (#1225)
This PR fixes some test helper functions so that they work properly on
our test environments for AWS and Azure UC workspaces.
2024-02-21 13:06:03 +00:00
shreyas-goenka 5ba0aaa5c5
Add support for UC Volumes to the `databricks fs` commands (#1209)
## Changes
```
shreyas.goenka@THW32HFW6T cli % databricks fs -h
Commands to do file system operations on DBFS and UC Volumes.

Usage:
  databricks fs [command]

Available Commands:
  cat         Show file content.
  cp          Copy files and directories.
  ls          Lists files.
  mkdir       Make directories.
  rm          Remove files and directories.
```

This PR adds support for UC Volumes to the fs commands. The fs commands
for UC volumes work the same as they currently do for DBFS. This is
ensured by running the same test matrix we across both DBFS and UC
Volumes versions of the fs commands.

## Tests
Support for UC volumes is tested by running the same tests as we did
originally for DBFS commands. The tests require a `main` catalog to
exist in the workspace, which does in our test workspaces environments
which have the `TEST_METASTORE_ID` environment variable set.

For the Files API filer, we do the same by running mostly common tests
to ensure the filers for "local", "wsfs", "dbfs" and "files API" are
consistent.

The tests are also made to all run in parallel to reduce the time taken.
To ensure the separation of the tests, each test creates its own UC
schema (for UC volumes tests) or DBFS directories (for DBFS tests).
2024-02-20 16:14:37 +00:00
Pieter Noordhuis 87dd46a3f8
Use dynamic configuration model in bundles (#1098)
## Changes

This is a fundamental change to how we load and process bundle
configuration. We now depend on the configuration being represented as a
`dyn.Value`. This representation is functionally equivalent to Go's
`any` (it is variadic) and allows us to capture metadata associated with
a value, such as where it was defined (e.g. file, line, and column). It
also allows us to represent Go's zero values properly (e.g. empty
string, integer equal to 0, or boolean false).

Using this representation allows us to let the configuration model
deviate from the typed structure we have been relying on so far
(`config.Root`). We need to deviate from these types when using
variables for fields that are not a string themselves. For example,
using `${var.num_workers}` for an integer `workers` field was impossible
until now (though not implemented in this change).

The loader for a `dyn.Value` includes functionality to capture any and
all type mismatches between the user-defined configuration and the
expected types. These mismatches can be surfaced as validation errors in
future PRs.

Given that many mutators expect the typed struct to be the source of
truth, this change converts between the dynamic representation and the
typed representation on mutator entry and exit. Existing mutators can
continue to modify the typed representation and these modifications are
reflected in the dynamic representation (see `MarkMutatorEntry` and
`MarkMutatorExit` in `bundle/config/root.go`).

Required changes included in this change:
* The existing interpolation package is removed in favor of
`libs/dyn/dynvar`.
* Functionality to merge job clusters, job tasks, and pipeline clusters
are now all broken out into their own mutators.

To be implemented later:
* Allow variable references for non-string types.
* Surface diagnostics about the configuration provided by the user in
the validation output.
* Some mutators use a resource's configuration file path to resolve
related relative paths. These depend on `bundle/config/paths.Path` being
set and populated through `ConfigureConfigFilePath`. Instead, they
should interact with the dynamically typed configuration directly. Doing
this also unlocks being able to differentiate different base paths used
within a job (e.g. a task override with a relative path defined in a
directory other than the base job).

## Tests

* Existing unit tests pass (some have been modified to accommodate)
* Integration tests pass
2024-02-16 19:41:58 +00:00
Andrew Nester e474948a4b
Generate correct YAML if custom_tags or spark_conf is used for pipeline or job cluster configuration (#1210)
These fields (key and values) needs to be double quoted in order for
yaml loader to read, parse and unmarshal it into Go struct correctly
because these fields are `map[string]string` type.

## Tests
Added regression unit and E2E tests
2024-02-15 15:03:19 +00:00
Andrew Nester 80670eceed
Added `bundle deployment bind` and `unbind` command (#1131)
## Changes
Added `bundle deployment bind` and `unbind` command.

This command allows to bind bundle-defined resources to existing
resources in Databricks workspace so they become DABs-managed.

## Tests
Manually + added E2E test
2024-02-14 18:04:45 +00:00
Pieter Noordhuis 4073e45d4b
Use mockery to generate mocks compatible with testify/mock (#1190)
## Changes

This is the same approach we use in the Go SDK.

## Tests

Tests pass.
2024-02-08 15:18:53 +00:00
Pieter Noordhuis f8b0f783ea
Use `acc.WorkspaceTest` helper from bundle integration tests (#1181)
## Changes

This helper:
* Constructs a context
* Constructs a `*databricks.WorkspaceClient`
* Ensures required environment variables are present to run an
integration test
* Enables debugging integration tests from VS Code

Debugging integration tests (from VS Code) is made possible by a prelude
in the helper that checks if the calling process is a debug binary, and
if so, sources environment variables from
`~/..databricks/debug-env.json` (if present).

## Tests

Integration tests still pass.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
2024-02-07 11:18:56 +00:00
Pieter Noordhuis b64e11304c
Fix integration test with invalid configuration (#1182)
## Changes

The indentation mistake on the `path` field under `notebook` meant the
pipeline had a single entry with a `nil` notebook field. This was
allowed but incorrect.

While working on the `dyn.Value` approach, this yielded a non-nil but
zeroed `notebook` field and a failure to translate an empty path.

## Tests

Correcting the indentation made the test fail because the file is not a
notebook. I changed it to a `file` reference and the test now passes.
2024-02-07 10:53:50 +00:00
Pieter Noordhuis 33c446dadd
Refactor library to artifact matching to not use pointers (#1172)
## Changes

The approach to do this was:
1. Iterate over all libraries in all job tasks
2. Find references to local libraries
3. Store pointer to `compute.Library` in the matching artifact file to
signal it should be uploaded

This breaks down when introducing #1098 because we can no longer track
unexported state across mutators. The approach in this PR performs the
path matching twice; once in the matching mutator where we check if each
referenced file has an artifacts section, and once during artifact
upload to rewrite the library path from a local file reference to an
absolute Databricks path.

## Tests

Integration tests pass.
2024-02-05 15:29:45 +00:00
shreyas-goenka cb3ad737f1
Add short_name helper function to bundle init templates (#1167)
## Changes
Adds the short_name helper function. short_name is useful when templates
do not want to print the full userName (typically email or service
principal application-id) of the current user.

## Tests
Integration test. Also adds integration tests for other helper functions
that interact with the Databricks API.
2024-02-01 16:46:07 +00:00
Andrew Nester b28432afed
Add `--key` flag for generate commands to specify resource key (#1165)
## Changes
Add --key for generate commands to specify resource key.

Also, resource config files are now not prefixed anymore.

## Tests
Integration tests passed

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-01-31 10:23:35 +00:00
Andrew Nester f269f8015d
Added `bundle generate pipeline` command (#1139)
## Changes
Added `bundle generate pipeline` command

Usage as the following

```
databricks bundle generate pipeline --existing-pipeline-id f3b8c580-0a88-4b55-xxxx-yyyyyyyyyy
```

## Tests
Manually + added E2E test
2024-01-25 11:35:14 +00:00
Andrew Nester 7067782cf1
Fixed path matching for Windows in generate job test (#1132)
## Changes
Fixed path matching for Windows in generate job test
2024-01-19 08:05:59 +00:00
Andrew Nester 70fe0e36ef
Added `databricks bundle generate job` command (#1043)
## Changes
Now it's possible to generate bundle configuration for existing job.
For now it only supports jobs with notebook tasks.

It will download notebooks referenced in the job tasks and generate
bundle YAML config for this job which can be included in larger bundle.

## Tests
Running command manually

Example of generated config
```
resources:
  jobs:
    job_128737545467921:
      name: Notebook job
      format: MULTI_TASK
      tasks:
        - task_key: as_notebook
          existing_cluster_id: 0704-xxxxxx-yyyyyyy
          notebook_task:
            base_parameters:
              bundle_root: /Users/andrew.nester@databricks.com/.bundle/job_with_module_imports/development/files
            notebook_path: ./entry_notebook.py
            source: WORKSPACE
          run_if: ALL_SUCCESS
      max_concurrent_runs: 1
 ```

## Tests
Manual (on our last 100 jobs) + added end-to-end test

```
--- PASS: TestAccGenerateFromExistingJobAndDeploy (50.91s)
PASS
coverage: 61.5% of statements in ./...
ok github.com/databricks/cli/internal/bundle 51.209s coverage: 61.5% of
statements in ./...
```
2024-01-17 14:26:33 +00:00
shreyas-goenka 2c0d06715c
Fix windows style file paths in fs cp command (#1118)
## Changes
Copying a local file in windows to remote directory in DBFS would fail
if the path was specified as a windows style path (compared to a UNIX
style path). This PR fixes that.

Note, UNIX style paths will continue to work because `filepath.Base`
respects both `/` and `\` as file separators. See: `IsPathSeparator` in
https://go.dev/src/os/path_windows.go.

Fixes issue: https://github.com/databricks/cli/issues/1109.

## Tests
Integration test and manually
```
C:\Users\shreyas.goenka>Desktop\cli.exe fs cp .\Desktop\foo.txt dbfs:/Users/shreyas.goenka@databricks.com
.\Desktop\foo.txt -> dbfs:/Users/shreyas.goenka@databricks.com/foo.txt

C:\Users\shreyas.goenka>Desktop\cli.exe fs cat  dbfs:/Users/shreyas.goenka@databricks.com/foo.txt
hello, world
````
2024-01-11 18:49:42 +00:00
Andrew Nester 3e6e04831f
Fixed storage-credentials list command in text output (#1094)
## Changes
Fixes #1029 
Closes #910 

## Tests
Added regression test
2024-01-02 07:24:51 +00:00
Andrew Nester 83d50001fc
Pass parameters to task when run with `--python-params` and `python_wheel_wrapper` is true (#1037)
## Changes
It makes the behaviour consistent with or without `python_wheel_wrapper`
on when job is run with `--python-params` flag.

In `python_wheel_wrapper` mode it converts dynamic `python_params` in a
dynamic specially named `notebook_param` and the wrapper reads them with
`dbutils` and pass to `sys.argv`

Fixes #1000

## Tests
Added an integration test.

Integration tests pass.
2023-12-01 10:35:20 +00:00
Andrew Nester 5431174302
Do not add wheel content hash in uploaded Python wheel path (#1015)
## Changes
Removed hash from the upload path since it's not useful anyway.

The main reason for that change was to make it work on all-purpose
clusters. But in order to make it work, wheel version needs to be
increased anyway. So having only hash in path is useless.

Note: using --build-number (build tag) flag does not help with
re-installing libraries on all-purpose clusters. The reason is that
`pip` ignoring build tag when upgrading the library and only look at
wheel version.
Build tag is only used for sorting the versions and the one with higher
build tag takes priority when installed. It only works if no library is
installed.
See
a15dd75d98/src/pip/_internal/index/package_finder.py (L522-L556)
https://github.com/pypa/pip/issues/4781

Thus, the only way to reinstall the library on all-purpose cluster is to
increase wheel version manually or use automatic version generation,
f.e.
```
setup(
  version=datetime.datetime.utcnow().strftime("%Y%m%d.%H%M%S"),
  ...
)
```

## Tests
Integration tests passed.
2023-11-29 10:40:12 +00:00
Pieter Noordhuis 6187803007
Correctly overwrite local state if remote state is newer (#1008)
## Changes

A bug in the code that pulls the remote state could cause the local
state to be empty instead of a copy of the remote state. This happened
only if the local state was present and stale when compared to the
remote version.

We correctly checked for the state serial to see if the local state had
to be replaced but didn't seek back on the remote state before writing
it out. Because the staleness check would read the remote state in full,
copying from the same reader would immediately yield an EOF.

## Tests

* Unit tests for state pull and push mutators that rely on a mocked
filer.
* An integration test that deploys the same bundle from multiple paths,
triggering the staleness logic.

Both failed prior to the fix and now pass.
2023-11-24 11:15:46 +00:00
shreyas-goenka 0c837e5772
Make `file_path` and `artifact_path` fields consistent with json tag (#987)
## Changes
This PR:
1. Renames `FilesPath` -> `FilePath` and `ArtifactsPath` ->
`ArtifactPath` in the bundle and metadata configuration to make them
consistant with the json tags.
2. Fixes development / production mode error messages to point to
`file_path` and `artifact_path`

## Tests
Existing unit tests. This is a strightforward renaming of the fields.
2023-11-15 13:37:26 +00:00
shreyas-goenka f208853626
Fix integration test asserting errors on unknown template parameters (#977)
## Changes
Recent descriptions were made mandatory for input parameters so this
test started failing.

## Tests
The test passes now.
2023-11-10 11:05:32 +00:00
Serge Smertin 56bcb6f833
Make Cobra runner compatible with testing interactive flows (#957)
## Changes
This PR enables testing commands with stdin

## Tests
https://github.com/databricks/cli/pull/914
2023-11-07 19:06:27 +00:00
shreyas-goenka d70d7445c4
Remove resolution of repo names against the Databricks Github account (#940)
## Changes
This functionality is not exercised (and will not be anytime soon).
Instead we use a map to have first party aliases for supported
templates.


1e46b9f88a/cmd/bundle/init.go (L21)

## Tests
Existing tests and manually, bundle init still works.
2023-11-01 13:02:06 +00:00
shreyas-goenka 5a8cd0c5bc
Persist deployment metadata in WSFS (#845)
## Changes

This PR introduces a metadata struct that stores a subset of bundle
configuration that we wish to expose to other Databricks services that
wish to integrate with bundles.

This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.

Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note, the deployment might not exactly match
this commit version if there are changes that have not been committed to
git at deploy time,
* `file_path`: Path in workspace where we sync bundle files to. 
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml
config file from the bundle root where this job was defined.

Example metadata object when bundle root and git root are the same:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

Example metadata when the git root is one level above the bundle repo:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```


This unblocks integration to the jobs break glass UI for bundles.

## Tests
Unit tests and integration tests.
2023-10-27 12:55:43 +00:00
Andrew Nester 4ce279e386
Added test for tasks with python wheel wrapper on (#897)
## Changes
Added test for tasks with python wheel wrapper on

## Tests
```
2023/10/20 16:42:07 [INFO] Listing secrets from ...
=== RUN   TestAccPythonWheelTaskDeployAndRunWithWrapper
    python_wheel_test.go:13: aws
    helpers.go:43: Configuration for template:  {"node_type_id":"i3.xlarge","python_wheel_wrapper":true,"spark_version":"12.2.x-scala2.12","unique_id":"224a58a5-7ecb-4e7a-9c89-c7f5ea57924e"}
 ...
Resource deployment completed!
Run URL: ...

2023-10-20 16:42:33 "[default] Test Wheel Job 224a58a5-7ecb-4e7a-9c89-c7f5ea57924e" RUNNING 
2023-10-20 16:47:27 "[default] Test Wheel Job 224a58a5-7ecb-4e7a-9c89-c7f5ea57924e" TERMINATED SUCCESS 
    helpers.go:169: [databricks stdout]: Hello from my func
    helpers.go:169: [databricks stdout]: Got arguments:
    helpers.go:169: [databricks stdout]: ['my_test_code', 'one', 'two']
...
--- PASS: TestAccPythonWheelTaskDeployAndRunWithWrapper (321.61s)
PASS
coverage: 93.5% of statements in ./...
ok      github.com/databricks/cli/internal/bundle       322.307s        coverage: 93.5% of statements in ./...
```
2023-10-20 15:03:29 +00:00
shreyas-goenka 5712845329
Make default dev semver a const (#891)
## Changes
<!-- Summary of your changes that are easy to understand -->

## Tests
<!-- How is this tested? -->
2023-10-19 18:56:54 +00:00
shreyas-goenka 3700785dfa
Add support for validating CLI version when loading a jsonschema object (#883)
## Changes
Updates to bundle templates can require updated versions of the CLI.
This PR extends the JSON schema representation to allow template authors
to set a min CLI version they require for their templates.

This is required to make improvements/additions to the mlops-stacks repo

## Tests
Tested using unit tests and manually. 

For manualy testing, I created a custom build of the CLI using go
releaser and then tested it against a local instance of mlops-stack
When mlops-stack schema has:
```
  "min_databricks_cli_version": "v5000.1.1",
```

output (error as expected)
```
shreyas.goenka@THW32HFW6T bricks % ./dist/cli_darwin_arm64/databricks bundle init  ~/mlops-stack
Error: minimum CLI version "v5000.1.1" is greater than current CLI version "v0.207.2-dev+1b992c0". Please upgrade your current Databricks CLI
```

When the mlops-stack schema has:
```
  "min_databricks_cli_version": "v0.1.1",
```

output (validation passes)
```
shreyas.goenka@THW32HFW6T bricks % ./dist/cli_darwin_arm64/databricks bundle init  ~/mlops-stack
Welcome to MLOps Stack. For detailed information on project generation, see the README at https://github.com/databricks/mlops-stack/blob/main/README.md.

Project Name [my-mlops-project]: ^C
```
2023-10-19 14:01:48 +00:00