## Changes
#629 introduced a change to autopopulate the host from .databrickscfg if
the user is logging back into a host they were previously using. This
did not respect the DATABRICKS_CONFIG_FILE env variable, causing the
flow to stop working for users with no .databrickscfg file in their home
directory.
This PR refactors all config file loading to go through one interface,
`databrickscfg.GetDatabricksCfg()`, and an auxiliary
`databrickscfg.GetDatabricksCfgPath()` to get the configured file path.
Closes#655.
## Tests
```
$ databricks auth login --profile abc
Error: open /Users/miles/.databrickscfg: no such file or directory
$ ./cli auth login --profile abc
Error: cannot load Databricks config file: open /Users/miles/.databrickscfg: no such file or directory
$ DATABRICKS_CONFIG_FILE=~/.databrickscfg.bak ./cli auth login --profile abc
Databricks Host: https://asdf
```
## Changes
This PR adds two features:
1. The bundle init command
2. Support for prompting for input values
In order to do this, this PR also introduces a new `config` struct which
handles reading config files, prompting users and all validation steps
before we materialize the template
With this PR users can start authoring custom templates, based on go
text templates, for their projects / orgs.
## Tests
Unit tests, both existing and new
## Changes
A pretty annoying part of the current CLI experience is that when
logging in with `databricks auth login`, you always need to type the
name of the host. This seems unnecessary if you have already logged into
a host before, since the CLI can read the previous host from your
`.databrickscfg` file.
This change handles this case by setting the host if unspecified to the
host in the corresponding profile. Combined with autocomplete, this
makes the login process simple:
```
databricks auth login --profile prof<tab><enter>
```
## Tests
Logged in to an existing profile by running the above command (but for a
real profile I had).
## Changes
This checks whether the Git settings are consistent with the actual Git
state of a source directory.
(This PR adds to https://github.com/databricks/cli/pull/577.)
Previously, we would silently let users configure their Git branch to
e.g. `main` and deploy with that metadata even if they were actually on
a different branch.
With these changes, the following config would result in an error when
deployed from any other branch than `main`:
```
bundle:
name: example
workspace:
git:
branch: main
environments:
...
```
> not on the right Git branch:
> expected according to configuration: main
> actual: my-feature-branch
It's not very useful to set the same branch for all environments,
though. For development, it's better to just let the CLI auto-detect the
right branch. Therefore, it's now possible to set the branch just for a
single environment:
```
bundle:
name: example 2
environments:
development:
default: true
production:
# production can only be deployed from the 'main' branch
git:
branch: main
```
Adding to that, the `mode: production` option actually checks that users
explicitly set the Git branch as seen above. Setting that branch helps
avoid mistakes, where someone accidentally deploys to production from
the wrong branch. (I could see us offering an escape hatch for that in
the future.)
# Testing
Manual testing to validate the experience and error messages. Automated
unit tests.
---------
Co-authored-by: Fabian Jakobs <fabian.jakobs@databricks.com>
## Changes
This removes the remaining dependency on global state and unblocks work
to parallelize integration tests. As is, we can already uncomment an
integration test that had to be skipped because of other tests tainting
global state. This is no longer an issue.
Also see #595 and #606.
## Tests
* Unit and integration tests pass.
* Manually confirmed the help output is the same.
## Changes
This change is another step towards a CLI without globals. Also see #595.
The flags for the root command are now encapsulated in struct types.
## Tests
Unit tests pass.
## Changes
The assertions would fail because `DATABRICKS_TOKEN` overrides a token
set in the profile.
## Tests
Tests now pass if `DATABRICKS_TOKEN` is set.
## Changes
Generated commands relied on global variables for flags and request
payloads. This is difficult to test if a sequence of tests tries to run
the same command with various arguments because the global state causes
test interference. Moreover, it is impossible to run tests in parallel.
This change modifies the approach and turns every command group and
command itself into a function that returns a `*cobra.Command`. All
flags and request payloads are variables scoped to the command's
initialization function. This means it is possible to construct
independent copies of the CLI structure and fixes the test isolation
issue.
The scope of this change is only the generated commands. The other
commands will be changed accordingly in subsequent changes.
## Tests
Unit and integration tests pass.
## Changes
* Add support for using `databricks.yml` as config file. If
`databricks.yml` is not found then falling back to `bundle.yml` for
backwards compatibility.
* Add support for `.yaml` extension.
* Give an error when more than one config file is found
## Tests
* added unit test
* manual testing the different cases
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Currently, `databricks auth login` is difficult to use. If a user types
this command in, the command fails with
```
Error: init: cannot fetch credentials
```
after prompting for a profile name.
To make this experience smoother, this change ensures that the host, and
if necessary, the account ID, are prompted for input from the user if
they aren't provided on the CLI.
## Tests
Manual tests:
```
$ ./cli auth token
Databricks Host: https://<HOST>.staging.cloud.databricks.com
{
"access_token": "...",
"token_type": "Bearer",
"expiry": "2023-07-11T12:56:59.929671+02:00"
}
$ ./cli auth login
Databricks Host: https://<HOST>.staging.cloud.databricks.com
Databricks Profile Name: <HOST>-test
Profile <HOST>-test was successfully saved
$ ./cli auth login
Databricks Host: https://accounts.cloud.databricks.com
Databricks Account ID: <ACCOUNTID>
Databricks Profile Name: ACCOUNT-<ACCOUNTID>-test
Profile ACCOUNT-<ACCOUNTID>-test was successfully saved
```
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Correctly use --profile flag passed for all bundle commands.
Also adds a validation that if bundle configured host mismatches
provided profile, it throws an error.
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Currently, `databricks --profile <TAB>` autocompletes with the shell
default behavior, listing files in the local directory. This is not a
great experience. Especially given that the suggested profile names for
accounts are so long, it can be cumbersome to type them out by hand.
This PR configures autocompletion for `--profile` to inspect the
profiles of ~/.databrickscfg.
One potential improvement is to filter the response based on whether the
command is known to be account-level or workspace-level.
## Tests
Manual test.
<img width="579" alt="Screenshot_11_07_2023__18_31"
src="https://github.com/databricks/cli/assets/1850319/d7a3acd0-2511-45ac-bd82-95567775c10a">
This implements the "development run" functionality that we desire for DABs in the workspace / IDE.
## bundle.yml changes
In bundle.yml, there should be a "dev" environment that is marked as
`mode: debug`:
```
environments:
dev:
default: true
mode: development # future accepted values might include pull_request, production
```
Setting `mode` to `development` indicates that this environment is used
just for running things for development. This results in several changes
to deployed assets:
* All assets will get '[dev]' in their name and will get a 'dev' tag
* All assets will be hidden from the list of assets (future work; e.g.
for jobs we would have a special job_type that hides it from the list)
* All deployed assets will be ephemeral (future work, we need some form
of garbage collection)
* Pipelines will be marked as 'development: true'
* Jobs can run on development compute through the `--compute` parameter
in the CLI
* Jobs get their schedule / triggers paused
* Jobs get concurrent runs (it's really annoying if your runs get
skipped because the last run was still in progress)
Other accepted values for `mode` are `default` (which does nothing) and
`pull-request` (which is reserved for future use).
## CLI changes
To run a single job called "shark_sighting" on existing compute, use the
following commands:
```
$ databricks bundle deploy --compute 0617-201942-9yd9g8ix
$ databricks bundle run shark_sighting
```
which would deploy and run a job called "[dev] shark_sightings" on the
compute provided. Note that `--compute` is not accepted in production
environments, so we show an error if `mode: development` is not used.
The `run --deploy` command offers a convenient shorthand for the common
combination of deploying & running:
```
$ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix
$ bundle run --deploy shark_sightings
```
The `--deploy` addition isn't really essential and I welcome feedback 🤔
I played with the idea of a "debug" or "dev" command but that seemed to
only make the option space even broader for users. The above could work
well with an IDE or workspace that automatically sets the target
compute.
One more thing I added is`run --no-wait` can now be used to run
something without waiting for it to be completed (useful for IDE-like
environments that can display progress themselves).
```
$ bundle run --deploy shark_sightings --no-wait
```
## Changes
Two issues with this command:
* The command line arguments for the secret value were ignored
* If the secret value was piped through stdin, it would still prompt
The second issue prevented users from using multi-line strings because
the prompt reads until end-of-line.
This change adds testing infrastructure for:
* Setting up a workspace focused test (common between many tests)
* Running a snippet of Python through the command execution API
Porting more integration tests to use this infrastructure will be done
in later commits.
## Tests
New integration test passes.
The interactive path cannot be integration tested just yet.
## Changes
When there are positional required parameters in the command which can't
be unmarshalled from JSON, we should require them despite the fact
`--json` flag is provided.
The reason is that for some of the command, for example, `databricks
groups patch ID` these arguments are actually path arguments in API and
can't be set as part of `--json` body provided.
Original change which introduced this ignore logic is here:
https://github.com/databricks/cli/pull/405
Fixes https://github.com/databricks/cli/issues/533,
https://github.com/databricks/cli/issues/537
Note: Code generation is based on the change in this PR:
https://github.com/databricks/databricks-sdk-go/pull/536
## Tests
1. Running `cli groups patch 123 --json {...}` works correctly
Backward compatibility tests with previous changes from
https://github.com/databricks/cli/pull/405
1. `cli clusters events --json '{"cluster_id": "1029-xxxx"}'` - works,
returns list of events
2. `cli clusters events 1029-xxxx` - works, returns list of events
3. `cli clusters events` - works, first prompts for Cluster ID and then
returns the list of events
## Changes
Also see #525.
The direct download flag has been removed in newer versions because of
the content type issue.
Instead, we can make the command decode the base64 output when the
output mode is text.
```
$ databricks workspace export /some/path/script.sh
#!/bin/bash
echo "this is a script"
```
## Tests
New integration test.
## Changes
Adds the dbfs:/ prefix to paths output by the cp command so they can be
used with the CLI
## Tests
Manually
Currently there are no integration tests for command output, I'll add
them in separately
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
Added configure-cluster flag for auth login which will allow to
configure cluster ID and save it in Databricks profile
Note: the build will fail until this one is merged and released
https://github.com/databricks/databricks-sdk-go/pull/524
## Tests
```
andrew.nester@HFW9Y94129 cli % ./cli auth login https://xxxxxxx.databricks.com --configure-cluster
✔ Databricks Profile Name: my-profile█
Search: █
? Choose cluster:
10.1 ML beta (1029-yyyyy-xxxxxx)
10.5 ML standard cluster
12.2 LTS
↓ 13.1 free for all
andrew.nester@HFW9Y94129 cli % cat ~/.databrickscfg
[DEFAULT]
host = https://xxxxx.databricks.com
cluster_id = 1029-xxxxx-yyyyy
auth_type = databricks-cli
```
## Changes
Fixed jobs create command to only accept JSON payload.
Note: relies on this PR from Go SDK
https://github.com/databricks/databricks-sdk-go/pull/522
## Tests
```
andrew.nester@HFW9Y94129 cli % ./cli jobs create -h
Create a new job.
Create a new job.
Usage:
databricks jobs create [flags]
Flags:
-h, --help help for create
--json JSON either inline JSON string or @path/to/file.json with request body (default JSON (0 bytes))
Global Flags:
-e, --environment string bundle environment to use (if applicable)
--log-file file file to write logs to (default stderr)
--log-format type log output format (text or json) (default text)
--log-level format log level (default disabled)
-o, --output type output type: text or json (default text)
-p, --profile string ~/.databrickscfg profile
--progress-format format format for progress logs (append, inplace, json) (default default)
```
## Tests
New integration test for the read/write parts of the other filers. The
integration test cannot be shared just yet because the Files API doesn't
include support for creating/listing/removing directories yet.
## Changes
`--force` flag did not exist for `bundle destroy`. This PR adds that in.
## Tests
manually tested. Now adding the `--force` flag hijacks the deploy lock
on the target directory.
## Changes
Some of the command such as `databricks alerts create` require
positional arguments which are not primitive.
Since these arguments are required, we should correctly set ExactArgs
for such commands
Fixes#367
## Tests
Running `databricks alerts create`
Before
```
andrew.nester@HFW9Y94129 cli % ./cli alerts create
panic: runtime error: index out of range [0] with length 0
goroutine 1 [running]:
github.com/databricks/bricks/cmd/workspace/alerts.glob..func1(0x22a1280?, {0x2321638, 0x0, 0x0?})
github.com/databricks/bricks/cmd/workspace/alerts/alerts.go:57 +0x355
github.com/spf13/cobra.(*Command).execute(0x22a1280, {0x2321638, 0x0, 0x0})
github.com/spf13/cobra@v1.7.0/command.go:940 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0x22a0700)
github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3bd
github.com/spf13/cobra.(*Command).ExecuteContextC(...)
github.com/spf13/cobra@v1.7.0/command.go:1001
github.com/databricks/bricks/cmd/root.Execute()
github.com/databricks/bricks/cmd/root/root.go:80 +0x6a
main.main()
github.com/databricks/bricks/main.go:18 +0x17
```
After
```
andrew.nester@HFW9Y94129 cli % ./cli alerts create
Error: provide command input in JSON format by specifying --json option
```
Acceptance test
```
=== RUN TestAccAlertsCreateErrWhenNoArguments
alerts_test.go:10: gcp
helpers.go:147: Error running command: provide command input in JSON format by specifying --json option
--- PASS: TestAccAlertsCreateErrWhenNoArguments (1.99s)
PASS
```
## Changes
Disable shell completions for generated commands.
Default completion behavior completes local files which never makes
sense.
Automatic contextual completion of required arguments would be super
powerful but a lot of work to get right. Until then, we could do manual
completion functions in `overrides.go` as needed.
This fixes#374.
## Tests
Confirmed manually that commands no longer complete local files.
## Changes
With this change related commands show up next to each other in help
output.
The ordered list of groups is hard-coded until it can be derived from
the specification.
## Tests
Manually confirmed that the help output of the root command and the
account command list commands by their groups.
## Changes
This includes the following changes:
* Move profile loading code to libs/databrickscfg and add tests
* Update prompt label to reflect workspace/account profiles
* Start prompt in search mode by default
* Custom error if `~/.databrickscfg` doesn't exist
* Custom error if `~/.databrickscfg` doesn't contain profiles
* Use stderr for prompt so that stdout redirection works (e.g. with `jq` or `jless`)
## Tests
* New unit tests pass
* Manual tests for both workspace and account commands
* Search-by-default is really nice if you have many profiles
## Changes
This PR:
1. Adds the export-dir command
2. Changes filer.Read to return an error if a user tries to read a
directory
3. Adds returning internal file structures from filer.Stat().Sys()
## Tests
Integration tests and manually
## Changes
Some of the commands do not support prompts, for example `workspace
get-status` but we were wrongly suggesting customers some option.
Quick fix for this is not to provide prompts for these known commands.
Note: it uses a method from this PR in Go SDK
https://github.com/databricks/databricks-sdk-go/pull/416
## Tests
Running `workspace get-status`
Before
```
andrew.nester@HFW9Y94129 multiples-tasks % ../../cli/cli workspace get-status
Error: Path () doesn't start with '/'
```
After
```
andrew.nester@HFW9Y94129 multiples-tasks % ../../cli/cli workspace get-status
Error: accepts 1 arg(s), received 0
```
## Changes
Better error message if can not load prompts
## Tests
Setup 2 jobs with the same name and ran `cli job get`
```
andrew.nester@HFW9Y94129 multiples-tasks % ../../cli/cli jobs get
Error: failed to load names for Jobs drop-down. Please manually specify required arguments. Original error: duplicate .Settings.Name: duplicatejob
```
## Changes
Use cmdio in the version command such that it accepts the `--output` flag.
This removes the existing `--detail` flag which previously made the
command print JSON output.
## Tests
New integration test passes.
## Changes
Do not prompt for List methods
## Tests
Running
```
cli workspace list
```
Before
```
cli workspace list
Error: Path () doesn't start with '/'
```
After
```
cli workspace list
Error: accepts 1 arg(s), received 0
```
## Changes
Added support for `bundle.Seq`, simplified `Mutator.Apply` interface by
removing list of mutators from return values/
## Tests
1. Ran `cli bundle deploy` and interrupted it with Cmd + C mid execution
so lock is not released
2. Ran `cli bundle deploy` top make sure that CLI is not trying to
release lock when it fail to acquire it
```
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files!
^C
andrew.nester@HFW9Y94129 multiples-tasks % cli bundle deploy
Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-24 12:10:23.050343 +0200 CEST. Use --force to override
```
## Changes
With this PR, all of the command below print version and exit:
```
$ databricks -v
Databricks CLI v0.100.1-dev+4d3fa76
$ databricks --version
Databricks CLI v0.100.1-dev+4d3fa76
$ databricks version
Databricks CLI v0.100.1-dev+4d3fa76
```
## Tests
Added integration test for each flag or command.
## Changes
Rename all instances of "bricks" to "databricks".
## Tests
* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
## Changes
This PR now allows you to define variables in the bundle config and set
them in three ways
1. command line args
2. process environment variable
3. in the bundle config itself
## Tests
manually, unit, and black box tests
---------
Co-authored-by: Miles Yucht <miles@databricks.com>
This PR adds the following command groups:
## Workspace-level command groups
* `bricks alerts` - The alerts API can be used to perform CRUD operations on alerts.
* `bricks catalogs` - A catalog is the first layer of Unity Catalog’s three-level namespace.
* `bricks cluster-policies` - Cluster policy limits the ability to configure clusters based on a set of rules.
* `bricks clusters` - The Clusters API allows you to create, start, edit, list, terminate, and delete clusters.
* `bricks current-user` - This API allows retrieving information about currently authenticated user or service principal.
* `bricks dashboards` - In general, there is little need to modify dashboards using the API.
* `bricks data-sources` - This API is provided to assist you in making new query objects.
* `bricks experiments` - MLflow Experiment tracking.
* `bricks external-locations` - An external location is an object that combines a cloud storage path with a storage credential that authorizes access to the cloud storage path.
* `bricks functions` - Functions implement User-Defined Functions (UDFs) in Unity Catalog.
* `bricks git-credentials` - Registers personal access token for Databricks to do operations on behalf of the user.
* `bricks global-init-scripts` - The Global Init Scripts API enables Workspace administrators to configure global initialization scripts for their workspace.
* `bricks grants` - In Unity Catalog, data is secure by default.
* `bricks groups` - Groups simplify identity management, making it easier to assign access to Databricks Workspace, data, and other securable objects.
* `bricks instance-pools` - Instance Pools API are used to create, edit, delete and list instance pools by using ready-to-use cloud instances which reduces a cluster start and auto-scaling times.
* `bricks instance-profiles` - The Instance Profiles API allows admins to add, list, and remove instance profiles that users can launch clusters with.
* `bricks ip-access-lists` - IP Access List enables admins to configure IP access lists.
* `bricks jobs` - The Jobs API allows you to create, edit, and delete jobs.
* `bricks libraries` - The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster.
* `bricks metastores` - A metastore is the top-level container of objects in Unity Catalog.
* `bricks model-registry` - MLflow Model Registry commands.
* `bricks permissions` - Permissions API are used to create read, write, edit, update and manage access for various users on different objects and endpoints.
* `bricks pipelines` - The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines.
* `bricks policy-families` - View available policy families.
* `bricks providers` - Databricks Providers REST API.
* `bricks queries` - These endpoints are used for CRUD operations on query definitions.
* `bricks query-history` - Access the history of queries through SQL warehouses.
* `bricks recipient-activation` - Databricks Recipient Activation REST API.
* `bricks recipients` - Databricks Recipients REST API.
* `bricks repos` - The Repos API allows users to manage their git repos.
* `bricks schemas` - A schema (also called a database) is the second layer of Unity Catalog’s three-level namespace.
* `bricks secrets` - The Secrets API allows you to manage secrets, secret scopes, and access permissions.
* `bricks service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms.
* `bricks serving-endpoints` - The Serving Endpoints API allows you to create, update, and delete model serving endpoints.
* `bricks shares` - Databricks Shares REST API.
* `bricks storage-credentials` - A storage credential represents an authentication and authorization mechanism for accessing data stored on your cloud tenant.
* `bricks table-constraints` - Primary key and foreign key constraints encode relationships between fields in tables.
* `bricks tables` - A table resides in the third layer of Unity Catalog’s three-level namespace.
* `bricks token-management` - Enables administrators to get all tokens and delete tokens for other users.
* `bricks tokens` - The Token API allows you to create, list, and revoke tokens that can be used to authenticate and access Databricks REST APIs.
* `bricks users` - User identities recognized by Databricks and represented by email addresses.
* `bricks volumes` - Volumes are a Unity Catalog (UC) capability for accessing, storing, governing, organizing and processing files.
* `bricks warehouses` - A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL.
* `bricks workspace` - The Workspace API allows you to list, import, export, and delete notebooks and folders.
* `bricks workspace-conf` - This API allows updating known workspace settings for advanced users.
## Account-level command groups
* `bricks account billable-usage` - This API allows you to download billable usage logs for the specified account and date range.
* `bricks account budgets` - These APIs manage budget configuration including notifications for exceeding a budget for a period.
* `bricks account credentials` - These APIs manage credential configurations for this workspace.
* `bricks account custom-app-integration` - These APIs enable administrators to manage custom oauth app integrations, which is required for adding/using Custom OAuth App Integration like Tableau Cloud for Databricks in AWS cloud.
* `bricks account encryption-keys` - These APIs manage encryption key configurations for this workspace (optional).
* `bricks account groups` - Groups simplify identity management, making it easier to assign access to Databricks Account, data, and other securable objects.
* `bricks account ip-access-lists` - The Accounts IP Access List API enables account admins to configure IP access lists for access to the account console.
* `bricks account log-delivery` - These APIs manage log delivery configurations for this account.
* `bricks account metastore-assignments` - These APIs manage metastore assignments to a workspace.
* `bricks account metastores` - These APIs manage Unity Catalog metastores for an account.
* `bricks account networks` - These APIs manage network configurations for customer-managed VPCs (optional).
* `bricks account o-auth-enrollment` - These APIs enable administrators to enroll OAuth for their accounts, which is required for adding/using any OAuth published/custom application integration.
* `bricks account private-access` - These APIs manage private access settings for this account.
* `bricks account published-app-integration` - These APIs enable administrators to manage published oauth app integrations, which is required for adding/using Published OAuth App Integration like Tableau Cloud for Databricks in AWS cloud.
* `bricks account service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms.
* `bricks account storage` - These APIs manage storage configurations for this workspace.
* `bricks account storage-credentials` - These APIs manage storage credentials for a particular metastore.
* `bricks account users` - User identities recognized by Databricks and represented by email addresses.
* `bricks account vpc-endpoints` - These APIs manage VPC endpoint configurations for this account.
* `bricks account workspace-assignment` - The Workspace Permission Assignment API allows you to manage workspace permissions for principals in your account.
* `bricks account workspaces` - These APIs manage workspaces for this account.
## Changes
<!-- Summary of your changes that are easy to understand -->
1. Log os.Args and bricks version before every command execution
2. After a command execution, logs the error and exit code
## Tests
<!-- How is this tested? -->
Manually,
case 1: Run `bricks version` successfully
```
shreyas.goenka@THW32HFW6T bricks % bricks version --log-level=info --log-file stderr
time=2023-04-12T00:15:04.011+02:00 level=INFO source=root.go:34 msg="process args: [bricks, version, --log-level=info, --log-file, stderr]"
time=2023-04-12T00:15:04.011+02:00 level=INFO source=root.go:35 msg="version: 0.0.0-dev+375eb1c50283"
0.0.0-dev+375eb1c50283
time=2023-04-12T00:15:04.011+02:00 level=INFO source=root.go:68 msg="exit code: 0"
```
case 2: Run `bricks bundle deploy` in a working dir where `bundle.yml`
does not exist
```
shreyas.goenka@THW32HFW6T bricks % bricks bundle deploy --log-level=info --log-file=stderr
time=2023-04-12T00:19:16.783+02:00 level=INFO source=root.go:34 msg="process args: [bricks, bundle, deploy, --log-level=info, --log-file=stderr]"
time=2023-04-12T00:19:16.784+02:00 level=INFO source=root.go:35 msg="version: 0.0.0-dev+375eb1c50283"
Error: unable to locate bundle root: bundle.yml not found
time=2023-04-12T00:19:16.784+02:00 level=ERROR source=root.go:64 msg="unable to locate bundle root: bundle.yml not found"
time=2023-04-12T00:19:16.784+02:00 level=ERROR source=root.go:65 msg="exit code: 1"
```
## Changes
These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify.
**Note:** this is a breaking change. Downstream usage of these fields must be updated.
## Tests
Existing tests pass.
## Changes
<!-- Summary of your changes that are easy to understand -->
This PR removes the project package and it's dependents in the bricks
repo
## Tests
<!-- How is this tested? -->
## Changes
Pull state before deploying and push state after deploying.
Note: the run command was missing mutators to initialize Terraform. This
is necessary if the cache directory is removed between running "deploy"
and "run" (which is valid now that we synchronize state).
## Tests
Manually.
Add configuration:
```
bundle:
lock:
enabled: true
force: false
```
The force field can be set by passing the `--force` argument to `bricks
bundle deploy`. Doing so means the deployment lock is acquired even if
it is currently held. This should only be used in exceptional cases
(e.g. a previous deployment has failed to release the lock).
New global flags:
* `--log-file FILE`: can be literal `stdout`, `stderr`, or a file name (default `stderr`)
* `--log-level LEVEL`: can be `error`, `warn`, `info`, `debug`, `trace`, or `disabled` (default `disabled`)
* `--log-format TYPE`: can be `text` or `json` (default `text`)
New functions in the `log` package take a `context.Context` and retrieve
the logger from said context.
Because we carry the logger in a context, adding
[attributes](https://pkg.go.dev/golang.org/x/exp/slog#hdr-Attrs_and_Values)
to the logger can be done as follows:
```go
ctx = log.NewContext(ctx, log.GetLogger(ctx).With("foo", "bar"))
```
This PR:
1. Adds autogeneration of descriptions for `resources` field
2. Autogenerates empty descriptions for any properties in DABs
3. Defines SOPs for how to refresh these descriptions
4. Adds command to generate this documentation
5. Adds Automatically copy any descriptions over to `environments`
property
Basically it provides a framework for adding descriptions to the
generated JSON schema
Tested manually and using unit tests
JSON output makes it easy to process synchronization progress
information in downstream tools (e.g. the vscode extension).
This changes introduces a `sync.Event` interface type for progress events as
well as an `sync.EventNotifier` that lets the sync code pass along
progress events to calling code.
Example output in text mode (default, this uses the existing logger calls):
```text
2023/03/03 14:07:17 [INFO] Remote file sync location: /Repos/pieter.noordhuis@databricks.com/...
2023/03/03 14:07:18 [INFO] Initial Sync Complete
2023/03/03 14:07:22 [INFO] Action: PUT: foo
2023/03/03 14:07:23 [INFO] Uploaded foo
2023/03/03 14:07:23 [INFO] Complete
2023/03/03 14:07:25 [INFO] Action: DELETE: foo
2023/03/03 14:07:25 [INFO] Deleted foo
2023/03/03 14:07:25 [INFO] Complete
```
Example output in JSON mode:
```json
{"timestamp":"2023-03-03T14:08:15.459439+01:00","seq":0,"type":"start"}
{"timestamp":"2023-03-03T14:08:15.459461+01:00","seq":0,"type":"complete"}
{"timestamp":"2023-03-03T14:08:18.459821+01:00","seq":1,"type":"start","put":["foo"]}
{"timestamp":"2023-03-03T14:08:18.459867+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:19.418696+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:19.421397+01:00","seq":1,"type":"complete","put":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459238+01:00","seq":2,"type":"start","delete":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459268+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:22.686413+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:22.688989+01:00","seq":2,"type":"complete","delete":["foo"]}
```
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
Invoke with `bricks sync SRC DST`.
In bundle context `SRC` and `DST` arguments are taken from bundle configuration.
This PR adds `bricks bundle sync` to disambiguate between the two.
Once the VS Code extension is bundle aware they can again be consolidated.
Consolidating them today would regress the VS Code experience if a
`bundle.yml` file is present in the file tree.
Example when called from vscode (and everything is hooked up):
```
> * User-Agent: bricks/0.0.21-devel databricks-sdk-go/0.2.0 go/1.19.4 os/darwin upstream/databricks-vscode
```
This configures the user agent with the bricks version and the name of
the command being executed.
Example user agent value:
```
> * User-Agent: bricks/0.0.21-devel databricks-sdk-go/0.2.0 go/1.19.4 os/darwin cmd/sync auth/pat
```
This is a follow up for #194.
Includes relevant fields listed on
https://goreleaser.com/customization/templates/ into build artifacts.
The version command outputs the version by default:
```
$ bricks version
0.0.21-devel
```
Or all build information if `--json` is specified:
```
$ bricks version --json
{
"ProjectName": "bricks",
"Version": "0.0.21-devel",
"Branch": "version-info",
"Tag": "v0.0.20",
"ShortCommit": "193b56b",
"FullCommit": "193b56b0929128c0836d35e913c46fd66fa2a93c",
"CommitTime": "2023-02-02T22:04:42+01:00",
"Summary": "v0.0.20-5-g193b56b",
"Major": 0,
"Minor": 0,
"Patch": 20,
"Prerelease": "",
"IsSnapshot": true,
"BuildTime": "2023-02-02T22:07:36+01:00"
}
```
We intend to let non-bundle commands use bundle configuration for their
operating context (workspace, auth, default cluster, etc).
As such, all commands must first try to load a bundle configuration.
If there is no bundle they can fall back on taking their operating
context from command line flags and the environment.
This is on top of #180.
By default the command runs an incremental, one-time sync, similar to the
behavior of rsync. The `--persist-snapshot` flag has been removed and the
command now always saves a synchronization snapshot.
* Add `--full` flag to force full synchronization
* Add `--watch` flag to run continuously and watch the local file system for changes
This builds on #176.
This change also adds testcases for checking if the specified path is
nested under the valid base paths and fixes an edge case where the user
could synchronize into their home directory directly.
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
The code depended on the project package for:
* git.FileSet in the watchdog
* project.CacheDir to determine snapshot path
These dependencies are now denormalized in the SyncOptions struct.
Follow up for #173.
With this change:
* Paths under `/Workspace/<me>` and `/Repos/<me>` are allowed
* The sync destination is checked to be either a directory or a repository
* If it is under `/Repos` and doesn't exist, the command returns an error
This PR:
1. Refactors the sync integration tests to make them more readable
2. Adds additional tests for edge cases we encountered during vscode
runs
3. Intensional side effect: sync integration tests are also green on
windows (see
https://github.com/databricks/eng-dev-ecosystem/actions/runs/3817365642/jobs/6493576727)
Change in coverage
- We now test for python notebook <-> python file interconversion and
python notebook deletion being synced to workspace
- Tests are split up and are more focused on testing specific edge cases
Tested by running the unit and integration tests locally
Tested manually on windows
Screenshot from windows sync logs indicating that the correct slashed
for paths were used:
<img width="623" alt="Screenshot 2022-12-21 at 9 09 13 PM"
src="https://user-images.githubusercontent.com/88374338/208943937-146670b2-1afd-4e0b-8f4e-6091c8c7e17a.png">
@pietern with this the state machine for syncing becomes slightly more
complicated, indicating a stronger need for a tree based approach herre
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
If the environment is not set through command line argument or
environment variable, the bundle loads either 1) the only environment,
2) the only environment with the default flag set.
This PR:
- Implements safeguards for not accidentally/maliciously deleting repos
by sanitizing relative paths
- Adds versioning for snapshot schemas to allow invalidation if needed
- Adds logic to delete preexisting remote artifacts that might not have
been cleaned up properly if they conflict with an upload
- A bunch of tests for the changes here
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Unit tests are now run in all three big OS.
Some of the changes are to make the tests green for windows while we are
skipping some of the other tests on windows/macOS to make the tests
pass. This is a temporary measure and we will incrementally migrate
these tests over so there is parity in unit testing along all three
environments!
While working on artifact upload and workspace interrogation I realized
this mutator interface needs to:
1. Operate at the whole bundle level so it can apply to both
configuration and internal state
2. Include a `context.Context` parameter for a) long running operations
and b) progress reporting
Previous interface:
```
Apply(*config.Root) ([]Mutator, error)
```
New interface:
```
Apply(context.Context, *Bundle) ([]Mutator, error)
```
This PR introduces tracking of remote names and local names of files in snapshots to disambiguate between files which might have the same remote name and handle clean deleting of files whose remote name changes due (eg. python notebook getting converted to a python notebook)
Used to inspect the bundle configuration after loading and merging all
files.
Once we add variable interpolation this command could show the result
after interpolation as well.
Each of the mutations to this configuration is observable, so we could
add a mode that writes each of the intermediate versions to disk for
even more fine grained introspection.
This PR does multiple things, which are:
1. Creates .databricks dir according to outcomes concluded in "bricks
configuration principles"
2. Puts the sync snapshots into a file whose names is tagged with
md5(concat(host, remote-path))
3. Saves both host and username in the bricks snapshot for debuggability
Tested manually:
https://user-images.githubusercontent.com/88374338/195672267-9dd90230-570f-49b7-847f-05a5a6fd8986.mov
Not settled whether this should live as a top level command or hidden
under some debug scope. Either way, the ability to make arbitrary API
calls and leverage unified auth is a super useful tool.
Tested manually
We are adding this flag because the default bricks sync is not robust
against changing the profile and other project config changes. This will
be used in the initial version of the vscode extention