This PR adds the following command groups:
## Workspace-level command groups
* `bricks alerts` - The alerts API can be used to perform CRUD operations on alerts.
* `bricks catalogs` - A catalog is the first layer of Unity Catalog’s three-level namespace.
* `bricks cluster-policies` - Cluster policy limits the ability to configure clusters based on a set of rules.
* `bricks clusters` - The Clusters API allows you to create, start, edit, list, terminate, and delete clusters.
* `bricks current-user` - This API allows retrieving information about currently authenticated user or service principal.
* `bricks dashboards` - In general, there is little need to modify dashboards using the API.
* `bricks data-sources` - This API is provided to assist you in making new query objects.
* `bricks experiments` - MLflow Experiment tracking.
* `bricks external-locations` - An external location is an object that combines a cloud storage path with a storage credential that authorizes access to the cloud storage path.
* `bricks functions` - Functions implement User-Defined Functions (UDFs) in Unity Catalog.
* `bricks git-credentials` - Registers personal access token for Databricks to do operations on behalf of the user.
* `bricks global-init-scripts` - The Global Init Scripts API enables Workspace administrators to configure global initialization scripts for their workspace.
* `bricks grants` - In Unity Catalog, data is secure by default.
* `bricks groups` - Groups simplify identity management, making it easier to assign access to Databricks Workspace, data, and other securable objects.
* `bricks instance-pools` - Instance Pools API are used to create, edit, delete and list instance pools by using ready-to-use cloud instances which reduces a cluster start and auto-scaling times.
* `bricks instance-profiles` - The Instance Profiles API allows admins to add, list, and remove instance profiles that users can launch clusters with.
* `bricks ip-access-lists` - IP Access List enables admins to configure IP access lists.
* `bricks jobs` - The Jobs API allows you to create, edit, and delete jobs.
* `bricks libraries` - The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster.
* `bricks metastores` - A metastore is the top-level container of objects in Unity Catalog.
* `bricks model-registry` - MLflow Model Registry commands.
* `bricks permissions` - Permissions API are used to create read, write, edit, update and manage access for various users on different objects and endpoints.
* `bricks pipelines` - The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines.
* `bricks policy-families` - View available policy families.
* `bricks providers` - Databricks Providers REST API.
* `bricks queries` - These endpoints are used for CRUD operations on query definitions.
* `bricks query-history` - Access the history of queries through SQL warehouses.
* `bricks recipient-activation` - Databricks Recipient Activation REST API.
* `bricks recipients` - Databricks Recipients REST API.
* `bricks repos` - The Repos API allows users to manage their git repos.
* `bricks schemas` - A schema (also called a database) is the second layer of Unity Catalog’s three-level namespace.
* `bricks secrets` - The Secrets API allows you to manage secrets, secret scopes, and access permissions.
* `bricks service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms.
* `bricks serving-endpoints` - The Serving Endpoints API allows you to create, update, and delete model serving endpoints.
* `bricks shares` - Databricks Shares REST API.
* `bricks storage-credentials` - A storage credential represents an authentication and authorization mechanism for accessing data stored on your cloud tenant.
* `bricks table-constraints` - Primary key and foreign key constraints encode relationships between fields in tables.
* `bricks tables` - A table resides in the third layer of Unity Catalog’s three-level namespace.
* `bricks token-management` - Enables administrators to get all tokens and delete tokens for other users.
* `bricks tokens` - The Token API allows you to create, list, and revoke tokens that can be used to authenticate and access Databricks REST APIs.
* `bricks users` - User identities recognized by Databricks and represented by email addresses.
* `bricks volumes` - Volumes are a Unity Catalog (UC) capability for accessing, storing, governing, organizing and processing files.
* `bricks warehouses` - A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL.
* `bricks workspace` - The Workspace API allows you to list, import, export, and delete notebooks and folders.
* `bricks workspace-conf` - This API allows updating known workspace settings for advanced users.
## Account-level command groups
* `bricks account billable-usage` - This API allows you to download billable usage logs for the specified account and date range.
* `bricks account budgets` - These APIs manage budget configuration including notifications for exceeding a budget for a period.
* `bricks account credentials` - These APIs manage credential configurations for this workspace.
* `bricks account custom-app-integration` - These APIs enable administrators to manage custom oauth app integrations, which is required for adding/using Custom OAuth App Integration like Tableau Cloud for Databricks in AWS cloud.
* `bricks account encryption-keys` - These APIs manage encryption key configurations for this workspace (optional).
* `bricks account groups` - Groups simplify identity management, making it easier to assign access to Databricks Account, data, and other securable objects.
* `bricks account ip-access-lists` - The Accounts IP Access List API enables account admins to configure IP access lists for access to the account console.
* `bricks account log-delivery` - These APIs manage log delivery configurations for this account.
* `bricks account metastore-assignments` - These APIs manage metastore assignments to a workspace.
* `bricks account metastores` - These APIs manage Unity Catalog metastores for an account.
* `bricks account networks` - These APIs manage network configurations for customer-managed VPCs (optional).
* `bricks account o-auth-enrollment` - These APIs enable administrators to enroll OAuth for their accounts, which is required for adding/using any OAuth published/custom application integration.
* `bricks account private-access` - These APIs manage private access settings for this account.
* `bricks account published-app-integration` - These APIs enable administrators to manage published oauth app integrations, which is required for adding/using Published OAuth App Integration like Tableau Cloud for Databricks in AWS cloud.
* `bricks account service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms.
* `bricks account storage` - These APIs manage storage configurations for this workspace.
* `bricks account storage-credentials` - These APIs manage storage credentials for a particular metastore.
* `bricks account users` - User identities recognized by Databricks and represented by email addresses.
* `bricks account vpc-endpoints` - These APIs manage VPC endpoint configurations for this account.
* `bricks account workspace-assignment` - The Workspace Permission Assignment API allows you to manage workspace permissions for principals in your account.
* `bricks account workspaces` - These APIs manage workspaces for this account.
## Changes
<!-- Summary of your changes that are easy to understand -->
1. Log os.Args and bricks version before every command execution
2. After a command execution, logs the error and exit code
## Tests
<!-- How is this tested? -->
Manually,
case 1: Run `bricks version` successfully
```
shreyas.goenka@THW32HFW6T bricks % bricks version --log-level=info --log-file stderr
time=2023-04-12T00:15:04.011+02:00 level=INFO source=root.go:34 msg="process args: [bricks, version, --log-level=info, --log-file, stderr]"
time=2023-04-12T00:15:04.011+02:00 level=INFO source=root.go:35 msg="version: 0.0.0-dev+375eb1c50283"
0.0.0-dev+375eb1c50283
time=2023-04-12T00:15:04.011+02:00 level=INFO source=root.go:68 msg="exit code: 0"
```
case 2: Run `bricks bundle deploy` in a working dir where `bundle.yml`
does not exist
```
shreyas.goenka@THW32HFW6T bricks % bricks bundle deploy --log-level=info --log-file=stderr
time=2023-04-12T00:19:16.783+02:00 level=INFO source=root.go:34 msg="process args: [bricks, bundle, deploy, --log-level=info, --log-file=stderr]"
time=2023-04-12T00:19:16.784+02:00 level=INFO source=root.go:35 msg="version: 0.0.0-dev+375eb1c50283"
Error: unable to locate bundle root: bundle.yml not found
time=2023-04-12T00:19:16.784+02:00 level=ERROR source=root.go:64 msg="unable to locate bundle root: bundle.yml not found"
time=2023-04-12T00:19:16.784+02:00 level=ERROR source=root.go:65 msg="exit code: 1"
```
## Changes
These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify.
**Note:** this is a breaking change. Downstream usage of these fields must be updated.
## Tests
Existing tests pass.
## Changes
<!-- Summary of your changes that are easy to understand -->
This PR removes the project package and it's dependents in the bricks
repo
## Tests
<!-- How is this tested? -->
## Changes
Pull state before deploying and push state after deploying.
Note: the run command was missing mutators to initialize Terraform. This
is necessary if the cache directory is removed between running "deploy"
and "run" (which is valid now that we synchronize state).
## Tests
Manually.
Add configuration:
```
bundle:
lock:
enabled: true
force: false
```
The force field can be set by passing the `--force` argument to `bricks
bundle deploy`. Doing so means the deployment lock is acquired even if
it is currently held. This should only be used in exceptional cases
(e.g. a previous deployment has failed to release the lock).
New global flags:
* `--log-file FILE`: can be literal `stdout`, `stderr`, or a file name (default `stderr`)
* `--log-level LEVEL`: can be `error`, `warn`, `info`, `debug`, `trace`, or `disabled` (default `disabled`)
* `--log-format TYPE`: can be `text` or `json` (default `text`)
New functions in the `log` package take a `context.Context` and retrieve
the logger from said context.
Because we carry the logger in a context, adding
[attributes](https://pkg.go.dev/golang.org/x/exp/slog#hdr-Attrs_and_Values)
to the logger can be done as follows:
```go
ctx = log.NewContext(ctx, log.GetLogger(ctx).With("foo", "bar"))
```
This PR:
1. Adds autogeneration of descriptions for `resources` field
2. Autogenerates empty descriptions for any properties in DABs
3. Defines SOPs for how to refresh these descriptions
4. Adds command to generate this documentation
5. Adds Automatically copy any descriptions over to `environments`
property
Basically it provides a framework for adding descriptions to the
generated JSON schema
Tested manually and using unit tests
JSON output makes it easy to process synchronization progress
information in downstream tools (e.g. the vscode extension).
This changes introduces a `sync.Event` interface type for progress events as
well as an `sync.EventNotifier` that lets the sync code pass along
progress events to calling code.
Example output in text mode (default, this uses the existing logger calls):
```text
2023/03/03 14:07:17 [INFO] Remote file sync location: /Repos/pieter.noordhuis@databricks.com/...
2023/03/03 14:07:18 [INFO] Initial Sync Complete
2023/03/03 14:07:22 [INFO] Action: PUT: foo
2023/03/03 14:07:23 [INFO] Uploaded foo
2023/03/03 14:07:23 [INFO] Complete
2023/03/03 14:07:25 [INFO] Action: DELETE: foo
2023/03/03 14:07:25 [INFO] Deleted foo
2023/03/03 14:07:25 [INFO] Complete
```
Example output in JSON mode:
```json
{"timestamp":"2023-03-03T14:08:15.459439+01:00","seq":0,"type":"start"}
{"timestamp":"2023-03-03T14:08:15.459461+01:00","seq":0,"type":"complete"}
{"timestamp":"2023-03-03T14:08:18.459821+01:00","seq":1,"type":"start","put":["foo"]}
{"timestamp":"2023-03-03T14:08:18.459867+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:19.418696+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:19.421397+01:00","seq":1,"type":"complete","put":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459238+01:00","seq":2,"type":"start","delete":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459268+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:22.686413+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:22.688989+01:00","seq":2,"type":"complete","delete":["foo"]}
```
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
Invoke with `bricks sync SRC DST`.
In bundle context `SRC` and `DST` arguments are taken from bundle configuration.
This PR adds `bricks bundle sync` to disambiguate between the two.
Once the VS Code extension is bundle aware they can again be consolidated.
Consolidating them today would regress the VS Code experience if a
`bundle.yml` file is present in the file tree.
Example when called from vscode (and everything is hooked up):
```
> * User-Agent: bricks/0.0.21-devel databricks-sdk-go/0.2.0 go/1.19.4 os/darwin upstream/databricks-vscode
```
This configures the user agent with the bricks version and the name of
the command being executed.
Example user agent value:
```
> * User-Agent: bricks/0.0.21-devel databricks-sdk-go/0.2.0 go/1.19.4 os/darwin cmd/sync auth/pat
```
This is a follow up for #194.
Includes relevant fields listed on
https://goreleaser.com/customization/templates/ into build artifacts.
The version command outputs the version by default:
```
$ bricks version
0.0.21-devel
```
Or all build information if `--json` is specified:
```
$ bricks version --json
{
"ProjectName": "bricks",
"Version": "0.0.21-devel",
"Branch": "version-info",
"Tag": "v0.0.20",
"ShortCommit": "193b56b",
"FullCommit": "193b56b0929128c0836d35e913c46fd66fa2a93c",
"CommitTime": "2023-02-02T22:04:42+01:00",
"Summary": "v0.0.20-5-g193b56b",
"Major": 0,
"Minor": 0,
"Patch": 20,
"Prerelease": "",
"IsSnapshot": true,
"BuildTime": "2023-02-02T22:07:36+01:00"
}
```
We intend to let non-bundle commands use bundle configuration for their
operating context (workspace, auth, default cluster, etc).
As such, all commands must first try to load a bundle configuration.
If there is no bundle they can fall back on taking their operating
context from command line flags and the environment.
This is on top of #180.
By default the command runs an incremental, one-time sync, similar to the
behavior of rsync. The `--persist-snapshot` flag has been removed and the
command now always saves a synchronization snapshot.
* Add `--full` flag to force full synchronization
* Add `--watch` flag to run continuously and watch the local file system for changes
This builds on #176.
This change also adds testcases for checking if the specified path is
nested under the valid base paths and fixes an edge case where the user
could synchronize into their home directory directly.
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
The code depended on the project package for:
* git.FileSet in the watchdog
* project.CacheDir to determine snapshot path
These dependencies are now denormalized in the SyncOptions struct.
Follow up for #173.
With this change:
* Paths under `/Workspace/<me>` and `/Repos/<me>` are allowed
* The sync destination is checked to be either a directory or a repository
* If it is under `/Repos` and doesn't exist, the command returns an error
This PR:
1. Refactors the sync integration tests to make them more readable
2. Adds additional tests for edge cases we encountered during vscode
runs
3. Intensional side effect: sync integration tests are also green on
windows (see
https://github.com/databricks/eng-dev-ecosystem/actions/runs/3817365642/jobs/6493576727)
Change in coverage
- We now test for python notebook <-> python file interconversion and
python notebook deletion being synced to workspace
- Tests are split up and are more focused on testing specific edge cases
Tested by running the unit and integration tests locally
Tested manually on windows
Screenshot from windows sync logs indicating that the correct slashed
for paths were used:
<img width="623" alt="Screenshot 2022-12-21 at 9 09 13 PM"
src="https://user-images.githubusercontent.com/88374338/208943937-146670b2-1afd-4e0b-8f4e-6091c8c7e17a.png">
@pietern with this the state machine for syncing becomes slightly more
complicated, indicating a stronger need for a tree based approach herre
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
If the environment is not set through command line argument or
environment variable, the bundle loads either 1) the only environment,
2) the only environment with the default flag set.
This PR:
- Implements safeguards for not accidentally/maliciously deleting repos
by sanitizing relative paths
- Adds versioning for snapshot schemas to allow invalidation if needed
- Adds logic to delete preexisting remote artifacts that might not have
been cleaned up properly if they conflict with an upload
- A bunch of tests for the changes here
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Unit tests are now run in all three big OS.
Some of the changes are to make the tests green for windows while we are
skipping some of the other tests on windows/macOS to make the tests
pass. This is a temporary measure and we will incrementally migrate
these tests over so there is parity in unit testing along all three
environments!
While working on artifact upload and workspace interrogation I realized
this mutator interface needs to:
1. Operate at the whole bundle level so it can apply to both
configuration and internal state
2. Include a `context.Context` parameter for a) long running operations
and b) progress reporting
Previous interface:
```
Apply(*config.Root) ([]Mutator, error)
```
New interface:
```
Apply(context.Context, *Bundle) ([]Mutator, error)
```
This PR introduces tracking of remote names and local names of files in snapshots to disambiguate between files which might have the same remote name and handle clean deleting of files whose remote name changes due (eg. python notebook getting converted to a python notebook)
Used to inspect the bundle configuration after loading and merging all
files.
Once we add variable interpolation this command could show the result
after interpolation as well.
Each of the mutations to this configuration is observable, so we could
add a mode that writes each of the intermediate versions to disk for
even more fine grained introspection.
This PR does multiple things, which are:
1. Creates .databricks dir according to outcomes concluded in "bricks
configuration principles"
2. Puts the sync snapshots into a file whose names is tagged with
md5(concat(host, remote-path))
3. Saves both host and username in the bricks snapshot for debuggability
Tested manually:
https://user-images.githubusercontent.com/88374338/195672267-9dd90230-570f-49b7-847f-05a5a6fd8986.mov
Not settled whether this should live as a top level command or hidden
under some debug scope. Either way, the ability to make arbitrary API
calls and leverage unified auth is a super useful tool.
Tested manually
We are adding this flag because the default bricks sync is not robust
against changing the profile and other project config changes. This will
be used in the initial version of the vscode extention
This PR:
1. Replaces scim.Me call to use the go SDK instead of the terraform
client
2. Removes terraform client from bricks project
Tested manually that the scim.Me call works now and returns the correct
user
go build works
By:
* Add .gitkeep to retain test fixture directories under
./python/testdata
* Move GitHub related functionality to ./experimental (it is not in use)
* Comment out test in ./cmd/sync
* Fix test in ./git
Unexported fields are skipped during marshalling, so this would be a
nop.
The code in `./cmd/sync/github*.go` is currently unused.
Confirmed that staticcheck passes when run with:
```
staticcheck -checks SA9005 ./...
```
Functionality from `io/ioutil` has moved to the `io` and `os` packages
in go1.16 ([reference](https://pkg.go.dev/io/ioutil)).
Confirmed that staticcheck passes when run with:
```
staticcheck -checks SA1019 ./...
```