databricks-cli

Commit Graph

Author	SHA1	Message	Date
shreyas-goenka	30efe91c6d	Make local files default for fs commands (#506 )	2023-06-23 16:07:09 +02:00
shreyas-goenka	d0e9953ad9	Use direct download for workspace filer read (#514 ) ## Changes Use the download method from the SDK in the read method for the WSFS implementation of the filer interface. Closes #452. ## Tests Tested by existing integration tests	2023-06-23 15:17:39 +02:00
shreyas-goenka	f9260125aa	Remove extra call to filer.Stat in dbfs filer.Read (#515 ) ## Changes This PR removes the stat call and instead relies on errors returned by the go SDK to return the appropriate errors ## Tests Tested using existing filer integration tests	2023-06-23 15:08:22 +02:00
Pieter Noordhuis	ae13135fc6	Follow up to #513 (#521 ) This reverts changes from #513 which ended up not being necessary.	2023-06-23 12:51:31 +02:00
Andrew Nester	d23d4aef3a	Fixed error on multiple profiles and failure to create a new profile with configured cluster (#513 )	2023-06-22 17:20:10 +02:00
shreyas-goenka	f2a2d058d1	Remove \r from new line print statements (#509 ) ## Changes Removes the carriage return character from new line prints for json output mode and sync events ## Tests Manually	2023-06-22 13:47:52 +02:00
Andrew Nester	7db1990c56	Added prompts for Databricks profile for auth login command (#502 ) ## Changes Added prompts for Databricks profile for auth login command ## Tests ``` andrew.nester@HFW9Y94129 cli % ./cli auth login https://xxxx-databricks.com ✔ Databricks Profile Name: my-profile█ ```	2023-06-21 12:58:28 +02:00
Pieter Noordhuis	e19eaca4d1	Add filer.Filer implementation backed by the Files API (#474 ) ## Tests New integration test for the read/write parts of the other filers. The integration test cannot be shared just yet because the Files API doesn't include support for creating/listing/removing directories yet.	2023-06-19 18:29:13 +00:00
shreyas-goenka	5d036ab6b8	Fix locker unlock for destroy (#492 ) ## Changes Adds ability for allowing unlock to succeed even if the deploy file is missing. ## Tests Using integration tests and manually	2023-06-19 15:57:25 +02:00
shreyas-goenka	bb32067a80	Add fs cp command (#463 ) ## Tests Tested using integration tests	2023-06-16 17:09:08 +02:00
shreyas-goenka	de47cf19f1	Use better error assertions and clean up locker API (#490 ) ## Changes Some cleanup work ## Tests Locker integration test passes	2023-06-16 16:29:04 +02:00
Pieter Noordhuis	b9406efd27	Update configure command (#482 ) ## Changes This now uses: * libs/cmdio to determine interactivity and perform prompting * libs/databrickscfg to persist the profile It loads a config.Config structure from the environment just like we do for unified authentication. It is therefore possible to specify both the host and token with environment variables. ## Tests ``` pieter.noordhuis@L4GHXDT29P /tmp % export DATABRICKS_CONFIG_FILE=.databrickscfg pieter.noordhuis@L4GHXDT29P /tmp % databricks configure Databricks Host: https://foo.bar Personal Access Token: *** pieter.noordhuis@L4GHXDT29P /tmp % cat .databrickscfg [DEFAULT] host = https://foo.bar token = token pieter.noordhuis@L4GHXDT29P /tmp % echo token | databricks configure Error: host must be set in non-interactive mode pieter.noordhuis@L4GHXDT29P /tmp % echo token | databricks configure --host foo Error: must start with https:// pieter.noordhuis@L4GHXDT29P /tmp % echo token | databricks configure --host https://foo pieter.noordhuis@L4GHXDT29P /tmp % cat .databrickscfg [DEFAULT] host = https://foo token = token pieter.noordhuis@L4GHXDT29P /tmp % cat .databrickscfg pieter.noordhuis@L4GHXDT29P /tmp % databricks configure --host https://foo Personal Access Token: **** pieter.noordhuis@L4GHXDT29P /tmp % cat .databrickscfg [DEFAULT] host = https://foo token = token2 ```	2023-06-15 12:50:19 +00:00
shreyas-goenka	3d03df8844	Add mode default as a valid set value for progress-format flag (#472 ) Manually tested `progress-format` works fine for all three modes	2023-06-14 13:26:56 +02:00
Pieter Noordhuis	a17876480a	Include [DEFAULT] section header when writing ~/.databrickscfg (#464 ) ## Changes The ini library omits the default section header and in doing so breaks compatibility with Python's config parser. It raises: ``` Error: MissingSectionHeaderError: File contains no section headers. ``` This commit makes sure the DEFAULT section header is included. If the config file doesn't include a DEFAULT section itself, we include a comment describing its purpose. ## Tests New tests pass. Manually confirmed the DEFAULT section header is included. --------- Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>	2023-06-13 16:41:56 +00:00
shreyas-goenka	d38649088c	Add workspace import-dir command (#456 ) ## Tests Testing using integration tests and manually	2023-06-12 21:03:46 +02:00
Pieter Noordhuis	960ce2e18e	Add implementation of filer.Filer for local filesystem (#460 ) ## Changes Local file reads on Windows require the file handle to be closed after using it. This commit includes an interface change to return an `io.ReadCloser` from `Read` to accommodate this. ## Tests The existing integration tests for the filer interface all pass.	2023-06-12 15:53:58 +02:00
Pieter Noordhuis	16bb224108	Add directory tracking to sync (#425 ) ## Changes This change replaces usage of the `repofiles` package with the `filer` package to consolidate WSFS code paths. The `repofiles` package implemented the following behavior. If a file at `foo/bar.txt` was created and removed, the directory `foo` was kept around because we do not perform directory tracking. If subsequently, a file at `foo` was created, it resulted in an `fs.ErrExist` because it is impossible to overwrite a directory. It would then perform a recursive delete of the path if this happened and retry the file write. To make this use case work without resorting to a recursive delete on conflict, we need to implement directory tracking as part of sync. The approach in this commit is as follows: 1. Maintain set of directories needed for current set of files. Compare to previous set of files. This results in mkdir of added directories and rmdir of removed directories. 2. Creation of new directories should happen prior to writing files. Otherwise, many file writes may race to create the same parent directories, resulting in additional API calls. Removal of existing directories should happen after removing files. 3. Making new directories can be deduped across common prefixes where only the longest prefix is created recursively. 4. Removing existing directories must happen sequentially, starting with the longest prefix. 5. Removal of directories is a best effort. It fails only if the directory is not empty, and if this happens we know something placed a file or directory manually, outside of sync. ## Tests * Existing integration tests pass (modified where it used to assert directories weren't cleaned up) * New integration test to confirm the inability to remove a directory doesn't fail the sync run	2023-06-12 11:44:00 +00:00
Pieter Noordhuis	e4415bfbcf	Tweak profile prompt (#454 ) ## Changes This includes the following changes: * Move profile loading code to libs/databrickscfg and add tests * Update prompt label to reflect workspace/account profiles * Start prompt in search mode by default * Custom error if `~/.databrickscfg` doesn't exist * Custom error if `~/.databrickscfg` doesn't contain profiles * Use stderr for prompt so that stdout redirection works (e.g. with `jq` or `jless`) ## Tests * New unit tests pass * Manual tests for both workspace and account commands * Search-by-default is really nice if you have many profiles	2023-06-09 13:56:35 +02:00
shreyas-goenka	4818541062	Add workspace export-dir command (#449 ) ## Changes This PR: 1. Adds the export-dir command 2. Changes filer.Read to return an error if a user tries to read a directory 3. Adds returning internal file structures from filer.Stat().Sys() ## Tests Integration tests and manually	2023-06-08 18:15:12 +02:00
shreyas-goenka	53164ae880	Add new line to cmdio JSON rendering (#443 ) ## Changes This PR adds a new line break to JSON rendering using cmdio. This is useful when we call `cmdio.Render` multiple times ## Tests Manually Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2023-06-08 15:48:51 +02:00
Pieter Noordhuis	be10ff9a75	Include recursive deletion in filer interface (#442 ) ## Changes This captures the recursive deletion of a directory tree in the filer interface. Prompted by #433. ## Tests Integration tests pass (ran the filer ones on AWS and Azure).	2023-06-06 06:27:47 +00:00
shreyas-goenka	ae10419eb8	Add fs cat command for dbfs files (#430 ) ## Changes TSIA ## Tests Manually and integration tests	2023-06-06 01:16:23 +02:00
shreyas-goenka	6ff00122ad	Add fs ls command for dbfs (#429 ) ## Changes 1. Adds fs ls command 2. Adds ability to define multiple templates ## Tests Manually and integration tests	2023-06-05 17:41:30 +02:00
Andrew Nester	1f130f3722	Do not use FgWhite and FgBlack for terminal output (#435 ) ## Changes Using white or black for terminal output leads to poorly displayed content on either light or dark terminal backgrounds. Some other CLIs experienced the same issue (https://github.com/qri-io/qri/pull/774). Instead, let's just use color to highlight some of the output so it's more compatible with different background styles ## Tests [Screenshot 2023-06-05 at 16 05 09] --- [Screenshot 2023-06-05 at 16 05 20]	2023-06-05 17:30:40 +02:00
Pieter Noordhuis	1c0d67f66c	Add fs.FS adapter for the filer interface (#422 ) ## Changes This enables the use of `io/fs` functions `fs.Glob` and `fs.WalkDir` with filers. We can't use `fs.FS` as the standard interface instead of `filer.Filer` because: 1. It was made for reading from filesystems only, not writing 2. It doesn't take a context for the core functions Therefore a wrapper will do. ## Tests * Added unit tests to cover the adapter through a fake filer. * Manually ran `fs.WalkDir` against both WSFS and DBFS filers.	2023-06-02 12:49:59 +00:00
Serge Smertin	a6c9533c1c	Add profile on `databricks auth login` (#423 ) ## Changes - added saving profile to `~/.databrickscfg` whenever we do `databricks auth login`. - we either match profile by account id / canonical host or introduce the new one from deployment name. - fail on multiple profiles with matching accounts or workspace hosts. - overriding `~/.databrickscfg` keeps the (valid) comments, but reformats the file. ## Tests <!-- How is this tested? --> - `make test` - `go run main.go auth login --account-id XXX --host https://accounts.cloud.databricks.com/` - `go run main.go auth token --account-id XXX --host https://accounts.cloud.databricks.com/` - `go run main.go auth login --host https://XXX.cloud.databricks.com/`	2023-06-02 13:49:39 +02:00
shreyas-goenka	91097856b5	Add check for path is a directory in filer.ReadDir (#426 ) ## Tests Integration tests	2023-06-02 12:28:35 +02:00
Pieter Noordhuis	2b56af6016	Add Stat function to filer.Filer interface (#421 ) ## Changes TSIA ## Tests New integration test passes.	2023-06-01 20:23:22 +02:00
Serge Smertin	24ebfdf31e	Add readable console logger (#370 ) ## Changes Add a readable colored console logger that is active only for TTYs: [image] ## Tests Run `go run main.go clusters list --log-level debug --profile demo`	2023-06-01 11:37:33 +02:00
Pieter Noordhuis	349e2aff40	Allow equivalence checking of filer errors to fs errors (#416 ) ## Changes The pattern `errors.Is(err, fs.ErrNotExist)` is common to check for an error type. Errors can implement `Is(error) bool` with a custom equivalence checker. ## Tests New asserts all pass in the integration test.	2023-05-31 20:47:00 +02:00
Pieter Noordhuis	42cd8daee0	Make filer.Filer return fs.DirEntry from ReadDir (#415 ) ## Changes This allows for compatibility with the stdlib functions in io/fs. ## Tests Integration tests pass.	2023-05-31 14:22:26 +02:00
Pieter Noordhuis	27df4e765c	Implement DBFS filer (#139 ) Adds a DBFS implementation of the `filer.Filer` interface. The integration tests are reused between the workspace filesystem and DBFS implementations to ensure identical behavior.	2023-05-31 13:24:20 +02:00
Pieter Noordhuis	92cb52041d	Add Mkdir and ReadDir functions to filer.Filer interface (#414 ) ## Changes This cherry-picks the filer changes from #408. ## Tests Manually ran integration tests.	2023-05-31 11:11:17 +02:00
Andrew Nester	05eaf7ff50	Added secrets input prompt for secrets put-secret command (#413 ) ## Changes Added secrets input prompt for secrets put-secret command ## Tests [Screenshot 2023-05-30 at 12 06 24]	2023-05-31 10:18:29 +02:00
Fabian Jakobs	aa85638070	Sync: Gracefully handle broken notebook files (#398 ) ## Changes Ignore broken notebook files during sync. Fixes https://github.com/databricks/databricks-vscode/issues/712	2023-05-23 12:10:15 +02:00
Pieter Noordhuis	8979ed1394	Fix tests for new repository name (#390 )	2023-05-16 19:02:07 +02:00
Pieter Noordhuis	98ebb78c9b	Rename bricks -> databricks (#389 ) ## Changes Rename all instances of "bricks" to "databricks". ## Tests * Confirmed the goreleaser build works, uses the correct new binary name, and produces the right archives. * Help output is confirmed to be correct. * Output of `git grep -w bricks` is minimal with a couple changes remaining for after the repository rename.	2023-05-16 18:35:39 +02:00
Andrew Nester	180dfc9a40	Added ability for deferred mutator execution (#380 ) ## Changes Added `DeferredMutator` and the `bundle.Defer` function, which makes it possible to always execute some mutators, either at the end of the execution chain or after an error occurs in the middle of it. Usage is as follows: ``` deferredMutator := bundle.Defer([]bundle.Mutator{ lock.Acquire(), transform.DoSomething(), //... }, []bundle.Mutator{ lock.Release(), }) ``` In this case `lock.Release()` will always be executed: either when all operations above succeed or when any of them fails ## Tests Before the change ``` andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! Error: terraform not initialized andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Error: deploy lock acquired by andrew.nester@databricks.com at 2023-05-10 16:41:22.902659 +0200 CEST. Use --force to override ``` After the change ``` andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! Error: terraform not initialized andrew.nester@HFW9Y94129 multiples-tasks % bricks bundle deploy Starting upload of bundle files Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/simple-task/development/files! Error: terraform not initialized ```	2023-05-16 18:01:50 +02:00
shreyas-goenka	9e16140b6e	Add git config block to bundle config (#356 ) ## Changes This config block contains commit, branch and remote_url which will be automatically loaded if specified in the repo, and can also be specified by the user ## Tests Unit and black-box tests	2023-04-26 16:54:36 +02:00
Serge Smertin	4c4a293015	Added OpenAPI command coverage (#357 ) This PR adds the following command groups: ## Workspace-level command groups * `bricks alerts` - The alerts API can be used to perform CRUD operations on alerts. * `bricks catalogs` - A catalog is the first layer of Unity Catalog’s three-level namespace. * `bricks cluster-policies` - Cluster policy limits the ability to configure clusters based on a set of rules. * `bricks clusters` - The Clusters API allows you to create, start, edit, list, terminate, and delete clusters. * `bricks current-user` - This API allows retrieving information about currently authenticated user or service principal. * `bricks dashboards` - In general, there is little need to modify dashboards using the API. * `bricks data-sources` - This API is provided to assist you in making new query objects. * `bricks experiments` - MLflow Experiment tracking. * `bricks external-locations` - An external location is an object that combines a cloud storage path with a storage credential that authorizes access to the cloud storage path. * `bricks functions` - Functions implement User-Defined Functions (UDFs) in Unity Catalog. * `bricks git-credentials` - Registers personal access token for Databricks to do operations on behalf of the user. * `bricks global-init-scripts` - The Global Init Scripts API enables Workspace administrators to configure global initialization scripts for their workspace. * `bricks grants` - In Unity Catalog, data is secure by default. * `bricks groups` - Groups simplify identity management, making it easier to assign access to Databricks Workspace, data, and other securable objects. * `bricks instance-pools` - Instance Pools API are used to create, edit, delete and list instance pools by using ready-to-use cloud instances which reduces a cluster start and auto-scaling times. 
* `bricks instance-profiles` - The Instance Profiles API allows admins to add, list, and remove instance profiles that users can launch clusters with. * `bricks ip-access-lists` - IP Access List enables admins to configure IP access lists. * `bricks jobs` - The Jobs API allows you to create, edit, and delete jobs. * `bricks libraries` - The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster. * `bricks metastores` - A metastore is the top-level container of objects in Unity Catalog. * `bricks model-registry` - MLflow Model Registry commands. * `bricks permissions` - Permissions API are used to create read, write, edit, update and manage access for various users on different objects and endpoints. * `bricks pipelines` - The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines. * `bricks policy-families` - View available policy families. * `bricks providers` - Databricks Providers REST API. * `bricks queries` - These endpoints are used for CRUD operations on query definitions. * `bricks query-history` - Access the history of queries through SQL warehouses. * `bricks recipient-activation` - Databricks Recipient Activation REST API. * `bricks recipients` - Databricks Recipients REST API. * `bricks repos` - The Repos API allows users to manage their git repos. * `bricks schemas` - A schema (also called a database) is the second layer of Unity Catalog’s three-level namespace. * `bricks secrets` - The Secrets API allows you to manage secrets, secret scopes, and access permissions. * `bricks service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. * `bricks serving-endpoints` - The Serving Endpoints API allows you to create, update, and delete model serving endpoints. * `bricks shares` - Databricks Shares REST API. 
* `bricks storage-credentials` - A storage credential represents an authentication and authorization mechanism for accessing data stored on your cloud tenant. * `bricks table-constraints` - Primary key and foreign key constraints encode relationships between fields in tables. * `bricks tables` - A table resides in the third layer of Unity Catalog’s three-level namespace. * `bricks token-management` - Enables administrators to get all tokens and delete tokens for other users. * `bricks tokens` - The Token API allows you to create, list, and revoke tokens that can be used to authenticate and access Databricks REST APIs. * `bricks users` - User identities recognized by Databricks and represented by email addresses. * `bricks volumes` - Volumes are a Unity Catalog (UC) capability for accessing, storing, governing, organizing and processing files. * `bricks warehouses` - A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL. * `bricks workspace` - The Workspace API allows you to list, import, export, and delete notebooks and folders. * `bricks workspace-conf` - This API allows updating known workspace settings for advanced users. ## Account-level command groups * `bricks account billable-usage` - This API allows you to download billable usage logs for the specified account and date range. * `bricks account budgets` - These APIs manage budget configuration including notifications for exceeding a budget for a period. * `bricks account credentials` - These APIs manage credential configurations for this workspace. * `bricks account custom-app-integration` - These APIs enable administrators to manage custom oauth app integrations, which is required for adding/using Custom OAuth App Integration like Tableau Cloud for Databricks in AWS cloud. * `bricks account encryption-keys` - These APIs manage encryption key configurations for this workspace (optional). 
* `bricks account groups` - Groups simplify identity management, making it easier to assign access to Databricks Account, data, and other securable objects. * `bricks account ip-access-lists` - The Accounts IP Access List API enables account admins to configure IP access lists for access to the account console. * `bricks account log-delivery` - These APIs manage log delivery configurations for this account. * `bricks account metastore-assignments` - These APIs manage metastore assignments to a workspace. * `bricks account metastores` - These APIs manage Unity Catalog metastores for an account. * `bricks account networks` - These APIs manage network configurations for customer-managed VPCs (optional). * `bricks account o-auth-enrollment` - These APIs enable administrators to enroll OAuth for their accounts, which is required for adding/using any OAuth published/custom application integration. * `bricks account private-access` - These APIs manage private access settings for this account. * `bricks account published-app-integration` - These APIs enable administrators to manage published oauth app integrations, which is required for adding/using Published OAuth App Integration like Tableau Cloud for Databricks in AWS cloud. * `bricks account service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. * `bricks account storage` - These APIs manage storage configurations for this workspace. * `bricks account storage-credentials` - These APIs manage storage credentials for a particular metastore. * `bricks account users` - User identities recognized by Databricks and represented by email addresses. * `bricks account vpc-endpoints` - These APIs manage VPC endpoint configurations for this account. * `bricks account workspace-assignment` - The Workspace Permission Assignment API allows you to manage workspace permissions for principals in your account. 
* `bricks account workspaces` - These APIs manage workspaces for this account.	2023-04-26 13:06:16 +02:00
shreyas-goenka	43bc9a0d9d	Use cmdio logger to log bricks cmd execution errors (#348 ) ## Changes Uses the cmdio logger to log the execution error ## Tests Manually by making the root command return fake errors. Here is the output: ``` shreyas.goenka@THW32HFW6T bricks % bricks bundle validate Error: my foo error ``` ``` shreyas.goenka@THW32HFW6T bricks % bricks bundle validate --progress-format=json { "error": "my foo error" } ``` --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2023-04-24 12:11:52 +02:00
Serge Smertin	9581187c9e	Update to Go SDK v0.8.0 (#351 ) ## Changes - Update to Go SDK v0.8.0 - Fix all breaking changes ## Tests - make test	2023-04-21 10:30:20 +02:00
shreyas-goenka	ddc0237468	Error out if question prompts are used in json mode (#340 ) ## Changes This PR disallows questions in json mode ## Tests Manually and unit test ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle destroy --progress-format=json The following resources will be removed: { "resource_type": "databricks_job", "action": "delete", "resource_name": "foo" } Error: question prompts are not supported in json mode ```	2023-04-18 17:13:49 +02:00
shreyas-goenka	598ad62688	Log mutator messages using progress logger (#312 ) This PR uses progress logger to log messages inside mutators	2023-04-18 16:55:06 +02:00
shreyas-goenka	85889dffb1	Move state to event for whether they support inplace progress logging (#339 ) ## Changes Adds an IsInplaceSupported() function to the event interface. Any event that uses the progress logger now has to declare whether it supports in-place logging ## Tests Manually	2023-04-18 14:20:35 +02:00
shreyas-goenka	b9c68b4bd5	Fix wrap around issues with inplace logging (#334 ) ## Changes We deal with wraparounds for long lines of text in a bad way. This PR fixes that by saving the cursor position ## Tests Manually	2023-04-14 13:06:04 +02:00
shreyas-goenka	bd11da88eb	Do not fail snapshot destroy if snapshot does not exist (#328 ) ## Changes `bricks bundle destroy` would fail if the sync snapshot did not exist ## Tests Manually After: ``` shreyas.goenka@THW32HFW6T bundle-destroy % bricks bundle destroy --auto-approve No resources to destroy! Remote directory /Users/shreyas.goenka@databricks.com/.bundle/destroy/default will be deleted Successfully deleted files! ``` Before: ``` shreyas.goenka@THW32HFW6T bundle-destroy % bricks bundle destroy --auto-approve No resources to destroy! Remote directory /Users/shreyas.goenka@databricks.com/.bundle/destroy/default will be deleted Error: failed to destroy sync snapshot file: remove /Users/shreyas.goenka/projects/bundle-destroy/.databricks/bundle/default/sync-snapshots/a5bd1966cb8980a9.json: no such file or directory ```	2023-04-12 21:37:01 +02:00
shreyas-goenka	42cd405eba	Add tests for fileSet adding `databricks` to .gitignore (#325 ) ## Changes These are flows that were earlier only being tested in package `project`. Since package `project` has been deleted in https://github.com/databricks/bricks/pull/321, we needed to add coverage as done here	2023-04-12 12:04:10 +02:00
Miles Yucht	946906221d	Delete sync snapshots file when destroying a bundle (#323 ) ## Changes This PR changes the files.Delete() mutator to delete the sync snapshots file on destroy. This ensures that files will be uploaded when the bundle is uploaded again. ## Tests - [x] Manual test: Ran `bricks bundle destroy`, observed that the sync snapshots file was deleted.	2023-04-11 16:57:01 +02:00
shreyas-goenka	4871f7bc8a	Add bundle destroy command (#300 ) Adds bundle destroy capability to bricks	2023-04-06 12:54:58 +02:00
Pieter Noordhuis	33645ae6ef	Revert "Configure log level to info by default (#267 )" (#307 ) ## Changes This reverts commit `e7a7e5b95a`. Job and pipeline runs print progress information now. No need to continue to rely on logging for this. ## Tests	2023-04-05 15:37:09 +02:00
Serge Smertin	02d9f877b5	Make `bricks auth` use `all-apis` scope (#304 ) ## Changes Use `all-apis` scope, so that we can use the issued token for SCIM APIs. The production environment has to be tuned in order to enable `all-apis` scope for a specific account. ## Tests Manual	2023-04-05 10:18:13 +02:00
shreyas-goenka	902813a490	Hardcode `.databricks` ignore pattern to ensure we never sync the cache directory (#295 ) ## Changes 1. Add pattern to always ignore .databricks 2. Best effort creation of .gitignore with .databricks if it's needed	2023-04-04 15:44:57 +02:00
dependabot[bot]	57cf66d3a8	Bump github.com/databricks/databricks-sdk-go from 0.5.0 to 0.6.0 (#299 )	2023-04-03 21:33:21 +02:00
Pieter Noordhuis	cfd32c9602	Try to resolve a profile if only the host is specified (#287 ) ## Changes This improves out of the box usability where a user who already configured a `.databrickscfg` file will be able to reference the workspace host in their `bundle.yml` and it will automatically pick up the right profile. ## Tests * Newly added tests pass. * Manual testing confirms intended behavior. --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2023-03-29 20:44:19 +02:00
Pieter Noordhuis	8af934bbbb	Function to find the Git repository containing a bundle (#289 ) ## Changes Useful functions from #277. ## Tests Tests pass.	2023-03-29 16:36:35 +02:00
shreyas-goenka	8fd3dccca9	Add progress logs for job runs (#276 )	2023-03-29 14:58:09 +02:00
Pieter Noordhuis	1b47dd3af7	Trim log source field to basename of file (#273 ) This makes logs more readable and avoids leaking paths. Before: ``` time=2023-03-22T16:38:30.238+01:00 level=INFO source=/Users/pieter.noordhuis/dev/bricks/bundle/phases/phase.go:30 msg="Phase: initialize" time=2023-03-22T16:38:31.303+01:00 level=INFO source=/Users/pieter.noordhuis/dev/bricks/bundle/phases/phase.go:30 msg="Phase: build" time=2023-03-22T16:38:31.303+01:00 level=INFO source=/Users/pieter.noordhuis/dev/bricks/bundle/phases/phase.go:30 msg="Phase: deploy" ``` After: ``` time=2023-03-22T17:02:47.290+01:00 level=INFO source=phase.go:30 msg="Phase: initialize" time=2023-03-22T17:02:48.171+01:00 level=INFO source=phase.go:30 msg="Phase: build" time=2023-03-22T17:02:48.171+01:00 level=INFO source=phase.go:30 msg="Phase: deploy" ```	2023-03-23 08:56:39 +01:00
Pieter Noordhuis	123a5e15e9	Acquire lock prior to deploy (#270 ) Add configuration: ``` bundle: lock: enabled: true force: false ``` The force field can be set by passing the `--force` argument to `bricks bundle deploy`. Doing so means the deployment lock is acquired even if it is currently held. This should only be used in exceptional cases (e.g. a previous deployment has failed to release the lock).	2023-03-22 16:37:26 +01:00
shreyas-goenka	75d516939b	Error out if notebook file does not exist locally (#261 ) Adds check for whether file exists locally case 1: local (relative) file does not exist ``` foo: name: "[job-output] test-job by shreyas" tasks: - task_key: my_notebook_task existing_cluster_id: * notebook_task: notebook_path: "./doesnotexist" ``` output: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy Error: notebook ./doesnotexist not found. Error: open /Users/shreyas.goenka/projects/job-output/doesnotexist: no such file or directory ``` case 2: remote (absolute) file does not exist ``` foo: name: "[job-output] test-job by shreyas" tasks: - task_key: my_notebook_task existing_cluster_id: * notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/doesnotexist" ``` output: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo Error: failed to reach TERMINATED or SKIPPED, got INTERNAL_ERROR: Task my_notebook_task failed with message: Notebook not found: /Users/shreyas.goenka@databricks.com/doesnotexist. This caused all downstream tasks to get skipped. ``` case 3: remote exists Successful deploy and run	2023-03-21 18:13:16 +01:00
Pieter Noordhuis	7dcc0d4b41	Fix test (#268 ) Follow up to #267.	2023-03-21 16:34:16 +01:00
Pieter Noordhuis	e7a7e5b95a	Configure log level to info by default (#267 ) Note: we log at INFO level by default until we implement progress reporting to stdout/stderr.	2023-03-21 16:14:20 +01:00
shreyas-goenka	ae09eb02d5	Path escape file path in filer interface (#254 )	2023-03-17 17:42:35 +01:00
Pieter Noordhuis	ad666ff796	Use new logger throughout codebase (#256 )	2023-03-17 15:17:31 +01:00
Pieter Noordhuis	c9340d6317	Drain sync event channel before returning (#253 ) Not waiting means the last few events may or may not be printed. This is relevant in the mode where sync runs once and then terminates.	2023-03-16 17:48:17 +01:00
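The draining behavior described in the commit above can be sketched as follows. This is a minimal illustration rather than the CLI's actual code: `runOnce`, the string-typed events, and the buffer size are hypothetical. The producer closes the channel and then waits for the consumer to finish, so events still queued in the buffer are not lost when the function returns.

```go
package main

import "fmt"

// runOnce is a hypothetical stand-in for a one-shot sync: it emits events on
// a buffered channel while a separate goroutine consumes them.
func runOnce(events []string) []string {
	ch := make(chan string, 8) // buffered, so events may still be queued at the end
	done := make(chan struct{})
	var seen []string

	// Consumer: records events until the channel is closed and empty.
	go func() {
		defer close(done)
		for e := range ch {
			seen = append(seen, e)
		}
	}()

	for _, e := range events {
		ch <- e
	}

	// Close the channel and wait for the consumer to drain it before
	// returning. Without the wait, the last few events may never be handled.
	close(ch)
	<-done
	return seen
}

func main() {
	for _, e := range runOnce([]string{"Uploaded foo.py", "Uploaded bar.py", "Complete"}) {
		fmt.Println(e)
	}
}
```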
Pieter Noordhuis	32a29c6af4	Add structured logging infrastructure (#246 ) New global flags: * `--log-file FILE`: can be literal `stdout`, `stderr`, or a file name (default `stderr`) * `--log-level LEVEL`: can be `error`, `warn`, `info`, `debug`, `trace`, or `disabled` (default `disabled`) * `--log-format TYPE`: can be `text` or `json` (default `text`) New functions in the `log` package take a `context.Context` and retrieve the logger from said context. Because we carry the logger in a context, adding [attributes](https://pkg.go.dev/golang.org/x/exp/slog#hdr-Attrs_and_Values) to the logger can be done as follows: ```go ctx = log.NewContext(ctx, log.GetLogger(ctx).With("foo", "bar")) ```	2023-03-16 14:46:53 +01:00
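For readers unfamiliar with the pattern, here is a rough sketch of how helpers like `log.NewContext` and `log.GetLogger` might be implemented. It is an assumption-laden illustration, not the CLI's `log` package: it uses the standard library `log/slog` (Go 1.21+) instead of `golang.org/x/exp/slog`, and `loggerKey` and `demo` are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
)

// loggerKey is an unexported context key type (hypothetical).
type loggerKey struct{}

// NewContext returns a copy of ctx that carries the given logger.
func NewContext(ctx context.Context, logger *slog.Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, logger)
}

// GetLogger returns the logger stored in ctx, or the default logger if
// the context does not carry one.
func GetLogger(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(loggerKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.Default()
}

// demo logs one line through a context-carried logger and returns the output.
func demo() string {
	var buf bytes.Buffer
	ctx := NewContext(context.Background(), slog.New(slog.NewTextHandler(&buf, nil)))

	// Attach an attribute by replacing the logger stored in the context.
	ctx = NewContext(ctx, GetLogger(ctx).With("foo", "bar"))

	GetLogger(ctx).Info("uploading")
	return buf.String()
}

func main() {
	fmt.Print(demo())
}
```

Every call that receives the derived `ctx` logs with `foo=bar` attached, without the attribute having to be passed explicitly.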
shreyas-goenka	715a4dfb21	Path escape filepaths in the URL (#250 ) Before, we were using URL query escaping to escape the file path. This is wrong because the file path is part of the URL path rather than the URL query. The two encoding schemes are similar but not identical, which is why we got these weird edge cases. Fixed, and added a nightly test that asserts this. ``` 2023/03/15 16:07:50 [INFO] Action: PUT: .gitignore, a b/bar.py, c+d/uno.py, foo.py 2023/03/15 16:07:51 [INFO] Uploaded foo.py 2023/03/15 16:07:51 [INFO] Uploaded a b/bar.py 2023/03/15 16:07:51 [INFO] Uploaded .gitignore 2023/03/15 16:07:51 [INFO] Uploaded c+d/uno.py 2023/03/15 16:07:51 [INFO] Initial Sync Complete ``` ``` [VSCODE] bricks cli path: /Users/shreyas.goenka/.vscode/extensions/databricks.databricks-0.3.4-darwin-arm64/bin/bricks [VSCODE] sync command args: sync,.,/Repos/shreyas.goenka@databricks.com/sync-fail.ide,--watch,--output,json -------------------------------------------------------- Starting synchronization (4 files) Uploaded .gitignore Uploaded foo.py Uploaded c+d/uno.py Uploaded a b/bar.py Completed synchronization ```	2023-03-15 17:25:57 +01:00
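The difference between the two escaping schemes mentioned in the commit above can be seen with Go's `net/url` package. This is a small sketch, not the filer's code: `escapeBoth` is a hypothetical helper, and it escapes a single path segment at a time, since `url.PathEscape` also escapes `/`.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeBoth returns the path-escaped and query-escaped forms of one
// path segment.
func escapeBoth(segment string) (path, query string) {
	return url.PathEscape(segment), url.QueryEscape(segment)
}

func main() {
	// Directory and file names taken from the sync logs in the commits.
	for _, name := range []string{"a b", "c+d", "#foo.py"} {
		p, q := escapeBoth(name)
		fmt.Printf("name=%q path=%q query=%q\n", name, p, q)
	}
}
```

Query escaping turns the space in `a b` into `+`, while path escaping produces `a%20b` and leaves a literal `+` in `c+d` untouched. A `+` inside a URL path is normally decoded as a plus sign rather than a space, which is consistent with the edge cases the commit describes.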
shreyas-goenka	316a006125	Add check for file exists in case of conflicting remote names (#244 ) Before: ``` shreyas.goenka@THW32HFW6T deco-538-pipeline-error % bricks bundle deploy Error: both myNb.py and myNb.sql point to the same remote file location myNb. Please remove one of them from your local project ``` Even though myNb.sql was created by renaming myNb.py. Now deployments are successful.	2023-03-10 11:52:45 +01:00
Pieter Noordhuis	fe738ede6a	Let sync return early if an error occurs (#235 ) The previous approach would proceed to execute all requests prior to returning the first error. This is solved with `errgroup.WithContext` that cancels the context if a routine returns an error.	2023-03-09 13:29:05 +01:00
Fabian Jakobs	f0c35a2b27	Initialize BRICKS_CLI_PATH and increase default OAuth timeout (#237 ) related to https://github.com/databricks/databricks-sdk-go/pull/330	2023-03-08 16:14:24 +01:00
Pieter Noordhuis	65b3f998ba	Escape URL in filer (#236 ) Also see #228.	2023-03-08 14:27:05 +01:00
Fabian Jakobs	da4b58a897	Fix link to workspace after AWS OAuth login (#234 ) `Host` is already normalized and always has the `https://` prefix.	2023-03-08 11:56:46 +01:00
Pieter Noordhuis	e872b587cc	Add optional JSON output for sync command (#230 ) JSON output makes it easy to process synchronization progress information in downstream tools (e.g. the vscode extension). This change introduces a `sync.Event` interface type for progress events as well as a `sync.EventNotifier` that lets the sync code pass along progress events to calling code. Example output in text mode (default, this uses the existing logger calls): ```text 2023/03/03 14:07:17 [INFO] Remote file sync location: /Repos/pieter.noordhuis@databricks.com/... 2023/03/03 14:07:18 [INFO] Initial Sync Complete 2023/03/03 14:07:22 [INFO] Action: PUT: foo 2023/03/03 14:07:23 [INFO] Uploaded foo 2023/03/03 14:07:23 [INFO] Complete 2023/03/03 14:07:25 [INFO] Action: DELETE: foo 2023/03/03 14:07:25 [INFO] Deleted foo 2023/03/03 14:07:25 [INFO] Complete ``` Example output in JSON mode: ```json {"timestamp":"2023-03-03T14:08:15.459439+01:00","seq":0,"type":"start"} {"timestamp":"2023-03-03T14:08:15.459461+01:00","seq":0,"type":"complete"} {"timestamp":"2023-03-03T14:08:18.459821+01:00","seq":1,"type":"start","put":["foo"]} {"timestamp":"2023-03-03T14:08:18.459867+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":0} {"timestamp":"2023-03-03T14:08:19.418696+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":1} {"timestamp":"2023-03-03T14:08:19.421397+01:00","seq":1,"type":"complete","put":["foo"]} {"timestamp":"2023-03-03T14:08:22.459238+01:00","seq":2,"type":"start","delete":["foo"]} {"timestamp":"2023-03-03T14:08:22.459268+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":0} {"timestamp":"2023-03-03T14:08:22.686413+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":1} {"timestamp":"2023-03-03T14:08:22.688989+01:00","seq":2,"type":"complete","delete":["foo"]} ``` --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2023-03-08 10:27:19 +01:00
shreyas-goenka	5166055efb	[DECO-553] Escape file path strings in URL (#228 ) Tested manually Before: ``` shreyas.goenka@THW32HFW6T test-dbx % bricks sync --full . /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:17 [INFO] Remote file sync location: /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:17 [INFO] Action: PUT: #foo.py, .gitignore 2023/02/27 19:51:19 [INFO] Uploaded .gitignore Error: Creating file failed. An item with path /Repos/shreyas.goenka@databricks.com/test-dbx already exists ``` After: ``` shreyas.goenka@THW32HFW6T test-dbx % bricks sync --full . /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:46 [INFO] Remote file sync location: /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:46 [INFO] Action: PUT: #foo.py, .gitignore 2023/02/27 19:51:47 [INFO] Uploaded .gitignore 2023/02/27 19:51:47 [INFO] Uploaded #foo.py ```	2023-02-28 03:17:13 +01:00
shreyas-goenka	2615d66945	[DECO-531] Increase timeout for file import api calls (#223 ) This PR increases the client side timeout for upload API calls to 10 minutes to give sync enough time to import larger files	2023-02-22 16:01:58 +01:00
Pieter Noordhuis	9d3a0da073	Detect Jupyter notebook files (#219 ) Files with extension `.ipynb` are imported as Jupyter notebooks. This code detects 1) if the file is a valid Jupyter notebook and 2) the Databricks specific language it contains.	2023-02-21 13:49:01 +01:00
Pieter Noordhuis	7398a6d1e4	Add sample ipynb files (#218 ) Co-authored-by: pietern <pietern>	2023-02-20 20:03:20 +01:00
Pieter Noordhuis	414ea4f891	Bump databricks-sdk-go to 0.3.2 (#215 )	2023-02-20 16:00:20 +01:00
Pieter Noordhuis	584c8d1b0b	Allow synchronization to a directory inside a repo (#213 ) Before this commit this would error saying that the repo doesn't exist yet. With this commit it creates the directory, but only after checking that the repo exists.	2023-02-20 14:34:48 +01:00
Pieter Noordhuis	1715a987cf	Make sync command work in bundle context; reorder args (#207 ) Invoke with `bricks sync SRC DST`. In bundle context `SRC` and `DST` arguments are taken from bundle configuration. This PR adds `bricks bundle sync` to disambiguate between the two. Once the VS Code extension is bundle aware they can again be consolidated. Consolidating them today would regress the VS Code experience if a `bundle.yml` file is present in the file tree.	2023-02-20 11:33:30 +01:00
Pieter Noordhuis	58950ce507	Move notebook detection logic to package (#206 )	2023-02-15 17:14:59 +01:00
Fabian Jakobs	8c1b620b17	Don't sync symlink folders (#205 ) Fixes https://github.com/databricks/databricks-vscode/issues/452	2023-02-15 17:02:54 +01:00
Pieter Noordhuis	abb1de99ba	Locate and use global excludes file (#191 ) This implements rudimentary gitconfig loading as specified at https://git-scm.com/docs/git-config.	2023-02-02 12:25:53 +01:00
Pieter Noordhuis	241562e2b1	Move git package to libs/git (#189 ) Fixes #185.	2023-01-31 19:19:16 +01:00
Pieter Noordhuis	a7bf7ba6c5	Reload .gitignore files if they have changed (#190 ) This commit changes the code in repository.go to lazily load gitignore files as opposed to the previous eager approach. This means that the signature of the `Ignore` function family has changed to return `(bool, error)`. This lazy approach fits better when other code is responsible for recursively walking the file tree, because we never know up front which gitignore files need to be loaded to compute the ignores. It also means we no longer have to "prime" the `Repository` instance with a particular directory we're interested in and rather let calls to `Ignore` load whatever is needed. The fileset wrapper under `git/` internally taints all gitignore objects to force a call to [os.Stat] followed by a reload if they have changed, before calling into the [fileset.FileSet] functions for recursively listing files.	2023-01-31 18:34:36 +01:00
Pieter Noordhuis	eb76e5d3e8	Move git.FileSet to libs/fileset and make it aware of gitignores (#184 ) This moves `git.FileSet` to `libs/fileset` and decouples it from the Git package. It is made aware of gitignore rules in parent directories up to the repository root as well as gitignore files in underlying directories through the `fileset.Ignorer` interface. The recursive directory walker is reimplemented with [filepath.WalkDir]. Follow up to #182.	2023-01-27 16:04:58 +01:00
Pieter Noordhuis	03c863f49b	Update sync defaults (#177 ) By default the command runs an incremental, one-time sync, similar to the behavior of rsync. The `--persist-snapshot` flag has been removed and the command now always saves a synchronization snapshot. * Add `--full` flag to force full synchronization * Add `--watch` flag to run continuously and watch the local file system for changes This builds on #176.	2023-01-24 15:06:59 +01:00
Pieter Noordhuis	077304ffa1	Move path checking logic for sync command to libs/sync (#176 ) This change also adds testcases for checking if the specified path is nested under the valid base paths and fixes an edge case where the user could synchronize into their home directory directly. Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2023-01-24 13:58:10 +01:00
Pieter Noordhuis	c777a703cf	Move diff struct to its own file (#175 )	2023-01-24 11:06:14 +01:00
Pieter Noordhuis	015a2bf9bb	Remove dependency on project package in libs/sync (#174 ) The code depended on the project package for: * git.FileSet in the watchdog * project.CacheDir to determine snapshot path These dependencies are now denormalized in the SyncOptions struct. Follow up for #173.	2023-01-24 08:30:10 +01:00
Pieter Noordhuis	fc46d21f8b	Move sync logic from cmd/sync to libs/sync (#173 ) Mechanical change. Ported global variables the logic relied on to a new `sync.Sync` struct.	2023-01-23 13:52:39 +01:00
shreyas-goenka	0d9ecb5643	Refactor and cover edge cases in sync integration tests (#160 ) This PR: 1. Refactors the sync integration tests to make them more readable 2. Adds additional tests for edge cases we encountered during vscode runs 3. Intentional side effect: sync integration tests are also green on windows (see https://github.com/databricks/eng-dev-ecosystem/actions/runs/3817365642/jobs/6493576727) Change in coverage - We now test for python notebook <-> python file interconversion and python notebook deletion being synced to workspace - Tests are split up and are more focused on testing specific edge cases	2023-01-10 13:16:30 +01:00
Serge Smertin	b87b4b0f40	Added `bricks auth login` and `bricks auth token` (#158 ) # Auth challenge (happy path) Simplified description of [PKCE](https://oauth.net/2/pkce/) implementation: ```mermaid sequenceDiagram autonumber actor User User ->> CLI: type `bricks auth login HOST` CLI ->>+ HOST: request OIDC endpoints HOST ->>- CLI: auth & token endpoints CLI ->> CLI: start embedded server to consume redirects (lock) CLI -->>+ Auth Endpoint: open browser with RND1 + SHA256(RND2) User ->>+ Auth Endpoint: Go through SSO Auth Endpoint ->>- CLI: AUTH CODE + 'RND1 (redirect) CLI ->>+ Token Endpoint: Exchange: AUTH CODE + RND2 Token Endpoint ->>- CLI: Access Token (JWT) + refresh + expiry CLI ->> Token cache: Save Access Token (JWT) + refresh + expiry CLI ->> User: success ``` # Token refresh (happy path) ```mermaid sequenceDiagram autonumber actor User User ->> CLI: type `bricks token HOST` CLI ->> CLI: acquire lock (same local addr as redirect server) CLI ->>+ Token cache: read token critical token not expired Token cache ->>- User: JWT (without refresh) option token is expired CLI ->>+ HOST: request OIDC endpoints HOST ->>- CLI: auth & token endpoints CLI ->>+ Token Endpoint: refresh token Token Endpoint ->>- CLI: JWT (refreshed) CLI ->> Token cache: save JWT (refreshed) CLI ->> User: JWT (refreshed) option no auth for host CLI -X User: no auth configured end ```	2023-01-06 16:15:57 +01:00
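The `SHA256(RND2)` step in the first diagram above corresponds to the PKCE code challenge. The sketch below shows how a verifier and its `S256` challenge can be derived per RFC 7636 using only the standard library. It is illustrative and not the CLI's implementation: `newVerifier` and `challengeS256` are hypothetical names, and the 32-byte verifier length is an assumption.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// newVerifier returns a random code verifier (RND2 in the diagram),
// encoded as unpadded base64url.
func newVerifier() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// challengeS256 derives the code challenge sent with the authorization
// request: BASE64URL(SHA256(verifier)) without padding (RFC 7636, S256).
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

func main() {
	verifier, err := newVerifier()
	if err != nil {
		panic(err)
	}
	fmt.Println("verifier: ", verifier)
	fmt.Println("challenge:", challengeS256(verifier))
}
```

The client sends only the challenge when opening the browser and reveals the verifier later, when exchanging the authorization code at the token endpoint, so an intercepted code cannot be redeemed on its own.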
Pieter Noordhuis	a59136f77f	Use []byte for files in workspace (#162 )	2023-01-05 12:03:31 +01:00
Pieter Noordhuis	32a37c1b83	Use filer.Filer in bundle/deployer/locker (#136 ) Summary: * All remote path arguments for deployer and locker are now relative to root specified at initialization * The workspace client is now a struct field so it doesn't have to be passed around	2022-12-15 17:16:07 +01:00
Pieter Noordhuis	4e834857e6	Extract filer path handling into separate type (#138 ) This makes it reusable for the DBFS filer.	2022-12-14 23:41:37 +01:00
Pieter Noordhuis	12aae35519	Abstract over file handling with WSFS or DBFS through filer interface (#135 )	2022-12-14 15:37:14 +01:00

347 Commits