Commit Graph

50 Commits

Author SHA1 Message Date
Pieter Noordhuis 16bb224108
Add directory tracking to sync (#425)
## Changes

This change replaces usage of the `repofiles` package with the `filer`
package to consolidate WSFS code paths.

The `repofiles` package implemented the following behavior. If a file at
`foo/bar.txt` was created and removed, the directory `foo` was kept
around because we do not perform directory tracking. If subsequently, a
file at `foo` was created, it resulted in an `fs.ErrExist` because it is
impossible to overwrite a directory. It would then perform a recursive
delete of the path if this happened and retry the file write.

To make this use case work without resorting to a recursive delete on
conflict, we need to implement directory tracking as part of sync. The
approach in this commit is as follows:

1. Maintain set of directories needed for current set of files. Compare
to previous set of files. This results in mkdir of added directories and
rmdir of removed directories.
2. Creation of new directories should happen prior to writing files.
Otherwise, many file writes may race to create the same parent
directories, resulting in additional API calls. Removal of existing
directories should happen after removing files.
3. Making new directories can be deduped across common prefixes where
only the longest prefix is created recursively.
4. Removing existing directories must happen sequentially, starting with
the longest prefix.
5. Removal of directories is a best effort. It fails only if the
directory is not empty, and if this happens we know something placed a
file or directory manually, outside of sync.

## Tests

* Existing integration tests pass (modified where it used to assert
directories weren't cleaned up)
* New integration test to confirm the inability to remove a directory
doesn't fail the sync run
2023-06-12 11:44:00 +00:00
shreyas-goenka 4818541062
Add workspace export-dir command (#449)
## Changes
This PR:
1. Adds the export-dir command
2. Changes filer.Read to return an error if a user tries to read a
directory
3. Adds returning internal file structures from filer.Stat().Sys()

## Tests
Integration tests and manually
2023-06-08 18:15:12 +02:00
Pieter Noordhuis be10ff9a75
Include recursive deletion in filer interface (#442)
## Changes

This captures the recursive deletion of a directory tree in the filer interface.

Prompted by #433.

## Tests

Integration tests pass (ran the filer ones on AWS and Azure).
2023-06-06 06:27:47 +00:00
shreyas-goenka d6d35e314f
Add fs rm command for dbfs (#433)
## Changes
Please look at the title

## Tests
Integration tests
2023-06-05 23:21:47 +00:00
shreyas-goenka ae10419eb8
Add fs cat command for dbfs files (#430)
## Changes
TSIA

## Tests
Manually and integration tests
2023-06-06 01:16:23 +02:00
Andrew Nester df3f5863c7
Do not generate prompts for certain commands (#438)
## Changes
Some of the commands do not support prompts, for example `workspace
get-status` but we were wrongly suggesting customers some option.

Quick fix for this is not to provide prompts for these known commands.

Note: it uses a method from this PR in Go SDK
https://github.com/databricks/databricks-sdk-go/pull/416

## Tests
Running `workspace get-status`

Before
```
andrew.nester@HFW9Y94129 multiples-tasks % ../../cli/cli workspace get-status
Error: Path () doesn't start with '/'
```

After
```
andrew.nester@HFW9Y94129 multiples-tasks % ../../cli/cli workspace get-status
Error: accepts 1 arg(s), received 0

```
2023-06-05 19:38:45 +02:00
shreyas-goenka 6ff00122ad
Add fs ls command for dbfs (#429)
## Changes
1. Adds fs ls command
2. Adds ability to define multiple templates

## Tests
Manually and integration tests
2023-06-05 17:41:30 +02:00
Pieter Noordhuis 28d36577a2
Speed up sync integration tests (#428)
## Changes

This change implements:
* Channels for line-by-line output from stdout/stderr
* A function to wait for a sync step to complete (using above)
* Ensure all tests are prefixed `TestAccSync`
* Use temporary paths in WSFS instead of cloning a repo

## Tests

The same integration tests now pass in ~90 seconds (was ~250s).
2023-06-02 14:02:18 +00:00
shreyas-goenka 91097856b5
Add check for path is a directory in filer.ReadDir (#426)
## Tests
Integration tests
2023-06-02 12:28:35 +02:00
Pieter Noordhuis 2b56af6016
Add Stat function to filer.Filer interface (#421)
## Changes

TSIA

## Tests

New integration test passes.
2023-06-01 20:23:22 +02:00
Pieter Noordhuis 7a4ca786d8
Use cmdio in version command for `--output` flag (#419)
## Changes

Use cmdio in the version command such that it accepts the `--output` flag.

This removes the existing `--detail` flag which previously made the
command print JSON output.

## Tests

New integration test passes.
2023-06-01 12:03:22 +02:00
Pieter Noordhuis 9ae86e3ae3
Fix locker integration test (#417)
## Changes

The failure was caused by swapping out the error types returned by the filer in #139.

## Tests

Integration tests pass again.
2023-06-01 09:38:03 +02:00
Pieter Noordhuis 349e2aff40
Allow equivalence checking of filer errors to fs errors (#416)
## Changes

The pattern `errors.Is(err, fs.ErrNotExist)` is common to check for an
error type.

Errors can implement `Is(error) bool` with a custom equivalence checker.

## Tests

New asserts all pass in the integration test.
2023-05-31 20:47:00 +02:00
Pieter Noordhuis 42cd8daee0
Make filer.Filer return fs.DirEntry from ReadDir (#415)
## Changes

This allows for compatibility with the stdlib functions in io/fs.

## Tests

Integration tests pass.
2023-05-31 14:22:26 +02:00
Pieter Noordhuis 27df4e765c
Implement DBFS filer (#139)
Adds a DBFS implementation of the `filer.Filer` interface.

The integration tests are reused between the workspace filesystem and
DBFS implementations to ensure identical behavior.
2023-05-31 13:24:20 +02:00
Pieter Noordhuis 92cb52041d
Add Mkdir and ReadDir functions to filer.Filer interface (#414)
## Changes

This cherry-picks the filer changes from #408.

## Tests

Manually ran integration tests.
2023-05-31 11:11:17 +02:00
Andrew Nester aed6450baf
Do not prompt for List methods (#411)
## Changes
Do not prompt for List methods

## Tests
Running
```
cli workspace list
```

Before

```
cli workspace list
Error: Path () doesn't start with '/'
```

After

```
cli workspace list
Error: accepts 1 arg(s), received 0

```
2023-05-26 16:02:53 +02:00
Andrew Nester 3c4d6f637f
Changed service template to correctly handle required positional arguments (#405) 2023-05-26 14:46:08 +02:00
Pieter Noordhuis 854444077f
Reduce parallellism in locker integration test (#407)
## Changes

This test is hitting request throttling. A lower number of parallel
routines acquiring it should be sufficient.

## Tests

n/a
2023-05-25 15:22:51 +02:00
Pieter Noordhuis d86a1f0847
Add version flag to print version and exit (#394)
## Changes

With this PR, all of the command below print version and exit:
```
$ databricks -v
Databricks CLI v0.100.1-dev+4d3fa76
$ databricks --version
Databricks CLI v0.100.1-dev+4d3fa76
$ databricks version  
Databricks CLI v0.100.1-dev+4d3fa76
```

## Tests

Added integration test for each flag or command.
2023-05-22 20:55:42 +02:00
Pieter Noordhuis 98ebb78c9b
Rename bricks -> databricks (#389)
## Changes

Rename all instances of "bricks" to "databricks".

## Tests

* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
2023-05-16 18:35:39 +02:00
Kartik Gupta 5fc7cd0adf
Fix api post integration tests (#371) 2023-05-01 11:10:02 +02:00
Serge Smertin 9581187c9e
Update to Go SDK v0.8.0 (#351)
## Changes

- Update to Go SDK v0.8.0
- Fix all breaking changes

## Tests

- make test
2023-04-21 10:30:20 +02:00
Pieter Noordhuis 123a5e15e9
Acquire lock prior to deploy (#270)
Add configuration:

```
bundle:
  lock:
    enabled: true
    force: false
```

The force field can be set by passing the `--force` argument to `bricks
bundle deploy`. Doing so means the deployment lock is acquired even if
it is currently held. This should only be used in exceptional cases
(e.g. a previous deployment has failed to release the lock).
2023-03-22 16:37:26 +01:00
shreyas-goenka 715a4dfb21
Path escape filepaths in the URL (#250)
Before we were using url query escaping to escape the file path. This is
wrong since the file path is a part of the URL path rather than URL
query. These encoding schemes are similar but do not have identical
encodings which was why we got these weird edge cases

Fixed, and added nightly test for assert for this

```
2023/03/15 16:07:50 [INFO] Action: PUT: .gitignore, a b/bar.py, c+d/uno.py, foo.py
2023/03/15 16:07:51 [INFO] Uploaded foo.py
2023/03/15 16:07:51 [INFO] Uploaded a b/bar.py
2023/03/15 16:07:51 [INFO] Uploaded .gitignore
2023/03/15 16:07:51 [INFO] Uploaded c+d/uno.py
2023/03/15 16:07:51 [INFO] Initial Sync Complete
```

```
[VSCODE] bricks cli path: /Users/shreyas.goenka/.vscode/extensions/databricks.databricks-0.3.4-darwin-arm64/bin/bricks
[VSCODE] sync command args: sync,.,/Repos/shreyas.goenka@databricks.com/sync-fail.ide,--watch,--output,json
--------------------------------------------------------
Starting synchronization (4 files)
Uploaded .gitignore
Uploaded foo.py
Uploaded c+d/uno.py
Uploaded a b/bar.py
Completed synchronization
```
2023-03-15 17:25:57 +01:00
Pieter Noordhuis 414ea4f891
Bump databricks-sdk-go to 0.3.2 (#215) 2023-02-20 16:00:20 +01:00
Pieter Noordhuis ca04a6a1dd
Fix sync test (miss in #207) (#216) 2023-02-20 15:41:37 +01:00
Pieter Noordhuis 584c8d1b0b
Allow synchronization to a directory inside a repo (#213)
Before this commit this would error saying that the repo doesn't exist yet.

With this commit it creates the directory, but only after checking that
the repo exists.
2023-02-20 14:34:48 +01:00
Pieter Noordhuis 1715a987cf
Make sync command work in bundle context; reorder args (#207)
Invoke with `bricks sync SRC DST`.

In bundle context `SRC` and `DST` arguments are taken from bundle configuration.

This PR adds `bricks bundle sync` to disambiguate between the two.
Once the VS Code extension is bundle aware they can again be consolidated.
Consolidating them today would regress the VS Code experience if a
`bundle.yml` file is present in the file tree.
2023-02-20 11:33:30 +01:00
Pieter Noordhuis 1c27f081e0
Include build information and add version command (#194)
Includes relevant fields listed on
https://goreleaser.com/customization/templates/ into build artifacts.

The version command outputs the version by default:
```
$ bricks version
0.0.21-devel
```

Or all build information if `--json` is specified:
```
$ bricks version --json
{
  "ProjectName": "bricks",
  "Version": "0.0.21-devel",
  "Branch": "version-info",
  "Tag": "v0.0.20",
  "ShortCommit": "193b56b",
  "FullCommit": "193b56b0929128c0836d35e913c46fd66fa2a93c",
  "CommitTime": "2023-02-02T22:04:42+01:00",
  "Summary": "v0.0.20-5-g193b56b",
  "Major": 0,
  "Minor": 0,
  "Patch": 20,
  "Prerelease": "",
  "IsSnapshot": true,
  "BuildTime": "2023-02-02T22:07:36+01:00"
}
```
2023-02-03 15:38:53 +01:00
Pieter Noordhuis 03c863f49b
Update sync defaults (#177)
By default the command runs an incremental, one-time sync, similar to the
behavior of rsync. The `--persist-snapshot` flag has been removed and the
command now always saves a synchronization snapshot.

* Add `--full` flag to force full synchronization
* Add `--watch` flag to run continuously and watch the local file system for changes

This builds on #176.
2023-01-24 15:06:59 +01:00
Pieter Noordhuis fc46d21f8b
Move sync logic from cmd/sync to libs/sync (#173)
Mechanical change. Ported global variables the logic relied on to a new
`sync.Sync` struct.
2023-01-23 13:52:39 +01:00
Kartik Gupta e48eb6ff50
Fix integration tests for windows (#165)
Successful run: 
Azure:
https://github.com/databricks/eng-dev-ecosystem/actions/runs/3892131225
AWS:
https://github.com/databricks/eng-dev-ecosystem/actions/runs/3892432562
2023-01-11 13:32:02 +01:00
shreyas-goenka bf9f0c06b0
Fix sync integration test failing on aws-prod (#164)
The tests were failing because we were the list of expected files did
not include .gitignore

aws and azure test runs:

https://github.com/databricks/eng-dev-ecosystem/actions/runs/3891837322/jobs/6642552850


https://github.com/databricks/eng-dev-ecosystem/actions/runs/3891836306/jobs/6642550633
2023-01-11 11:50:27 +01:00
shreyas-goenka 0d9ecb5643
Refactor and cover edge cases in sync integration tests (#160)
This PR:
1. Refactors the sync integration tests to make them more readable
2. Adds additional tests for edge cases we encountered during vscode
runs
3. Intensional side effect: sync integration tests are also green on
windows (see
https://github.com/databricks/eng-dev-ecosystem/actions/runs/3817365642/jobs/6493576727)

Change in coverage

- We now test for python notebook <-> python file interconversion and
python notebook deletion being synced to workspace
- Tests are split up and are more focused on testing specific edge cases
2023-01-10 13:16:30 +01:00
Pieter Noordhuis a59136f77f
Use []byte for files in workspace (#162) 2023-01-05 12:03:31 +01:00
Pieter Noordhuis 32a37c1b83
Use filer.Filer in bundle/deployer/locker (#136)
Summary:
* All remote path arguments for deployer and locker are now relative to
root specified at initialization
* The workspace client is now a struct field so it doesn't have to be
passed around
2022-12-15 17:16:07 +01:00
Pieter Noordhuis 12aae35519
Abstract over file handling with WSFS or DBFS through filer interface (#135) 2022-12-14 15:37:14 +01:00
shreyas-goenka b42768801d
[DECO-396] Post mortem followups on sync deletes repo (#119)
This PR:
- Implements safeguards for not accidentally/maliciously deleting repos
by sanitizing relative paths
- Adds versioning for snapshot schemas to allow invalidation if needed
- Adds logic to delete preexisting remote artifacts that might not have
been cleaned up properly if they conflict with an upload
- A bunch of tests for the changes here

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2022-12-12 14:31:06 +01:00
Pieter Noordhuis cb16ad1184
Add RequireSuccessfulRun helper (#125) 2022-12-09 15:41:04 +01:00
Pieter Noordhuis 94f884f0a7
Run bricks sync in-process instead of through go run (#123) 2022-12-09 11:47:06 +01:00
shreyas-goenka d9d295f2a9
Implement Terraform state synchronization and deploy (#98)
https://user-images.githubusercontent.com/88374338/203669797-abebf99e-8fa6-4d6e-b57a-abd172d8020d.mov
2022-12-06 00:40:45 +01:00
Serge Smertin 487bf6fd5c
Use Databricks Go SDK v0.1.0 (#110)
This PR pins the version of Databricks SDK for Go to v0.1.0
2022-12-01 12:17:36 +01:00
Pieter Noordhuis 8e786d76a9
Update databricks-sdk-go to latest (#102) 2022-11-24 21:41:57 +01:00
shreyas-goenka d587531cbd
Added creation of .gitignore for bricks project with cache dir path (#88)
It works, please trust me
2022-11-08 13:51:08 +01:00
shreyas-goenka 18dae73505
[DECO-79][DECO-165] Incremental sync with support for multiple profiles (#82)
This PR does multiple things, which are:

1. Creates .databricks dir according to outcomes concluded in "bricks
configuration principles"
2. Puts the sync snapshots into a file whose names is tagged with
md5(concat(host, remote-path))
3. Saves both host and username in the bricks snapshot for debuggability

Tested manually:


https://user-images.githubusercontent.com/88374338/195672267-9dd90230-570f-49b7-847f-05a5a6fd8986.mov
2022-10-19 16:22:55 +02:00
Pieter Noordhuis 38a9dabcbe
Add command to make API calls (#80)
Not settled whether this should live as a top level command or hidden
under some debug scope. Either way, the ability to make arbitrary API
calls and leverage unified auth is a super useful tool.
2022-10-10 10:27:45 +02:00
shreyas-goenka ed56fc01cb
[DECO-68] Modify acceptance test to work with deco testing infra and represent vscode usecase (#78)
Contains changes to make this integration test work on our GitHub
actions testing env

1. use go run main.go to run bricks sync to run the latest bricks from
master
2. Log the output from the bricks sync process to allow for debugging
3. removed databricks.yml and instead rely on BRICKS_ROOT and other env
vars for auth and bricks sync
4. Added --persist-snapshot set to false to test full sync (same as is
used in the vscode extension

<img width="898" alt="Screenshot 2022-09-27 at 4 26 18 PM"
src="https://user-images.githubusercontent.com/88374338/192553769-7af08ca0-b73a-4cf6-a214-8c58edc4c3e5.png">

The additional logs in the picture above are from a wip PR in deco cli
that I made some changes to in order to make deco cli work with bricks :
https://github.com/databricks/eng-dev-ecosystem/pull/97
2022-10-05 13:28:53 +02:00
Pieter Noordhuis a1b6fdb2e8
Update SDK (#79) 2022-09-27 09:58:55 -07:00
shreyas-goenka f9b66b3536
Make `bricks sync` feature work (#48)
Tested manually and partially by unit tests
2022-09-14 17:50:29 +02:00