## Changes
Rename all instances of "bricks" to "databricks".
## Tests
* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
## Changes
These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify.
**Note:** this is a breaking change. Downstream usage of these fields must be updated.
## Tests
Existing tests pass.
JSON output makes it easy to process synchronization progress
information in downstream tools (e.g. the vscode extension).
This changes introduces a `sync.Event` interface type for progress events as
well as an `sync.EventNotifier` that lets the sync code pass along
progress events to calling code.
Example output in text mode (default, this uses the existing logger calls):
```text
2023/03/03 14:07:17 [INFO] Remote file sync location: /Repos/pieter.noordhuis@databricks.com/...
2023/03/03 14:07:18 [INFO] Initial Sync Complete
2023/03/03 14:07:22 [INFO] Action: PUT: foo
2023/03/03 14:07:23 [INFO] Uploaded foo
2023/03/03 14:07:23 [INFO] Complete
2023/03/03 14:07:25 [INFO] Action: DELETE: foo
2023/03/03 14:07:25 [INFO] Deleted foo
2023/03/03 14:07:25 [INFO] Complete
```
Example output in JSON mode:
```json
{"timestamp":"2023-03-03T14:08:15.459439+01:00","seq":0,"type":"start"}
{"timestamp":"2023-03-03T14:08:15.459461+01:00","seq":0,"type":"complete"}
{"timestamp":"2023-03-03T14:08:18.459821+01:00","seq":1,"type":"start","put":["foo"]}
{"timestamp":"2023-03-03T14:08:18.459867+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:19.418696+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:19.421397+01:00","seq":1,"type":"complete","put":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459238+01:00","seq":2,"type":"start","delete":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459268+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:22.686413+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:22.688989+01:00","seq":2,"type":"complete","delete":["foo"]}
```
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
Invoke with `bricks sync SRC DST`.
In bundle context `SRC` and `DST` arguments are taken from bundle configuration.
This PR adds `bricks bundle sync` to disambiguate between the two.
Once the VS Code extension is bundle aware they can again be consolidated.
Consolidating them today would regress the VS Code experience if a
`bundle.yml` file is present in the file tree.
By default the command runs an incremental, one-time sync, similar to the
behavior of rsync. The `--persist-snapshot` flag has been removed and the
command now always saves a synchronization snapshot.
* Add `--full` flag to force full synchronization
* Add `--watch` flag to run continuously and watch the local file system for changes
This builds on #176.
This change also adds testcases for checking if the specified path is
nested under the valid base paths and fixes an edge case where the user
could synchronize into their home directory directly.
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
The code depended on the project package for:
* git.FileSet in the watchdog
* project.CacheDir to determine snapshot path
These dependencies are now denormalized in the SyncOptions struct.
Follow up for #173.
With this change:
* Paths under `/Workspace/<me>` and `/Repos/<me>` are allowed
* The sync destination is checked to be either a directory or a repository
* If it is under `/Repos` and doesn't exist, the command returns an error
This PR:
1. Refactors the sync integration tests to make them more readable
2. Adds additional tests for edge cases we encountered during vscode
runs
3. Intensional side effect: sync integration tests are also green on
windows (see
https://github.com/databricks/eng-dev-ecosystem/actions/runs/3817365642/jobs/6493576727)
Change in coverage
- We now test for python notebook <-> python file interconversion and
python notebook deletion being synced to workspace
- Tests are split up and are more focused on testing specific edge cases
Tested by running the unit and integration tests locally
Tested manually on windows
Screenshot from windows sync logs indicating that the correct slashed
for paths were used:
<img width="623" alt="Screenshot 2022-12-21 at 9 09 13 PM"
src="https://user-images.githubusercontent.com/88374338/208943937-146670b2-1afd-4e0b-8f4e-6091c8c7e17a.png">
@pietern with this the state machine for syncing becomes slightly more
complicated, indicating a stronger need for a tree based approach herre
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
This PR:
- Implements safeguards for not accidentally/maliciously deleting repos
by sanitizing relative paths
- Adds versioning for snapshot schemas to allow invalidation if needed
- Adds logic to delete preexisting remote artifacts that might not have
been cleaned up properly if they conflict with an upload
- A bunch of tests for the changes here
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Unit tests are now run in all three big OS.
Some of the changes are to make the tests green for windows while we are
skipping some of the other tests on windows/macOS to make the tests
pass. This is a temporary measure and we will incrementally migrate
these tests over so there is parity in unit testing along all three
environments!
This PR introduces tracking of remote names and local names of files in snapshots to disambiguate between files which might have the same remote name and handle clean deleting of files whose remote name changes due (eg. python notebook getting converted to a python notebook)
This PR does multiple things, which are:
1. Creates .databricks dir according to outcomes concluded in "bricks
configuration principles"
2. Puts the sync snapshots into a file whose names is tagged with
md5(concat(host, remote-path))
3. Saves both host and username in the bricks snapshot for debuggability
Tested manually:
https://user-images.githubusercontent.com/88374338/195672267-9dd90230-570f-49b7-847f-05a5a6fd8986.mov
Tested manually
We are adding this flag because the default bricks sync is not robust
against changing the profile and other project config changes. This will
be used in the initial version of the vscode extention
This PR:
1. Replaces scim.Me call to use the go SDK instead of the terraform
client
2. Removes terraform client from bricks project
Tested manually that the scim.Me call works now and returns the correct
user
go build works
By:
* Add .gitkeep to retain test fixture directories under
./python/testdata
* Move GitHub related functionality to ./experimental (it is not in use)
* Comment out test in ./cmd/sync
* Fix test in ./git
Unexported fields are skipped during marshalling, so this would be a
nop.
The code in `./cmd/sync/github*.go` is currently unused.
Confirmed that staticcheck passes when run with:
```
staticcheck -checks SA9005 ./...
```
Functionality from `io/ioutil` has moved to the `io` and `os` packages
in go1.16 ([reference](https://pkg.go.dev/io/ioutil)).
Confirmed that staticcheck passes when run with:
```
staticcheck -checks SA1019 ./...
```