This PR:
1. Adds autogeneration of descriptions for `resources` field
2. Autogenerates empty descriptions for any properties in DABs
3. Defines SOPs for how to refresh these descriptions
4. Adds a command to generate this documentation
5. Automatically copies any descriptions over to the `environments`
property
In short, it provides a framework for adding descriptions to the
generated JSON schema.
Tested manually and with unit tests.
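As a rough illustration, such a framework could carry descriptions in a tree that mirrors the schema itself; the type below is a sketch with hypothetical names, not the actual type from this PR:
```go
package schema

// Docs mirrors the shape of the generated JSON schema and carries a
// description for every node. Hypothetical sketch; field names are
// illustrative only.
type Docs struct {
	Description string           `json:"description"`
	Properties  map[string]*Docs `json:"properties,omitempty"`
}
```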
Before:
```
shreyas.goenka@THW32HFW6T deco-538-pipeline-error % bricks bundle deploy
Error: both myNb.py and myNb.sql point to the same remote file location myNb. Please remove one of them from your local project
```
This error was raised even though `myNb.sql` had been created by renaming `myNb.py`.
Now such deployments succeed.
The previous approach would execute all requests before returning the
first error. This is solved with `errgroup.WithContext`, which cancels
the context as soon as any routine returns an error.
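For reference, a minimal sketch of the cancel-on-first-error pattern with `errgroup.WithContext` (the request functions here are stand-ins for the actual sync requests):
```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

func runAll(ctx context.Context, requests []func(context.Context) error) error {
	// The group's context is canceled as soon as any goroutine
	// returns a non-nil error.
	group, gctx := errgroup.WithContext(ctx)
	for _, request := range requests {
		request := request // capture loop variable (pre-Go 1.22)
		group.Go(func() error {
			// Each request observes gctx and can bail out early
			// once another request has failed.
			return request(gctx)
		})
	}
	// Wait blocks until all goroutines exit and returns the first error.
	return group.Wait()
}

func main() {
	err := runAll(context.Background(), []func(context.Context) error{
		func(ctx context.Context) error { return fmt.Errorf("boom") },
	})
	fmt.Println(err)
}
```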
JSON output makes it easy to process synchronization progress
information in downstream tools (e.g. the vscode extension).
This change introduces a `sync.Event` interface type for progress events as
well as an `sync.EventNotifier` that lets the sync code pass along
progress events to calling code.
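A sketch of the rough shape of these types, assuming the text renderer reuses `fmt.Stringer` (the definitions in the PR may differ):
```go
package sync

import "fmt"

// Event is implemented by every progress event (start, progress,
// complete). Sketch only; the actual interface may carry more methods.
type Event interface {
	fmt.Stringer // human-readable rendering for text mode
}

// EventNotifier is how the sync code hands progress events to calling
// code, such as the JSON logger or the VS Code extension bridge.
type EventNotifier interface {
	Notify(Event)
	Close()
}
```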
Example output in text mode (default, this uses the existing logger calls):
```text
2023/03/03 14:07:17 [INFO] Remote file sync location: /Repos/pieter.noordhuis@databricks.com/...
2023/03/03 14:07:18 [INFO] Initial Sync Complete
2023/03/03 14:07:22 [INFO] Action: PUT: foo
2023/03/03 14:07:23 [INFO] Uploaded foo
2023/03/03 14:07:23 [INFO] Complete
2023/03/03 14:07:25 [INFO] Action: DELETE: foo
2023/03/03 14:07:25 [INFO] Deleted foo
2023/03/03 14:07:25 [INFO] Complete
```
Example output in JSON mode:
```json
{"timestamp":"2023-03-03T14:08:15.459439+01:00","seq":0,"type":"start"}
{"timestamp":"2023-03-03T14:08:15.459461+01:00","seq":0,"type":"complete"}
{"timestamp":"2023-03-03T14:08:18.459821+01:00","seq":1,"type":"start","put":["foo"]}
{"timestamp":"2023-03-03T14:08:18.459867+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:19.418696+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:19.421397+01:00","seq":1,"type":"complete","put":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459238+01:00","seq":2,"type":"start","delete":["foo"]}
{"timestamp":"2023-03-03T14:08:22.459268+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":0}
{"timestamp":"2023-03-03T14:08:22.686413+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":1}
{"timestamp":"2023-03-03T14:08:22.688989+01:00","seq":2,"type":"complete","delete":["foo"]}
```
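Since each event is a single JSON object on its own line, a downstream consumer can decode them one line at a time; a minimal sketch covering only the shared fields:
```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// event models only the fields common to every line shown above.
type event struct {
	Timestamp string `json:"timestamp"`
	Seq       int    `json:"seq"`
	Type      string `json:"type"`
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		var e event
		if err := json.Unmarshal(scanner.Bytes(), &e); err != nil {
			continue // not an event line; skip it
		}
		fmt.Printf("seq=%d type=%s\n", e.Seq, e.Type)
	}
}
```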
---------
Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
Files with extension `.ipynb` are imported as Jupyter notebooks.
This code detects 1) whether the file is a valid Jupyter notebook and 2)
the Databricks-specific language it contains.
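A minimal sketch of this detection, assuming the language can be read from the notebook's `metadata.language_info.name` field (the actual code may consult Databricks-specific metadata instead):
```go
package notebook

import "encoding/json"

// header models only the parts of the .ipynb JSON needed here.
type header struct {
	NBFormat int `json:"nbformat"`
	Metadata struct {
		LanguageInfo struct {
			Name string `json:"name"`
		} `json:"language_info"`
	} `json:"metadata"`
}

// detectJupyter reports whether raw looks like a valid Jupyter notebook
// and, if so, which language it contains. Sketch only.
func detectJupyter(raw []byte) (ok bool, language string) {
	var h header
	if err := json.Unmarshal(raw, &h); err != nil || h.NBFormat == 0 {
		return false, ""
	}
	return true, h.Metadata.LanguageInfo.Name
}
```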
This PR renders job run errors on the console.
Here is the bundle used for the logs below:
```yaml
bundle:
  name: deco-438

workspace:
  host: https://adb-309687753508875.15.azuredatabricks.net

resources:
  jobs:
    foo:
      name: "[${bundle.name}][${bundle.environment}] a test notebook"
      tasks:
        - task_key: alpha
          existing_cluster_id: 1109-115254-ox7poobk
          notebook_task:
            notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] invalid notebook"
        - task_key: beta
          existing_cluster_id: 1109-115254-ox7poobk
          notebook_task:
            notebook_path: "/does-not-exist"
        - task_key: gamma
          existing_cluster_id: 1109-115254-ox7poobk
          notebook_task:
            notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] valid notebook"
```
And this is a screenshot of the logs from the console:
<img width="1057" alt="Screenshot 2023-02-17 at 7 12 29 PM"
src="https://user-images.githubusercontent.com/88374338/219744768-ab7f1e79-db8f-466a-ad6d-f2b6f85ed17c.png">
Here are the logs when only task gamma is executed (successfully):
<img width="1059" alt="Screenshot 2023-02-17 at 7 13 04 PM"
src="https://user-images.githubusercontent.com/88374338/219744992-011d8b91-ec1d-44f0-a849-83c81816dd9f.png">
TODO: Investigate more possible job errors, and make sure their state is
handled robustly here.
1. Perform file synchronization on deploy
2. Update notebook file path translation logic to point to the
synchronization target rather than treating the notebook as an artifact
and uploading it separately.
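A sketch of the second point with hypothetical names: rather than uploading the notebook separately, its local path is rewritten to a location under the remote sync root (notebooks are stored in the workspace without their file extension):
```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strings"
)

// translateNotebookPath maps a local notebook path to its location under
// the remote sync root. Hypothetical helper for illustration.
func translateNotebookPath(syncRoot, localPath string) string {
	// Drop the extension: the workspace stores notebooks without it.
	base := strings.TrimSuffix(localPath, filepath.Ext(localPath))
	return path.Join(syncRoot, base)
}

func main() {
	fmt.Println(translateNotebookPath("/Repos/user@example.com/project", "myNb.py"))
	// Output: /Repos/user@example.com/project/myNb
}
```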
Before this commit, this would fail with an error saying that the repo
doesn't exist yet. With this commit, it creates the directory, but only
after checking that the repo exists.
Invoke with `bricks sync SRC DST`.
In the bundle context, the `SRC` and `DST` arguments are taken from the
bundle configuration.
This PR adds `bricks bundle sync` to disambiguate between the two modes.
Once the VS Code extension is bundle aware they can again be consolidated.
Consolidating them today would regress the VS Code experience if a
`bundle.yml` file is present in the file tree.
Example when called from vscode (and everything is hooked up):
```
> * User-Agent: bricks/0.0.21-devel databricks-sdk-go/0.2.0 go/1.19.4 os/darwin upstream/databricks-vscode
```
This configures the user agent with the bricks version and the name of
the command being executed.
Example user agent value:
```
> * User-Agent: bricks/0.0.21-devel databricks-sdk-go/0.2.0 go/1.19.4 os/darwin cmd/sync auth/pat
```
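A minimal sketch of how such a value could be assembled (illustrative only; the actual code hooks into the SDK's user agent handling):
```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// userAgent assembles a value in the format shown above. The sdkVersion
// argument and the overall assembly are illustrative, not the real code.
func userAgent(version, sdkVersion, cmd, auth string) string {
	return strings.Join([]string{
		"bricks/" + version,
		"databricks-sdk-go/" + sdkVersion,
		"go/" + strings.TrimPrefix(runtime.Version(), "go"),
		"os/" + runtime.GOOS,
		"cmd/" + cmd,
		"auth/" + auth,
	}, " ")
}

func main() {
	fmt.Println(userAgent("0.0.21-devel", "0.2.0", "sync", "pat"))
}
```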
This is a follow-up to #194.
It embeds the relevant fields listed on
https://goreleaser.com/customization/templates/ into the build artifacts.
The version command outputs the version by default:
```
$ bricks version
0.0.21-devel
```
Or all build information if `--json` is specified:
```
$ bricks version --json
{
  "ProjectName": "bricks",
  "Version": "0.0.21-devel",
  "Branch": "version-info",
  "Tag": "v0.0.20",
  "ShortCommit": "193b56b",
  "FullCommit": "193b56b0929128c0836d35e913c46fd66fa2a93c",
  "CommitTime": "2023-02-02T22:04:42+01:00",
  "Summary": "v0.0.20-5-g193b56b",
  "Major": 0,
  "Minor": 0,
  "Patch": 20,
  "Prerelease": "",
  "IsSnapshot": true,
  "BuildTime": "2023-02-02T22:07:36+01:00"
}
```
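Fields like these are typically injected at link time; a sketch assuming goreleaser passes them via `-ldflags "-X ..."` (variable and package names here are hypothetical):
```go
package build

// Defaults for development builds. At release time, goreleaser overrides
// them with linker flags such as:
//
//	-ldflags "-X main.buildVersion={{.Version}}"
//
// (the exact package path depends on where these variables live). Note
// that -X can only set strings, hence buildIsSnapshot's type.
var (
	buildProjectName = "bricks"
	buildVersion     = "0.0.0-devel"
	buildBranch      = ""
	buildTag         = ""
	buildShortCommit = ""
	buildIsSnapshot  = "true"
)
```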
This commit changes the code in repository.go to lazily load gitignore
files as opposed to the previous eager approach. This means that the
signature of the `Ignore` function family has changed to return `(bool,
error)`.
This lazy approach fits better when other code is responsible for
recursively walking the file tree, because we never know up front which
gitignore files need to be loaded to compute the ignores. It also means
we no longer have to "prime" the `Repository` instance with a particular
directory we're interested in and rather let calls to `Ignore` load
whatever is needed.
The fileset wrapper under `git/` internally taints all gitignore objects
to force a call to [os.Stat] followed by a reload if they have changed,
before calling into the [fileset.FileSet] functions for recursively
listing files.
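A sketch of the taint-and-reload check described above, with hypothetical names:
```go
package git

import (
	"os"
	"time"
)

// ignoreFile caches a parsed gitignore file together with the
// modification time observed when it was loaded. Names are illustrative.
type ignoreFile struct {
	absPath string
	modTime time.Time
	// parsed matcher omitted for brevity
}

// reloadIfChanged stats the file and re-parses it when the modification
// time differs from the cached one; tainting forces this check on the
// next Ignore call.
func (f *ignoreFile) reloadIfChanged() error {
	fi, err := os.Stat(f.absPath)
	if os.IsNotExist(err) {
		return nil // gitignore was removed; treat as empty
	}
	if err != nil {
		return err
	}
	if fi.ModTime().Equal(f.modTime) {
		return nil // unchanged; keep the cached patterns
	}
	f.modTime = fi.ModTime()
	// ...re-parse the gitignore patterns here...
	return nil
}
```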