databricks-cli

Commit Graph

Author	SHA1	Message	Date
shreyas-goenka	bfa20cdec9	Add json tags to output fields (#269 ) output now: ``` { "run_page_url": "https://adb-309687753508875.15.azuredatabricks.net/?o=309687753508875#job/1077573342009637/run/19099317", "task_outputs": { "my_notebook_task": { "result": "computed results from notebook." } } }% ```	2023-03-21 18:38:11 +01:00
shreyas-goenka	75d516939b	Error out if notebook file does not exist locally (#261 ) Adds check for whether file exists locally case 1: local (relative) file does not exist ``` foo: name: "[job-output] test-job by shreyas" tasks: - task_key: my_notebook_task existing_cluster_id: * notebook_task: notebook_path: "./doesnotexist" ``` output: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy Error: notebook ./doesnotexist not found. Error: open /Users/shreyas.goenka/projects/job-output/doesnotexist: no such file or directory ``` case 2: remote (absolute) file does not exist ``` foo: name: "[job-output] test-job by shreyas" tasks: - task_key: my_notebook_task existing_cluster_id: * notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/doesnotexist" ``` output: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo Error: failed to reach TERMINATED or SKIPPED, got INTERNAL_ERROR: Task my_notebook_task failed with message: Notebook not found: /Users/shreyas.goenka@databricks.com/doesnotexist. This caused all downstream tasks to get skipped. ``` case 3: remote exists Successful deploy and run	2023-03-21 18:13:16 +01:00
Pieter Noordhuis	9100680162	Allow logger defaults to be configured through environment variables (#266 ) These environment variables configure defaults for the logger related flags: * `BRICKS_LOG_FILE` * `BRICKS_LOG_LEVEL` * `BRICKS_LOG_FORMAT`	2023-03-21 17:05:04 +01:00
Pieter Noordhuis	7dcc0d4b41	Fix test (#268 ) Follow up to #267.	2023-03-21 16:34:16 +01:00
shreyas-goenka	047a189c1e	Add job run output logging (#260 ) This PR adds output logging for job runs. Tested using unit tests and manually.	2023-03-21 16:25:18 +01:00
Pieter Noordhuis	e7a7e5b95a	Configure log level to info by default (#267 ) Note: we log at INFO level by default until we implement progress reporting to stdout/stderr.	2023-03-21 16:14:20 +01:00
shreyas-goenka	4ac2e33def	Throw error when job run is skipped due to max_concurrent_runs (#257 ) Tested manually. Before, we did not get any errors or logs and silently failed in this case. ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo Error: run skipped: Skipping this run because the limit of 1 maximum concurrent runs has been reached. ```	2023-03-21 13:17:15 +01:00
Pieter Noordhuis	66ca9ec266	Add permissions block to each resource (#264 ) Example: ```yaml resources: jobs: my_job: name: "[${bundle.environment}] My job" permissions: - level: CAN_VIEW group_name: users ```	2023-03-21 10:58:16 +01:00
Pieter Noordhuis	58563b1ea9	Add resources for mlflow models and experiments (#263 ) Manually confirmed that both can be deployed.	2023-03-20 21:28:43 +01:00
Pieter Noordhuis	077ab8b864	Update Terraform provider schema structs (#265 ) Generated from provider version 1.13.0.	2023-03-20 17:22:55 +01:00
shreyas-goenka	ae09eb02d5	Path escape file path in filer interface (#254 )	2023-03-17 17:42:35 +01:00
Pieter Noordhuis	ad666ff796	Use new logger throughout codebase (#256 )	2023-03-17 15:17:31 +01:00
Pieter Noordhuis	c9340d6317	Drain sync event channel before returning (#253 ) Not waiting means the last few events may or may not be printed. This is relevant in the mode where sync runs once and then terminates.	2023-03-16 17:48:17 +01:00
Pieter Noordhuis	32a29c6af4	Add structured logging infrastructure (#246 ) New global flags: * `--log-file FILE`: can be literal `stdout`, `stderr`, or a file name (default `stderr`) * `--log-level LEVEL`: can be `error`, `warn`, `info`, `debug`, `trace`, or `disabled` (default `disabled`) * `--log-format TYPE`: can be `text` or `json` (default `text`) New functions in the `log` package take a `context.Context` and retrieve the logger from said context. Because we carry the logger in a context, adding [attributes](https://pkg.go.dev/golang.org/x/exp/slog#hdr-Attrs_and_Values) to the logger can be done as follows: ```go ctx = log.NewContext(ctx, log.GetLogger(ctx).With("foo", "bar")) ```	2023-03-16 14:46:53 +01:00
shreyas-goenka	7faa9dea9b	Use tracker for reference loop tracking (#252 ) We incorrectly relied on map key iteration order to print the debug trace. This PR switches over to using the tracker struct to allow more reliable JSON schema reference loop detection and logging. This also fixes the failing TestSelfReferenceLoopErrors and TestCrossReferenceLoopErrors tests.	2023-03-16 12:57:57 +01:00
shreyas-goenka	207777849b	Log latest error event on pipeline run fail (#239 ) DAB config used to test this: bundle.yml ``` workspace: host: <deco-azure-prod> bundle: name: deco-538 resources: pipelines: foo: name: "[${bundle.name}] log pipeline errors" libraries: - notebook: path: ./myNb.py development: true ``` myNb.py ``` # Databricks notebook source print(1/0) ``` Before: ``` 2023/03/09 01:28:44 [INFO] [pipelines.foo] Update available at * 2023/03/09 01:28:44 [INFO] [pipelines.foo] Update status: CREATED 2023/03/09 01:28:46 [INFO] [pipelines.foo] Update status: INITIALIZING 2023/03/09 01:28:52 [INFO] [pipelines.foo] Update status: FAILED 2023/03/09 01:28:52 [INFO] [pipelines.foo] Update has failed! Error: update failed ``` Now: ``` 2023/03/09 01:29:31 [INFO] [pipelines.foo] Update available at * 2023/03/09 01:29:31 [INFO] [pipelines.foo] Update status: CREATED 2023/03/09 01:29:33 [INFO] [pipelines.foo] Update status: INITIALIZING 2023/03/09 01:29:40 [INFO] [pipelines.foo] Update status: FAILED 2023/03/09 01:29:40 [INFO] [pipelines.foo] Update has failed! 2023/03/09 01:29:40 [ERROR] [pipelines.foo] Update 27bc77 is FAILED. trace for most recent exception: Failed to execute python command for notebook '/Users/shreyas.goenka@databricks.com/.bundle/deco-538/default/files/myNb' with id RunnableCommandId(9070319781942164851) and error AnsiResult(--------------------------------------------------------------------------- ZeroDivisionError Traceback (most recent call last) <command--1> in <cell line: 1>() ----> 1 print(1/0) ZeroDivisionError: division by zero,Map(),Map(),List(),List(),Map()) Error: update failed ```	2023-03-16 12:23:46 +01:00
dependabot[bot]	dccb0aafce	Bump github.com/fatih/color from 1.14.1 to 1.15.0 (#245 )	2023-03-15 19:02:08 +01:00
dependabot[bot]	5b0e62d37f	Bump github.com/databricks/databricks-sdk-go from 0.4.1 to 0.5.0 (#247 )	2023-03-15 19:01:05 +01:00
shreyas-goenka	715a4dfb21	Path escape filepaths in the URL (#250 ) Before, we were using URL query escaping to escape the file path. This is wrong since the file path is part of the URL path rather than the URL query. These encoding schemes are similar but not identical, which is why we got these weird edge cases. Fixed, and added a nightly test asserting this. ``` 2023/03/15 16:07:50 [INFO] Action: PUT: .gitignore, a b/bar.py, c+d/uno.py, foo.py 2023/03/15 16:07:51 [INFO] Uploaded foo.py 2023/03/15 16:07:51 [INFO] Uploaded a b/bar.py 2023/03/15 16:07:51 [INFO] Uploaded .gitignore 2023/03/15 16:07:51 [INFO] Uploaded c+d/uno.py 2023/03/15 16:07:51 [INFO] Initial Sync Complete ``` ``` [VSCODE] bricks cli path: /Users/shreyas.goenka/.vscode/extensions/databricks.databricks-0.3.4-darwin-arm64/bin/bricks [VSCODE] sync command args: sync,.,/Repos/shreyas.goenka@databricks.com/sync-fail.ide,--watch,--output,json -------------------------------------------------------- Starting synchronization (4 files) Uploaded .gitignore Uploaded foo.py Uploaded c+d/uno.py Uploaded a b/bar.py Completed synchronization ```	2023-03-15 17:25:57 +01:00
shreyas-goenka	c40e428469	skip flaky cross reference test (#251 )	2023-03-15 17:09:52 +01:00
shreyas-goenka	92d1dd7e48	skip failing test for now (#249 )	2023-03-15 16:57:41 +01:00
shreyas-goenka	18a216bf97	Add openapi descriptions to bundle resources (#229 ) This PR: 1. Adds autogeneration of descriptions for the `resources` field 2. Autogenerates empty descriptions for any properties in DABs 3. Defines SOPs for how to refresh these descriptions 4. Adds a command to generate this documentation 5. Automatically copies any descriptions over to the `environments` property. Basically, it provides a framework for adding descriptions to the generated JSON schema. Tested manually and using unit tests.	2023-03-15 03:18:51 +01:00
dependabot[bot]	54f6f69cb8	Bump github.com/hashicorp/terraform-json from 0.15.0 to 0.16.0 (#241 )	2023-03-10 11:57:33 +01:00
shreyas-goenka	316a006125	Add check for file exists in case of conflicting remote names (#244 ) Before: ``` shreyas.goenka@THW32HFW6T deco-538-pipeline-error % bricks bundle deploy Error: both myNb.py and myNb.sql point to the same remote file location myNb. Please remove one of them from your local project ``` This happened even though myNb.sql was created by renaming myNb.py. Now deployments are successful.	2023-03-10 11:52:45 +01:00
shreyas-goenka	c4c8f944f3	Remove redundant terraform initialize mutator (#238 ) Tested manually that bricks bundle run runs a pipeline. phases.Initialize already has terraform.Initialize mutator	2023-03-09 15:05:02 +01:00
Pieter Noordhuis	c567b2abc0	Bump databricks-sdk-go to 0.4.1 (#240 )	2023-03-09 14:29:53 +01:00
Pieter Noordhuis	fe738ede6a	Let sync return early if an error occurs (#235 ) The previous approach would proceed to execute all requests prior to returning the first error. This is solved with `errgroup.WithContext` that cancels the context if a routine returns an error.	2023-03-09 13:29:05 +01:00
dependabot[bot]	7ade32c734	Bump golang.org/x/mod from 0.8.0 to 0.9.0 (#232 )	2023-03-09 10:29:51 +01:00
dependabot[bot]	16efd6960a	Bump github.com/hashicorp/terraform-exec from 0.17.3 to 0.18.1 (#233 )	2023-03-09 10:29:36 +01:00
dependabot[bot]	1530065e87	Bump golang.org/x/text from 0.7.0 to 0.8.0 (#231 )	2023-03-09 10:28:53 +01:00
dependabot[bot]	c5ffa60ec1	Bump github.com/stretchr/testify from 1.8.1 to 1.8.2 (#227 )	2023-03-09 10:27:51 +01:00
dependabot[bot]	0ea3d28c3e	Bump github.com/databricks/databricks-sdk-go from 0.3.2 to 0.3.3 (#225 )	2023-03-09 10:27:33 +01:00
Pieter Noordhuis	46cfa747ac	Move and hide launch and test commands (#222 ) Semantics in the context of a bundle are not yet clearly defined. Moving and hiding these commands until then.	2023-03-09 10:26:56 +01:00
Fabian Jakobs	f0c35a2b27	Initialize BRICKS_CLI_PATH and increase default OAuth timeout (#237 ) related to https://github.com/databricks/databricks-sdk-go/pull/330	2023-03-08 16:14:24 +01:00
Pieter Noordhuis	65b3f998ba	Escape URL in filer (#236 ) Also see #228.	2023-03-08 14:27:05 +01:00
Fabian Jakobs	da4b58a897	Fix link to workspace after AWS OAuth login (#234 ) `Host` is already normalized and always has the `https://` prefix.	2023-03-08 11:56:46 +01:00
Pieter Noordhuis	e872b587cc	Add optional JSON output for sync command (#230 ) JSON output makes it easy to process synchronization progress information in downstream tools (e.g. the vscode extension). This change introduces a `sync.Event` interface type for progress events as well as a `sync.EventNotifier` that lets the sync code pass along progress events to calling code. Example output in text mode (default, this uses the existing logger calls): ```text 2023/03/03 14:07:17 [INFO] Remote file sync location: /Repos/pieter.noordhuis@databricks.com/... 2023/03/03 14:07:18 [INFO] Initial Sync Complete 2023/03/03 14:07:22 [INFO] Action: PUT: foo 2023/03/03 14:07:23 [INFO] Uploaded foo 2023/03/03 14:07:23 [INFO] Complete 2023/03/03 14:07:25 [INFO] Action: DELETE: foo 2023/03/03 14:07:25 [INFO] Deleted foo 2023/03/03 14:07:25 [INFO] Complete ``` Example output in JSON mode: ```json {"timestamp":"2023-03-03T14:08:15.459439+01:00","seq":0,"type":"start"} {"timestamp":"2023-03-03T14:08:15.459461+01:00","seq":0,"type":"complete"} {"timestamp":"2023-03-03T14:08:18.459821+01:00","seq":1,"type":"start","put":["foo"]} {"timestamp":"2023-03-03T14:08:18.459867+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":0} {"timestamp":"2023-03-03T14:08:19.418696+01:00","seq":1,"type":"progress","action":"put","path":"foo","progress":1} {"timestamp":"2023-03-03T14:08:19.421397+01:00","seq":1,"type":"complete","put":["foo"]} {"timestamp":"2023-03-03T14:08:22.459238+01:00","seq":2,"type":"start","delete":["foo"]} {"timestamp":"2023-03-03T14:08:22.459268+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":0} {"timestamp":"2023-03-03T14:08:22.686413+01:00","seq":2,"type":"progress","action":"delete","path":"foo","progress":1} {"timestamp":"2023-03-03T14:08:22.688989+01:00","seq":2,"type":"complete","delete":["foo"]} ``` --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2023-03-08 10:27:19 +01:00
shreyas-goenka	5166055efb	[DECO-553] Escape file path strings in URL (#228 ) Tested manually Before: ``` shreyas.goenka@THW32HFW6T test-dbx % bricks sync --full . /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:17 [INFO] Remote file sync location: /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:17 [INFO] Action: PUT: #foo.py, .gitignore 2023/02/27 19:51:19 [INFO] Uploaded .gitignore Error: Creating file failed. An item with path /Repos/shreyas.goenka@databricks.com/test-dbx already exists ``` After: ``` shreyas.goenka@THW32HFW6T test-dbx % bricks sync --full . /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:46 [INFO] Remote file sync location: /Repos/shreyas.goenka@databricks.com/test-dbx 2023/02/27 19:51:46 [INFO] Action: PUT: #foo.py, .gitignore 2023/02/27 19:51:47 [INFO] Uploaded .gitignore 2023/02/27 19:51:47 [INFO] Uploaded #foo.py ```	2023-02-28 03:17:13 +01:00
shreyas-goenka	2615d66945	[DECO-531] Increase timeout for file import api calls (#223 ) This PR increases the client-side timeout for upload API calls to 10 minutes to give sync enough time to import larger files.	2023-02-22 16:01:58 +01:00
Pieter Noordhuis	9d3a0da073	Detect Jupyter notebook files (#219 ) Files with extension `.ipynb` are imported as Jupyter notebooks. This code detects 1) if the file is a valid Jupyter notebook and 2) the Databricks specific language it contains.	2023-02-21 13:49:01 +01:00
shreyas-goenka	f93b541b63	Show detailed error logs for jobs (#209 ) PR for how to render errors on console for jobs. Here is the bundle used for the logs below: ``` bundle: name: deco-438 workspace: host: https://adb-309687753508875.15.azuredatabricks.net resources: jobs: foo: name: "[${bundle.name}][${bundle.environment}] a test notebook" tasks: - task_key: alpha existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] invalid notebook" - task_key: beta existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/does-not-exist" - task_key: gamma existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] valid notebook" ``` And this is a screenshot of the logs from the console: <img width="1057" alt="Screenshot 2023-02-17 at 7 12 29 PM" src="https://user-images.githubusercontent.com/88374338/219744768-ab7f1e79-db8f-466a-ad6d-f2b6f85ed17c.png"> Here are the logs when only task gamma is executed (successfully): <img width="1059" alt="Screenshot 2023-02-17 at 7 13 04 PM" src="https://user-images.githubusercontent.com/88374338/219744992-011d8b91-ec1d-44f0-a849-83c81816dd9f.png"> TODO: Investigate more possible job errors, and make sure state for them is handled in a robust way here	2023-02-20 23:40:14 +01:00
Pieter Noordhuis	ae9d6883ee	Complete argument for the environment flag (#221 ) Command completion can be configured through `bricks completion`.	2023-02-20 21:56:31 +01:00
Pieter Noordhuis	dd95668474	Complete positional argument to bundle run (#220 ) Command completion can be configured through `bricks completion`.	2023-02-20 21:55:06 +01:00
Pieter Noordhuis	9912ee1f92	Materialize glob expansion in configuration struct (#217 ) This is needed to figure out which files should adhere to the schema.	2023-02-20 21:01:28 +01:00
Pieter Noordhuis	7398a6d1e4	Add sample ipynb files (#218 ) Co-authored-by: pietern <pietern>	2023-02-20 20:03:20 +01:00
Pieter Noordhuis	a0ed02281d	Execute file synchronization on deploy (#211 ) 1. Perform file synchronization on deploy 2. Update notebook file path translation logic to point to the synchronization target rather than treating the notebook as an artifact and uploading it separately.	2023-02-20 19:42:55 +01:00
Pieter Noordhuis	414ea4f891	Bump databricks-sdk-go to 0.3.2 (#215 )	2023-02-20 16:00:20 +01:00
Pieter Noordhuis	5b56b3e815	Include commit hash in snapshot version (#193 ) After this change the version for snapshot builds looks like `0.0.21-dev+65020f3`. This is valid semver per the regexp on https://semver.org/.	2023-02-20 15:46:57 +01:00
Pieter Noordhuis	ca04a6a1dd	Fix sync test (miss in #207 ) (#216 )	2023-02-20 15:41:37 +01:00
Pieter Noordhuis	584c8d1b0b	Allow synchronization to a directory inside a repo (#213 ) Before this commit this would error saying that the repo doesn't exist yet. With this commit it creates the directory, but only after checking that the repo exists.	2023-02-20 14:34:48 +01:00

287 Commits

All Branches